Open RNAer opened 10 years ago
Which specific classes/functions are you thinking of from cogent?
I just checked cogent/struct/rna2d.py; it's not exactly what I have in mind.
I am thinking of an RNAStructure class that has an RNASequence and a structure string as member variables, with methods for dissecting structures, calculating deltaG, and more.
@squirrelo, what do you think?
What about adding the structure info to RNASequence (making it optional of course)?
Or making a StructuredRNASequence object?
I like the idea of a StructuredRNASequence object, although I'm not sure that dG calculations, or anything else done by an outside tool such as Vienna, should be part of the actual object rather than a separate wrapper. I've already come across the need to compute tree edit distance and similar measures between RNA structures, so if we do have this object and can implement things like digesting the structure into tree form, or other pure-Python manipulations, that could be cool.
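To make "digesting the structure into tree form" concrete, here is a minimal pure-Python sketch. It assumes the structure is a pseudoknot-free dot-bracket string; the function name `dot_bracket_to_tree` and the nested-list tree layout are hypothetical choices for illustration, not anything taken from cogent or scikit-bio.

```python
def dot_bracket_to_tree(structure):
    """Convert a pseudoknot-free dot-bracket string into a nested tree.

    Hypothetical layout: the tree is a list of children. An unpaired
    position is stored as its integer index; a base pair is stored as a
    tuple (open_index, close_index, children).
    """
    root = []
    child_lists = [root]   # child list of each currently open pair
    open_pairs = []        # (open_index, children) for unclosed "("

    for i, symbol in enumerate(structure):
        if symbol == "(":
            children = []
            open_pairs.append((i, children))
            child_lists.append(children)
        elif symbol == ")":
            if not open_pairs:
                raise ValueError(f"unmatched ')' at position {i}")
            start, children = open_pairs.pop()
            child_lists.pop()
            # Attach the finished pair to its enclosing pair (or the root).
            child_lists[-1].append((start, i, children))
        elif symbol == ".":
            child_lists[-1].append(i)
        else:
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")

    if open_pairs:
        raise ValueError(f"unmatched '(' at position {open_pairs[-1][0]}")
    return root
```

A tree in this shape could then be handed to a separate tree edit distance routine, which keeps the comparison logic out of the structure object itself.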
There are cases where you want the structure independent of a sequence, e.g. for counting distinct structures or for designing a sequence to fit a specified structure. Also remember that one sequence can have many structures. In fact, one of the key flaws of RNAML was nesting structure within sequence: you could not have a table of data for x sequences by y shared structures without repeating the structure data for each sequence.
Rob
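As a rough illustration of the point about keeping structures independent of sequences, the sketch below stores each structure once and links it to sequences through a separate association list. All IDs, sequences, and names here (`sequences`, `structures`, `associations`, `check_lengths`) are made up for the example and are not part of any existing API.

```python
from collections import Counter

# Hypothetical example data: each sequence and each structure is stored once.
sequences = {
    "seq1": "GGGAAACCC",
    "seq2": "GCGAAAGCG",
    "seq3": "AAAAAAAAA",
}
structures = {
    "hairpin": "(((...)))",
    "open": ".........",
}

# Many-to-many links: one sequence can have several structures, and one
# structure can be shared by several sequences, without copying either.
associations = [
    ("seq1", "hairpin"),
    ("seq1", "open"),
    ("seq2", "hairpin"),
    ("seq3", "open"),
]


def check_lengths(sequences, structures, associations):
    """Raise ValueError if a linked sequence and structure differ in length."""
    for seq_id, struct_id in associations:
        if len(sequences[seq_id]) != len(structures[struct_id]):
            raise ValueError(f"length mismatch: {seq_id} vs {struct_id}")


check_lengths(sequences, structures, associations)

# Counting how many sequences use each distinct structure needs only the
# links and the structure IDs, never the sequence data.
structure_counts = Counter(struct_id for _, struct_id in associations)
```

With this layout, a table of x sequences by y shared structures is just the association list, so no structure string is ever repeated per sequence.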
Ok, thanks Rob, that's a really good point. So that argues for a separate class that has an optional list/array of RNASequence(s). The assumption you'd make about an instance of this object is that the one or more structures are associated with the one or more sequences described by the instance. Is that right?
Yes, see the BayesFold code (which the pycogent code is based on) for examples.
Thanks @rob-knight @squirrelo for these details! Agree that a separate class makes sense.
So that argues for a separate class that has an optional list/array of RNASequence(s)
Could this be a SequenceCollection of RNASequence(s)? AFAIK SequenceCollections are allowed to be empty.
Yes, I think we'd want that to be a SequenceCollection.
Good point. It is a many-to-many relationship: one sequence can have multiple structures, and a structure can be folded from multiple sequences.
So how about a base structure class (corresponding to the BiologicalSequence class) that contains only one structure, and we start from there?
Another question is which format we should use to represent secondary structure: dot-bracket, ct, etc.? I really want pseudoknot support.
These decisions have already been explored exhaustively in the cogent versions. If you want arbitrary pseudoknot support, you need to do it with a connect list. I have cc'ed Sandra, who may be willing to talk you through some of the design decisions (and who used this code to work on pseudoknots previously).
Rob
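For anyone unfamiliar with why a list of paired positions handles pseudoknots where plain dot-bracket cannot, here is a small sketch. It assumes "connect list" means a list of (i, j) paired positions, similar in spirit to the partner column of a ct file; the function names `partner_array` and `has_pseudoknot` are hypothetical and are not the cogent API.

```python
def partner_array(pairs, length):
    """Return a list where entry i is the partner of position i, or None.

    Hypothetical helper: `pairs` is an iterable of (i, j) zero-based
    paired positions. Crossing pairs are accepted, which is what makes
    this representation suitable for pseudoknots.
    """
    partners = [None] * length
    for i, j in pairs:
        if partners[i] is not None or partners[j] is not None:
            raise ValueError(f"position in pair ({i}, {j}) is already paired")
        partners[i] = j
        partners[j] = i
    return partners


def has_pseudoknot(pairs):
    """Return True if any two pairs cross, i.e. i < k < j < l."""
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            # Pairs are sorted by opening position, so k > i already holds.
            if k < j < l:
                return True
    return False
```

Nested pairs such as (0, 6) and (1, 5) can be written with a single bracket type, while crossing pairs such as (0, 6) and (3, 9) cannot, yet both fit in the pair list unchanged.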
I will need these soon. I can check the cogent code and add structure classes/functions to skbio. Do we want it as a module under skbio or somewhere else?