Closed olgabot closed 2 years ago
Hmm. Currently I'm using levels 1 and 2 in the relations table. The relations table where level = 1 is what defines the DAG (direction encoded by which is "parent" and which is "child"). The entries where level = 2 are essentially a shortcut to avoid traversing the DAG when doing the domain-specific task of getting exons of a gene. But those 1s and 2s are nominal rather than ordinal values -- things would work the same if I had called them "a" and "b". So you might be able to get away with adding a level 3 to indicate splicing events if it was useful to sort of step outside the hierarchy.
Or another angle: your graph shows CDSs as children of exons but in most GFF/GTF files I've seen, CDSs are siblings of exons and both are children of transcripts. So there is already support for e.g. multiple level-1 children. I think the last 2 rows of this table are all you have to add:
parent | child | level |
---|---|---|
gene | transcript | 1 |
transcript | exon | 1 |
transcript | cds | 1 |
transcript | splice | 1 |
splice | exon | 1 |
I think this should work at least in principle. Can you provide a small example that gives you the error you're seeing?
Closing, but please reopen if this continues to be a need and gffutils doesn't support it.
okay lots of questions from me ...
I'd like to add another layer of a
splicing_event
feature to an existing gffutils database, that is between thetranscript
andexon
level. The reason for this is because I want to annotate alternative splicing events with their protein sequences and such that is hard to obtain when you have the events on their own, but is much easier when the events are integrated with the transcripts, exons, and CDSs.Here is a drawing of what I mean.
The
splicing_event
feature type is a first-level child of thetranscript
and a first-level parent of theexon
. So then would exons become second-level children of the transcripts? But what would happen to exons that aren't in a splicing event? Would there need to be addition of a "phantom" splicing event to make sure there's still something that exists on that level?I tried doing this using an existing database and got this error:
sqlite3.IntegrityError: UNIQUE constraint failed: relations.parent, relations.child, relations.level
, which I take it is because I'm disrupting the level 1 relationship of the exons to the transcripts by inserting this extra event.What would be the "right" way to do this?