GMOD / jbrowse

JBrowse 1, a full-featured genome browser built with JavaScript and HTML5. For JBrowse 2, see https://github.com/GMOD/jbrowse-components.
http://jbrowse.org

"start" attribute in last column of GFF3 file results in bad coordinates for the feature in GFF3Tabix mode #1364

Closed loraine-gueguen closed 5 years ago

loraine-gueguen commented 5 years ago

We have such a GFF3 file:

myspecies-F_contig1   Gmove   mRNA    50689   50902   .       +       .       ID=mRNA.myspecies-F_contig1.8.1;Name=mRNA.myspecies-F_contig1.8.1;start=0;stop=0;cds_size=213;model_size=214;exons=1
myspecies-F_contig1   Gmove   CDS     50689   50901   .       +       .       Parent=mRNA.myspecies-F_contig1.8.1

Case 1: In GFF3Tabix mode, this GFF3 results in wrong coordinates for the mRNA:

position_wrong

Case 2: Removing the "start" and "stop" attributes from the last column results in the right coordinates:

position_right

Case 3: In NCList mode, the original GFF3 gives the right coordinates, and the attribute is displayed as "start2" in the popup window:

nclist_position_right

I guess that, in case 1, the "start" attribute in the last column of the GFF3 is used as the start coordinate of the feature, whereas it should not be.

abretaud commented 5 years ago

Ah, just had the same problem today; I ended up removing the start/stop attributes from the GFF. I guess it's the tabix indexing that misinterprets the start attribute.
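The workaround mentioned here (removing the colliding attributes before indexing) could be sketched roughly as follows. This is only an illustration: `stripAttributes` is a hypothetical helper, not part of JBrowse or tabix, and it assumes plain tab-separated GFF3 lines with `key=value` pairs separated by `;` in column 9.

```javascript
// Hypothetical helper (not part of JBrowse): drop selected keys from the
// attributes column (column 9) of a single GFF3 line.
function stripAttributes(line, keys = ['start', 'stop']) {
  // Leave comments, directives, and blank lines untouched.
  if (line.startsWith('#') || line.trim() === '') return line;

  const cols = line.split('\t');
  if (cols.length < 9) return line; // not a 9-column feature line

  const kept = cols[8]
    .split(';')
    .filter(pair => pair !== '' && !keys.includes(pair.split('=')[0].trim()));

  // GFF3 uses "." for an empty column.
  cols[8] = kept.length > 0 ? kept.join(';') : '.';
  return cols.join('\t');
}

// The mRNA line from the report above, rebuilt with real tab separators.
const example = [
  'myspecies-F_contig1', 'Gmove', 'mRNA', '50689', '50902', '.', '+', '.',
  'ID=mRNA.myspecies-F_contig1.8.1;Name=mRNA.myspecies-F_contig1.8.1;' +
    'start=0;stop=0;cds_size=213;model_size=214;exons=1',
].join('\t');

// Columns 1-8 are unchanged; only "start" and "stop" leave column 9.
console.log(stripAttributes(example));
```

The cleaned file would then need to be compressed and indexed again before loading it in GFF3Tabix mode.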

loraine-gueguen commented 5 years ago

Yes I think so

cmdcolin commented 5 years ago

Technically this is due to the JBrowse feature model. JBrowse has the "SimpleFeature" type, which just calls feature.get('start') to read the start coordinate (the fourth GFF3 column), and that value ends up being overwritten by the attribute named start.

We would have to emulate the behavior of NCList, e.g. renaming the attribute to 'start2', to make it work.
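To make the collision concrete, here is a toy model of it. This is not JBrowse's actual code: `buildFeatureData`, `parseAttributes`, `RESERVED`, and the `renameCollisions` option are all hypothetical names, and coordinates are kept exactly as written in the file for simplicity. It only illustrates how merging column 9 attributes into the same key space as the fixed columns loses the start coordinate, and how a 'start2'-style rename avoids that.

```javascript
// Keys taken by the eight fixed GFF3 columns in this toy model.
const RESERVED = new Set([
  'seq_id', 'source', 'type', 'start', 'end', 'score', 'strand', 'phase',
]);

// Parse "key=value;key=value" from column 9 into a plain object.
function parseAttributes(col9) {
  const attrs = {};
  for (const pair of col9.split(';')) {
    const idx = pair.indexOf('=');
    if (idx === -1) continue; // skip empty or malformed entries
    attrs[pair.slice(0, idx).trim()] = pair.slice(idx + 1);
  }
  return attrs;
}

// Build one flat record per feature line, as a stand-in for a feature
// object whose values are all read through a single get(name) lookup.
function buildFeatureData(line, { renameCollisions }) {
  const cols = line.split('\t');
  const data = {
    seq_id: cols[0], source: cols[1], type: cols[2],
    start: Number(cols[3]), end: Number(cols[4]),
    score: cols[5], strand: cols[6], phase: cols[7],
  };
  for (const [key, value] of Object.entries(parseAttributes(cols[8]))) {
    // With renaming on, an attribute that clashes with a fixed column
    // gets a "2" suffix instead of overwriting the column value.
    const target = renameCollisions && RESERVED.has(key) ? `${key}2` : key;
    data[target] = value;
  }
  return data;
}

const line = [
  'myspecies-F_contig1', 'Gmove', 'mRNA', '50689', '50902', '.', '+', '.',
  'ID=mRNA.myspecies-F_contig1.8.1;start=0;stop=0;exons=1',
].join('\t');

const naive = buildFeatureData(line, { renameCollisions: false });
const fixed = buildFeatureData(line, { renameCollisions: true });

// naive.start is now the attribute string "0"; the coordinate 50689 is lost.
console.log(naive.start);
// fixed.start is still 50689 and the attribute is kept as fixed.start2.
console.log(fixed.start, fixed.start2);
```

In this sketch "stop" does not collide with anything, because the end coordinate is stored under "end"; only "start" needs the rename.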

cmdcolin commented 5 years ago

Added my proposed fix to the dev branch. Hope that is acceptable!