Closed cmdcolin closed 6 years ago
Another consideration is that gene-level annotations can contain both processed-transcript children and non-coding children. The gene glyph could be extended to handle this. Currently it just displays a box for the non-coding children, but it could be coded to display the "segments" glyph instead (and maybe color it differently).
Some basic code for this is here: https://github.com/GMOD/jbrowse/tree/unprocessed_transcript_glyph
One weird corner case is that the style.color of the higher-level feature has trouble overriding the goldenrod color of the lower-level box glyphs.
Other things you might see in an NCBI GFF that make JBrowse glyphs render badly:
"gene" feature with 0 subfeatures
gunzip -c GRCh38_latest_genomic.gff.gz|grep PFN1P10
NC_000001.11 Curated Genomic gene 21459424 21460202 . - . ID=gene632;Dbxref=GeneID:767853,HGNC:HGNC:42985;Name=PFN1P10;description=profilin 1 pseudogene 10;gbkey=Gene;gene=PFN1P10;gene_biotype=pseudogene;pseudo=true
"gene" feature with direct exon children
gunzip -c GRCh38_latest_genomic.gff.gz|grep CROCCP5
NC_000001.11 Curated Genomic gene 21434320 21436826 . + . ID=gene630;Dbxref=GeneID:100421114,HGNC:HGNC:43865;Name=CROCCP5;description=ciliary rootlet coiled-coil%2C rootletin pseudogene 5;gbkey=Gene;gene=CROCCP5;gene_biotype=pseudogene;pseudo=true
NC_000001.11 Curated Genomic exon 21434320 21434540 . + . ID=id26949;Parent=gene630;Dbxref=GeneID:100421114,HGNC:HGNC:43865;gbkey=exon;gene=CROCCP5
NC_000001.11 Curated Genomic exon 21435298 21435394 . + . ID=id26950;Parent=gene630;Dbxref=GeneID:100421114,HGNC:HGNC:43865;gbkey=exon;gene=CROCCP5
NC_000001.11 Curated Genomic exon 21436067 21436150 . + . ID=id26951;Parent=gene630;Dbxref=GeneID:100421114,HGNC:HGNC:43865;gbkey=exon;gene=CROCCP5
NC_000001.11 Curated Genomic exon 21436576 21436826 . + . ID=id26952;Parent=gene630;Dbxref=GeneID:100421114,HGNC:HGNC:43865;gbkey=exon;gene=CROCCP5
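The two cases above can be spotted ahead of time by scanning the GFF3 for each "gene" feature's direct children. The following is only a rough sketch, not JBrowse code: `classifyGenes` and `parseAttributes` are hypothetical helper names, the sample records are shortened versions of the lines above, and it assumes tab-delimited columns (needed because attribute values such as the description contain spaces).

```javascript
// Parse a GFF3 column-9 string ("ID=x;Parent=y;...") into a plain object.
function parseAttributes(column9) {
  const attrs = {};
  for (const pair of column9.split(';')) {
    const eq = pair.indexOf('=');
    if (eq > 0) attrs[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return attrs;
}

// Classify every "gene" feature by the types of its direct children.
function classifyGenes(gff3Text) {
  const genes = new Map(); // gene ID -> Set of direct child types
  const links = [];        // [parentId, childType] pairs, resolved after the scan
  for (const line of gff3Text.split('\n')) {
    if (!line || line.startsWith('#')) continue;
    const cols = line.split('\t');
    if (cols.length < 9) continue;
    const type = cols[2];
    const attrs = parseAttributes(cols[8]);
    if (type === 'gene' && attrs.ID) genes.set(attrs.ID, new Set());
    if (attrs.Parent) {
      for (const parent of attrs.Parent.split(',')) links.push([parent, type]);
    }
  }
  for (const [parent, type] of links) {
    if (genes.has(parent)) genes.get(parent).add(type);
  }
  const result = {};
  for (const [id, childTypes] of genes) {
    if (childTypes.size === 0) result[id] = 'no-subfeatures';
    else if (childTypes.has('exon')) result[id] = 'direct-exons';
    else result[id] = 'other-children';
  }
  return result;
}

// Shortened versions of the PFN1P10 and CROCCP5 records shown above.
const sample = [
  'NC_000001.11\tCurated Genomic\tgene\t21459424\t21460202\t.\t-\t.\tID=gene632;Name=PFN1P10;pseudo=true',
  'NC_000001.11\tCurated Genomic\tgene\t21434320\t21436826\t.\t+\t.\tID=gene630;Name=CROCCP5;pseudo=true',
  'NC_000001.11\tCurated Genomic\texon\t21434320\t21434540\t.\t+\t.\tID=id26949;Parent=gene630;gene=CROCCP5',
  'NC_000001.11\tCurated Genomic\texon\t21435298\t21435394\t.\t+\t.\tID=id26950;Parent=gene630;gene=CROCCP5',
].join('\n');

console.log(classifyGenes(sample));
```

Run against the full file, a report like this would show how many genes fall into each problem shape before deciding how the glyph should treat them.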
The issue of genes without subfeatures was actually fixed by a change that @rbuels made (in the release notes @rbuels mentioned "Fixed a bug in which feature labels would sometimes be repeated across the view, in the wrong locations").
The idea of the non-coding transcripts is now implemented much better by this. How to handle GFF with "pseudogene->pseudotranscript->pseudoexon", or something similarly faithful to the Sequence Ontology, is still a little unclear, but if needed we can make a new issue.
The issue discussed in #1075 highlighted that representing pseudogenes is a little tricky.
Especially for a track that has both genes and pseudogenes, it would be good for CanvasFeatures to dispatch to the Gene glyph for gene features and to a Pseudogene glyph for pseudogene features. Pseudogenes don't share the same structure (they have no CDS, for example), so the Gene glyph does not capture them well.
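A rough sketch of what that dispatch could look like, under several assumptions: `chooseGlyph` is a hypothetical function name, no Pseudogene glyph exists yet so the Segments glyph is used as a stand-in, and the `pseudo` attribute is assumed to be readable via `feature.get('pseudo')`. Whether the track's `glyph` setting can take a callback like this should be checked against the CanvasFeatures configuration docs.

```javascript
// Stand-in for the proposed Pseudogene glyph, which does not exist yet.
const PSEUDOGENE_GLYPH = 'JBrowse/View/FeatureGlyph/Segments';
const GENE_GLYPH = 'JBrowse/View/FeatureGlyph/Gene';
const DEFAULT_GLYPH = 'JBrowse/View/FeatureGlyph/Box';

// Pick a glyph class name from the feature's type.
// NCBI GFF marks pseudogenes as type "gene" with pseudo=true, so that
// attribute is checked too (assumed to arrive as 'true' or ['true']).
function chooseGlyph(feature) {
  const type = feature.get('type');
  const isPseudo = String(feature.get('pseudo') || '') === 'true';
  if (type === 'pseudogene' || (type === 'gene' && isPseudo)) {
    return PSEUDOGENE_GLYPH;
  }
  if (type === 'gene') {
    return GENE_GLYPH;
  }
  return DEFAULT_GLYPH;
}

// Minimal mock of a feature object exposing get(), for trying the function out.
function mockFeature(data) {
  return { get: (name) => data[name] };
}

console.log(chooseGlyph(mockFeature({ type: 'gene' })));
console.log(chooseGlyph(mockFeature({ type: 'gene', pseudo: 'true' })));
```

Keeping the decision in one small function means the NCBI-style "gene with pseudo=true" records and properly typed "pseudogene" records end up on the same rendering path.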