GMOD / Apollo3

JBrowse 2 plugin for editing annotations on an Apollo server
Apache License 2.0

Standardize gene representation #381

Open garrettjstevens opened 2 months ago

garrettjstevens commented 2 months ago

Up until now we've basically preserved exactly what is in the imported GFF3, with only minor formatting changes to store it internally. However, this has led to places in our code that handle things differently depending on how the GFF3 is formatted. A big example is the CanonicalGeneGlyph and the ImplicitExonGeneGlyph. We've also tried uploading GFF3s where neither of these glyphs works. I noticed the same behavior in the Transcript Details Widget: certain things only worked if the GFF3 was formatted in a certain way.

I think the way we need to handle this going forward is to standardize the GFF3 data on import, specifically for genes, so Apollo can always expect a single format. This means a potential loss of data. For example, if a five_prime_UTR in a GFF3 has an ID, but we decide to drop UTRs from the data when standardizing it (since the location of UTRs can be calculated based on the locations of other features), we'd lose the UTR's ID. I think this is unavoidable, though, and can also be somewhat mitigated by having a robust GFF3 export system.
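To make the "UTRs can be calculated" point concrete, here is a minimal sketch of deriving UTR locations from a transcript's exon and CDS locations. The function name `derive_utrs` and the tuple-based interval representation are assumptions for illustration only, not Apollo code. Coordinates are 1-based and inclusive, as in GFF3.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive, as in GFF3


def derive_utrs(
    exons: List[Interval], cds: List[Interval], strand: str
) -> Tuple[List[Interval], List[Interval]]:
    """Return (five_prime_utrs, three_prime_utrs) for one transcript.

    Hypothetical helper: a UTR is any exonic region outside the span
    covered by the transcript's CDS segments.
    """
    if not cds:
        return [], []  # non-coding transcript, nothing to derive
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    left, right = [], []  # exonic regions before / after the CDS span
    for start, end in sorted(exons):
        if start < cds_start:
            left.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            right.append((max(start, cds_end + 1), end))
    # On the reverse strand the lower-coordinate side is the 3' end.
    if strand == "-":
        return right, left
    return left, right
```

Run on the exon and CDS coordinates of APOE-201 (ENST00000252486) from the Ensembl example in this issue, it gives back the same `five_prime_UTR` and `three_prime_UTR` locations that file lists explicitly. What it cannot give back is any attribute, such as an ID, that was attached to a dropped UTR feature.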

Here are some GFF3s I found that illustrate the different ways genes are formatted:

Sequence Ontology GFF3 Spec

See GFF3

```
##gff-version 3.1.26
##sequence-region ctg123 1 1497228
ctg123	.	gene	1000	9000	.	+	.	ID=gene00001;Name=EDEN
ctg123	.	TF_binding_site	1000	1012	.	+	.	ID=tfbs00001;Parent=gene00001
ctg123	.	mRNA	1050	9000	.	+	.	ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123	.	mRNA	1050	9000	.	+	.	ID=mRNA00002;Parent=gene00001;Name=EDEN.2
ctg123	.	mRNA	1300	9000	.	+	.	ID=mRNA00003;Parent=gene00001;Name=EDEN.3
ctg123	.	exon	1300	1500	.	+	.	ID=exon00001;Parent=mRNA00003
ctg123	.	exon	1050	1500	.	+	.	ID=exon00002;Parent=mRNA00001,mRNA00002
ctg123	.	exon	3000	3902	.	+	.	ID=exon00003;Parent=mRNA00001,mRNA00003
ctg123	.	exon	5000	5500	.	+	.	ID=exon00004;Parent=mRNA00001,mRNA00002,mRNA00003
ctg123	.	exon	7000	9000	.	+	.	ID=exon00005;Parent=mRNA00001,mRNA00002,mRNA00003
ctg123	.	CDS	1201	1500	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	3000	3902	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	5000	5500	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	7000	7600	.	+	0	ID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123	.	CDS	1201	1500	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	5000	5500	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	7000	7600	.	+	0	ID=cds00002;Parent=mRNA00002;Name=edenprotein.2
ctg123	.	CDS	3301	3902	.	+	0	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	5000	5500	.	+	1	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	7000	7600	.	+	1	ID=cds00003;Parent=mRNA00003;Name=edenprotein.3
ctg123	.	CDS	3391	3902	.	+	0	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
ctg123	.	CDS	5000	5500	.	+	1	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
ctg123	.	CDS	7000	7600	.	+	1	ID=cds00004;Parent=mRNA00003;Name=edenprotein.4
```
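Two conventions in the spec example matter for any standardization step: one exon row can list several transcripts in `Parent`, and several CDS rows can share one `ID` to describe a single discontinuous feature. The sketch below shows one way an importer might normalize both. The function names and the nested-dict output shape are hypothetical, and attribute percent-decoding is deliberately left out.

```python
from collections import defaultdict

# Three rows taken from the spec example above (tab-separated columns).
GFF3 = """\
ctg123\t.\texon\t1050\t1500\t.\t+\t.\tID=exon00002;Parent=mRNA00001,mRNA00002
ctg123\t.\tCDS\t1201\t1500\t.\t+\t0\tID=cds00001;Parent=mRNA00001;Name=edenprotein.1
ctg123\t.\tCDS\t3000\t3902\t.\t+\t0\tID=cds00001;Parent=mRNA00001;Name=edenprotein.1
"""


def parse_attributes(column):
    """Split column 9 into {key: [values]}; percent-decoding is omitted."""
    attrs = {}
    for pair in column.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attrs[key] = value.split(",")
    return attrs


def group_children(text):
    """Map parent ID -> {(type, feature ID or None): [(start, end), ...]}.

    A row with several parents is copied under each parent, and rows that
    share an ID under the same parent are collected as segments of one
    feature.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attributes(cols[8])
        feature_id = attrs.get("ID", [None])[0]
        for parent in attrs.get("Parent", []):
            grouped[parent][(ftype, feature_id)].append((start, end))
    return grouped
```

Here `group_children(GFF3)` places `exon00002` under both `mRNA00001` and `mRNA00002`, and gathers the two `cds00001` rows into one two-segment entry. Rows without an `ID` (as in the Ensembl and WormBase exons) all fall under the same `(type, None)` key in this sketch, so a real importer would need a different rule for them.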

Ensembl GRCh38

See GFF3

```
##gff-version 3
##sequence-region 19 1 58617616
#!genome-build Genome Reference Consortium GRCh38.p14
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession GCA_000001405.29
#!genebuild-last-updated 2023-07
19	ensembl_havana	gene	44905791	44909393	.	+	.	ID=gene:ENSG00000130203;Name=APOE;biotype=protein_coding;description=apolipoprotein E [Source:HGNC Symbol%3BAcc:HGNC:613];gene_id=ENSG00000130203;logic_name=ensembl_havana_gene_homo_sapiens;version=10
19	havana	mRNA	44905791	44908944	.	+	.	ID=transcript:ENST00000446996;Parent=gene:ENSG00000130203;Name=APOE-204;biotype=protein_coding;transcript_id=ENST00000446996;transcript_support_level=2;version=5
19	havana	exon	44905791	44905841	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001768924;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001768924;rank=1;version=1
19	havana	five_prime_UTR	44905791	44905841	.	+	.	Parent=transcript:ENST00000446996
19	havana	five_prime_UTR	44906587	44906624	.	+	.	Parent=transcript:ENST00000446996
19	havana	exon	44906587	44906667	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001667751;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001667751;rank=2;version=1
19	havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	exon	44908533	44908944	.	+	.	Parent=transcript:ENST00000446996;Name=ENSE00001664168;constitutive=0;ensembl_end_phase=0;ensembl_phase=2;exon_id=ENSE00001664168;rank=4;version=1
19	havana	CDS	44908533	44908944	.	+	1	ID=CDS:ENSP00000413135;Parent=transcript:ENST00000446996;protein_id=ENSP00000413135
19	havana	lnc_RNA	44905796	44907326	.	+	.	ID=transcript:ENST00000485628;Parent=gene:ENSG00000130203;Name=APOE-205;biotype=retained_intron;transcript_id=ENST00000485628;transcript_support_level=1;version=2
19	havana	exon	44905796	44905841	.	+	.	Parent=transcript:ENST00000485628;Name=ENSE00001048576;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001048576;rank=1;version=3
19	havana	exon	44906602	44907326	.	+	.	Parent=transcript:ENST00000485628;Name=ENSE00001943579;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001943579;rank=2;version=2
19	ensembl_havana	mRNA	44905796	44909393	.	+	.	ID=transcript:ENST00000252486;Parent=gene:ENSG00000130203;Name=APOE-201;biotype=protein_coding;ccdsid=CCDS12647.1;tag=basic,Ensembl_canonical,MANE_Select;transcript_id=ENST00000252486;transcript_support_level=1 (assigned to previous version 8);version=9
19	ensembl_havana	exon	44905796	44905841	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00001048576;constitutive=0;ensembl_end_phase=-1;ensembl_phase=-1;exon_id=ENSE00001048576;rank=1;version=3
19	ensembl_havana	five_prime_UTR	44905796	44905841	.	+	.	Parent=transcript:ENST00000252486
19	ensembl_havana	five_prime_UTR	44906602	44906624	.	+	.	Parent=transcript:ENST00000252486
19	ensembl_havana	exon	44906602	44906667	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00003577086;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00003577086;rank=2;version=1
19	ensembl_havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	ensembl_havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	CDS	44908533	44909250	.	+	1	ID=CDS:ENSP00000252486;Parent=transcript:ENST00000252486;protein_id=ENSP00000252486
19	ensembl_havana	exon	44908533	44909393	.	+	.	Parent=transcript:ENST00000252486;Name=ENSE00000893954;constitutive=0;ensembl_end_phase=-1;ensembl_phase=2;exon_id=ENSE00000893954;rank=4;version=3
19	ensembl_havana	three_prime_UTR	44909251	44909393	.	+	.	Parent=transcript:ENST00000252486
19	havana	mRNA	44905812	44909025	.	+	.	ID=transcript:ENST00000434152;Parent=gene:ENSG00000130203;Name=APOE-203;biotype=protein_coding;transcript_id=ENST00000434152;transcript_support_level=2;version=5
19	havana	five_prime_UTR	44905812	44905868	.	+	.	Parent=transcript:ENST00000434152
19	havana	exon	44905812	44905923	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00001601606;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001601606;rank=1;version=1
19	havana	CDS	44905869	44905923	.	+	0	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44906602	44906667	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00003463686;constitutive=0;ensembl_end_phase=1;ensembl_phase=1;exon_id=ENSE00003463686;rank=2;version=1
19	havana	CDS	44906602	44906667	.	+	2	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=3;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	exon	44908533	44909025	.	+	.	Parent=transcript:ENST00000434152;Name=ENSE00001700162;constitutive=0;ensembl_end_phase=0;ensembl_phase=2;exon_id=ENSE00001700162;rank=4;version=1
19	havana	CDS	44908533	44909025	.	+	1	ID=CDS:ENSP00000413653;Parent=transcript:ENST00000434152;protein_id=ENSP00000413653
19	havana	mRNA	44906360	44908954	.	+	.	ID=transcript:ENST00000425718;Parent=gene:ENSG00000130203;Name=APOE-202;biotype=protein_coding;transcript_id=ENST00000425718;transcript_support_level=1;version=1
19	havana	five_prime_UTR	44906360	44906624	.	+	.	Parent=transcript:ENST00000425718
19	havana	exon	44906360	44906667	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00001620702;constitutive=0;ensembl_end_phase=1;ensembl_phase=-1;exon_id=ENSE00001620702;rank=1;version=1
19	havana	CDS	44906625	44906667	.	+	0	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423
19	havana	exon	44907760	44907952	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00000893952;constitutive=0;ensembl_end_phase=2;ensembl_phase=1;exon_id=ENSE00000893952;rank=2;version=1
19	havana	CDS	44907760	44907952	.	+	2	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423
19	havana	exon	44908533	44908954	.	+	.	Parent=transcript:ENST00000425718;Name=ENSE00001599675;constitutive=0;ensembl_end_phase=1;ensembl_phase=2;exon_id=ENSE00001599675;rank=3;version=1
19	havana	CDS	44908533	44908954	.	+	1	ID=CDS:ENSP00000410423;Parent=transcript:ENST00000425718;protein_id=ENSP00000410423
```

RefSeq GRCh38

See GFF3

```
##gff-version 3
#!gff-spec-version 1.21
#!processor NCBI annotwriter
#!genome-build GRCh38.p14
#!genome-build-accession NCBI_Assembly:GCF_000001405.40
#!annotation-date 10/02/2023
#!annotation-source NCBI RefSeq GCF_000001405.40-RS_2023_10
##sequence-region NC_000019.10 1 58617616
##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
NC_000019.10	BestRefSeq	gene	44905796	44909393	.	+	.	ID=gene-APOE;Dbxref=GeneID:348,HGNC:HGNC:613,MIM:107741;Name=APOE;description=apolipoprotein E;gbkey=Gene;gene=APOE;gene_biotype=protein_coding;gene_synonym=AD2,APO-E,ApoE4,LDLCQ5,LPG
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_001302688.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302688.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44905796	44905923	.	+	.	ID=exon-NM_001302688.2-1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302688.2-2;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302688.2-3;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302688.2-4;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NM_001302688.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 1;transcript_id=NM_001302688.2
NC_000019.10	BestRefSeq	CDS	44905869	44905923	.	+	0	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44906602	44906667	.	+	2	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289617.1;Parent=rna-NM_001302688.2;Dbxref=GeneID:348,GenBank:NP_001289617.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289617.1;Note=isoform a precursor is encoded by transcript variant 1;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform a precursor;protein_id=NP_001289617.1
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_001302691.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302691.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44905796	44905841	.	+	.	ID=exon-NM_001302691.2-1;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44906587	44906667	.	+	.	ID=exon-NM_001302691.2-2;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302691.2-3;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302691.2-4;Parent=rna-NM_001302691.2;Dbxref=GeneID:348,GenBank:NM_001302691.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 5;transcript_id=NM_001302691.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289620.1;Parent=rna-NM_001302691.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289620.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289620.1;Note=isoform b precursor is encoded by transcript variant 5;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289620.1
NC_000019.10	BestRefSeq	mRNA	44905796	44909393	.	+	.	ID=rna-NM_000041.4;Parent=gene-APOE;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;Name=NM_000041.4;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44905796	44905841	.	+	.	ID=exon-NM_000041.4-1;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_000041.4-2;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_000041.4-3;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_000041.4-4;Parent=rna-NM_000041.4;Dbxref=Ensembl:ENST00000252486.9,GeneID:348,GenBank:NM_000041.4,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 2;tag=MANE Select;transcript_id=NM_000041.4
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_000032.1;Parent=rna-NM_000041.4;Dbxref=CCDS:CCDS12647.1,Ensembl:ENSP00000252486.3,GeneID:348,GenBank:NP_000032.1,HGNC:HGNC:613,MIM:107741;Name=NP_000032.1;Note=isoform b precursor is encoded by transcript variant 2;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_000032.1;tag=MANE Select
NC_000019.10	BestRefSeq	mRNA	44906021	44909393	.	+	.	ID=rna-NM_001302689.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302689.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44906021	44906044	.	+	.	ID=exon-NM_001302689.2-1;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302689.2-2;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302689.2-3;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302689.2-4;Parent=rna-NM_001302689.2;Dbxref=GeneID:348,GenBank:NM_001302689.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 3;transcript_id=NM_001302689.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289618.1;Parent=rna-NM_001302689.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289618.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289618.1;Note=isoform b precursor is encoded by transcript variant 3;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289618.1
NC_000019.10	BestRefSeq	mRNA	44906401	44909393	.	+	.	ID=rna-NM_001302690.2;Parent=gene-APOE;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;Name=NM_001302690.2;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44906401	44906524	.	+	.	ID=exon-NM_001302690.2-1;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44906602	44906667	.	+	.	ID=exon-NM_001302690.2-2;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44907760	44907952	.	+	.	ID=exon-NM_001302690.2-3;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	exon	44908533	44909393	.	+	.	ID=exon-NM_001302690.2-4;Parent=rna-NM_001302690.2;Dbxref=GeneID:348,GenBank:NM_001302690.2,HGNC:HGNC:613,MIM:107741;gbkey=mRNA;gene=APOE;product=apolipoprotein E%2C transcript variant 4;transcript_id=NM_001302690.2
NC_000019.10	BestRefSeq	CDS	44906625	44906667	.	+	0	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1
NC_000019.10	BestRefSeq	CDS	44907760	44907952	.	+	2	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1
NC_000019.10	BestRefSeq	CDS	44908533	44909250	.	+	1	ID=cds-NP_001289619.1;Parent=rna-NM_001302690.2;Dbxref=CCDS:CCDS12647.1,GeneID:348,GenBank:NP_001289619.1,HGNC:HGNC:613,MIM:107741;Name=NP_001289619.1;Note=isoform b precursor is encoded by transcript variant 4;gbkey=CDS;gene=APOE;product=apolipoprotein E isoform b precursor;protein_id=NP_001289619.1
```

WormBase C. elegans

See GFF3 ``` ##gff-version 3 ##sequence-region III 1 13783801 III WormBase gene 3573581 3578771 . + . ID=Gene:WBGene00003605;Name=WBGene00003605;locus=nhr-6;sequence_name=C48D5.1;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00003605;Alias=nhr-6,C48D5.1 III WormBase mRNA 3573581 3578771 . + . ID=Transcript:C48D5.1a.1;Parent=Gene:WBGene00003605;Name=C48D5.1a.1;wormpep=CE24859;locus=nhr-6;uniprot_id=P41829 III WormBase exon 3573581 3573740 . + . Parent=Transcript:C48D5.1a.1 III WormBase five_prime_UTR 3573581 3573698 . + . Parent=Transcript:C48D5.1a.1 III WormBase gene 3573678 3573736 . - . ID=Gene:WBGene00200413;Name=WBGene00200413;interpolated_map_position=-5.51505;sequence_name=C48D5.8;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00200413;Alias=C48D5.8 III WormBase ncRNA 3573678 3573736 . - . ID=Transcript:C48D5.8;Parent=Gene:WBGene00200413;Name=C48D5.8 III WormBase exon 3573678 3573736 . - . Parent=Transcript:C48D5.8 III WormBase CDS 3573699 3573740 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3573858 3573923 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3574148 3574250 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3574294 3574376 . + 2 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576192 3576269 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576317 3576468 . 
+ 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576609 3576803 . + 1 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576870 3577159 . + 1 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577205 3577362 . + 2 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577410 3577770 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577848 3577969 . + 2 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3578181 3578390 . + 0 ID=CDS:C48D5.1a;Parent=Transcript:C48D5.1a.1;Name=C48D5.1a;prediction_status=Confirmed;wormpep=CE24859;protein_id=CAA85271.2;locus=nhr-6;uniprot_id=P41829 III WormBase intron 3573741 3573857 . + . 
Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584632 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase mRNA 3573851 3578390 . + . ID=Transcript:C48D5.1b.1;Parent=Gene:WBGene00003605;Name=C48D5.1b.1;wormpep=CE42591;locus=nhr-6;uniprot_id=P41829 III WormBase exon 3573851 3573923 . + . Parent=Transcript:C48D5.1b.1 III WormBase five_prime_UTR 3573851 3573923 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3573858 3573923 . + . Parent=Transcript:C48D5.1a.1 III WormBase intron 3573924 3574147 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B III WormBase intron 3573924 3574147 . + . 
Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B Confirmed_EST L1_Nanopore_Roach_77626 %3B III WormBase exon 3574148 3574250 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3574148 3574250 . + . Parent=Transcript:C48D5.1b.1 III WormBase five_prime_UTR 3574148 3574250 . + . Parent=Transcript:C48D5.1b.1 III WormBase intron 3574251 3574293 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3574251 3574293 . + . 
Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK584062 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3574294 3574376 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3574294 3574376 . + . Parent=Transcript:C48D5.1b.1 III WormBase five_prime_UTR 3574294 3574376 . + . Parent=Transcript:C48D5.1b.1 III WormBase intron 3574377 3576191 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase intron 3574377 3576191 . + . 
Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase mRNA 3576060 3578754 . + . ID=Transcript:C48D5.1b.2;Parent=Gene:WBGene00003605;Name=C48D5.1b.2;wormpep=CE42591;locus=nhr-6;uniprot_id=P41829 III WormBase exon 3576060 3576269 . + . Parent=Transcript:C48D5.1b.2 III WormBase five_prime_UTR 3576060 3576269 . + . Parent=Transcript:C48D5.1b.2 III WormBase exon 3576192 3576269 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3576192 3576269 . + . Parent=Transcript:C48D5.1b.1 III WormBase five_prime_UTR 3576192 3576269 . + . Parent=Transcript:C48D5.1b.1 III WormBase intron 3576270 3576316 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576270 3576316 . + . 
Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576270 3576316 . + . Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK582994 %3B Confirmed_cDNA AY204167 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase five_prime_UTR 3576317 3576352 . + . Parent=Transcript:C48D5.1b.1 III WormBase five_prime_UTR 3576317 3576352 . + . Parent=Transcript:C48D5.1b.2 III WormBase exon 3576317 3576468 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3576317 3576468 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3576317 3576468 . + . Parent=Transcript:C48D5.1b.2 III WormBase CDS 3576353 3576468 . 
+ 0 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576609 3576803 . + 1 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3576870 3577159 . + 1 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577205 3577362 . + 2 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577410 3577770 . + 0 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3577848 3577969 . + 2 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase CDS 3578181 3578390 . + 0 ID=CDS:C48D5.1b;Parent=Transcript:C48D5.1b.1,Transcript:C48D5.1b.2;Name=C48D5.1b;prediction_status=Confirmed;wormpep=CE42591;protein_id=CAQ48391.1;locus=nhr-6;uniprot_id=P41829 III WormBase intron 3576469 3576608 . + . 
Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576469 3576608 . + . Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576469 3576608 . + . 
Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST OSTF021G8_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3576609 3576803 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3576609 3576803 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3576609 3576803 . + . Parent=Transcript:C48D5.1b.2 III WormBase intron 3576804 3576869 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576804 3576869 . + . 
Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3576804 3576869 . + . Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK579197 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3576870 3577159 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3576870 3577159 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3576870 3577159 . + . Parent=Transcript:C48D5.1b.2 III WormBase intron 3577160 3577204 . + . 
Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577160 3577204 . + . Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577160 3577204 . + . 
Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK582942 %3B Confirmed_cDNA U13076 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST OSTF020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3577205 3577362 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3577205 3577362 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3577205 3577362 . + . Parent=Transcript:C48D5.1b.2 III WormBase intron 3577363 3577409 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577363 3577409 . + . Parent=Transcript:C48D5.1b.1;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577363 3577409 . + . 
Parent=Transcript:C48D5.1b.2;Note=Confirmed_EST CK581015 %3B Confirmed_cDNA U13076 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3577410 3577770 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3577410 3577770 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3577410 3577770 . + . Parent=Transcript:C48D5.1b.2 III WormBase intron 3577771 3577847 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase intron 3577771 3577847 . + . Parent=Transcript:C48D5.1b.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase intron 3577771 3577847 . + . Parent=Transcript:C48D5.1b.2;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B Confirmed_EST adult_Nanopore_Roach_18170 %3B III WormBase exon 3577848 3577969 . + . Parent=Transcript:C48D5.1a.1 III WormBase exon 3577848 3577969 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3577848 3577969 . + . 
Parent=Transcript:C48D5.1b.2 III WormBase intron 3577970 3578180 . + . Parent=Transcript:C48D5.1a.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577970 3578180 . + . Parent=Transcript:C48D5.1b.1;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase intron 3577970 3578180 . + . Parent=Transcript:C48D5.1b.2;Note=Confirmed_cDNA U13076 %3B Confirmed_EST OSTR020G2_1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc1_g1_i1 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B Confirmed_EST L1_Nanopore_Roach_27632 %3B III WormBase exon 3578181 3578390 . + . Parent=Transcript:C48D5.1b.1 III WormBase exon 3578181 3578754 . + . Parent=Transcript:C48D5.1b.2 III WormBase exon 3578181 3578771 . + . Parent=Transcript:C48D5.1a.1 III WormBase three_prime_UTR 3578391 3578754 . + . Parent=Transcript:C48D5.1b.2 III WormBase three_prime_UTR 3578391 3578771 . + . Parent=Transcript:C48D5.1a.1 III WormBase gene 3578695 3581736 . - . ID=Gene:WBGene00008180;Name=WBGene00008180;interpolated_map_position=-5.46529;sequence_name=C48D5.3;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00008180;Alias=C48D5.3 III WormBase mRNA 3578695 3581736 . - . 
ID=Transcript:C48D5.3.1;Parent=Gene:WBGene00008180;Name=C48D5.3.1;wormpep=CE44237;uniprot_id=Q7YX49 III WormBase three_prime_UTR 3578695 3578881 . - . Parent=Transcript:C48D5.3.1 III WormBase exon 3578695 3579024 . - . Parent=Transcript:C48D5.3.1 III WormBase CDS 3578882 3579024 . - 2 ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49 III WormBase CDS 3580321 3580458 . - 2 ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49 III WormBase CDS 3581261 3581341 . - 2 ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49 III WormBase CDS 3581558 3581693 . - 0 ID=CDS:C48D5.3;Parent=Transcript:C48D5.3.1;Name=C48D5.3;prediction_status=Confirmed;wormpep=CE44237;protein_id=CAE17761.2;uniprot_id=Q7YX49 III WormBase intron 3579025 3580320 . - . Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L2_Nanopore_Roach_76766 %3B Confirmed_EST L2_Nanopore_Roach_76766 %3B III WormBase exon 3580321 3580458 . - . Parent=Transcript:C48D5.3.1 III WormBase intron 3580459 3581260 . - . Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B III WormBase exon 3581261 3581341 . - . Parent=Transcript:C48D5.3.1 III WormBase intron 3581342 3581557 . - . Parent=Transcript:C48D5.3.1;Note=Confirmed_EST FM247012 %3B Confirmed_EST elegans_PE_SS_GG2424%7Cc0_g1_i1 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B Confirmed_EST L3_Nanopore_Roach_42997 %3B III WormBase exon 3581558 3581736 . - . Parent=Transcript:C48D5.3.1 III WormBase five_prime_UTR 3581694 3581736 . - . Parent=Transcript:C48D5.3.1 III WormBase gene 3586143 3586247 . - . 
ID=Gene:WBGene00199419;Name=WBGene00199419;interpolated_map_position=-5.42563;sequence_name=C48D5.7;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00199419;Alias=C48D5.7 III WormBase ncRNA 3586143 3586247 . - . ID=Transcript:C48D5.7;Parent=Gene:WBGene00199419;Name=C48D5.7 III WormBase exon 3586143 3586247 . - . Parent=Transcript:C48D5.7 III WormBase_transposon transposable_element 3586983 3587234 . + . ID=Transposon:Predicted_PALTTTAAA2_10197;Name=Predicted_PALTTTAAA2_10197;family=PALTTTAAA2 III WormBase gene 3587601 3587747 . - . ID=Gene:WBGene00197446;Name=WBGene00197446;interpolated_map_position=-5.41583;sequence_name=C48D5.6;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00197446;Alias=C48D5.6 III WormBase ncRNA 3587601 3587747 . - . ID=Transcript:C48D5.6;Parent=Gene:WBGene00197446;Name=C48D5.6 III WormBase exon 3587601 3587747 . - . Parent=Transcript:C48D5.6 III WormBase_transposon transposable_element 3588011 3588163 . + . ID=Transposon:Predicted_HAT1_CE_10198;Name=Predicted_HAT1_CE_10198;family=HAT1_CE III WormBase gene 3588607 3588756 . - . ID=Gene:WBGene00196486;Name=WBGene00196486;interpolated_map_position=-5.40914;sequence_name=C48D5.5;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00196486;Alias=C48D5.5 III WormBase ncRNA 3588607 3588756 . - . ID=Transcript:C48D5.5;Parent=Gene:WBGene00196486;Name=C48D5.5 III WormBase exon 3588607 3588756 . - . Parent=Transcript:C48D5.5 III WormBase gene 3589207 3589355 . - . ID=Gene:WBGene00196329;Name=WBGene00196329;interpolated_map_position=-5.40517;sequence_name=C48D5.4;biotype=ncRNA;so_term_name=ncRNA_gene;curie=WB:WBGene00196329;Alias=C48D5.4 III WormBase ncRNA 3589207 3589355 . - . ID=Transcript:C48D5.4;Parent=Gene:WBGene00196329;Name=C48D5.4 III WormBase exon 3589207 3589355 . - . Parent=Transcript:C48D5.4 III WormBase gene 3590722 3611971 . + . 
ID=Gene:WBGene00004213;Name=WBGene00004213;locus=ptp-1;sequence_name=C48D5.2;biotype=protein_coding;so_term_name=protein_coding_gene;curie=WB:WBGene00004213;Alias=ptp-1,C48D5.2 III WormBase mRNA 3590722 3611971 . + . ID=Transcript:C48D5.2a.1;Parent=Gene:WBGene00004213;Name=C48D5.2a.1;wormpep=CE17578;locus=ptp-1;uniprot_id=P28191 III WormBase exon 3590722 3590907 . + . Parent=Transcript:C48D5.2a.1 III WormBase five_prime_UTR 3590722 3590769 . + . Parent=Transcript:C48D5.2a.1 III WormBase CDS 3590770 3590907 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3591624 3591740 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3591785 3592001 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3592397 3592523 . + 2 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3592604 3592850 . + 1 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3593165 3593391 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3594210 3594323 . + 1 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3594440 3594592 . 
+ 1 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3604815 3604917 . + 1 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3607049 3607232 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3607558 3607745 . + 2 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3608124 3608424 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3608994 3609222 . + 2 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3609685 3609772 . + 1 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3609824 3610099 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3610561 3610845 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase CDS 3611478 3611564 . + 0 ID=CDS:C48D5.2a;Parent=Transcript:C48D5.2a.1;Name=C48D5.2a;prediction_status=Confirmed;wormpep=CE17578;protein_id=CAA85272.1;locus=ptp-1;uniprot_id=P28191 III WormBase intron 3590908 3591623 . + . 
Parent=Transcript:C48D5.2a.1;Note=Confirmed_EST yk417b1.5 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc2_g2_i2 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B III WormBase exon 3591624 3591740 . + . Parent=Transcript:C48D5.2a.1 III WormBase intron 3591741 3591784 . + . Parent=Transcript:C48D5.2a.1;Note=Confirmed_EST FM247485 %3B Confirmed_EST elegans_PE_SS_GG6742%7Cc2_g2_i2 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B Confirmed_EST L2_Nanopore_Roach_6195 %3B ```

PlasmoDB P. falciparum

See GFF3

```
##gff-version 3
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=36329
##sequence-region Pf3D7_01_v3 1 640851
Pf3D7_01_v3	VEuPathDB	protein_coding_gene	74295	75622	.	+	.	ID=PF3D7_0101300;Name=MC-2TM;description=Pfmc-2TM Maurer's cleft two transmembrane protein;ebi_biotype=protein_coding
Pf3D7_01_v3	VEuPathDB	mRNA	74295	75622	.	+	.	ID=PF3D7_0101300.1;Parent=PF3D7_0101300;description=Pfmc-2TM Maurer's cleft two transmembrane protein;gene_ebi_biotype=protein_coding
Pf3D7_01_v3	VEuPathDB	exon	74295	74631	.	+	.	ID=exon_PF3D7_0101300.1-E1;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300
Pf3D7_01_v3	VEuPathDB	exon	74728	75622	.	+	.	ID=exon_PF3D7_0101300.1-E2;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300
Pf3D7_01_v3	VEuPathDB	CDS	74563	74631	.	+	0	ID=PF3D7_0101300.1-p1-CDS1;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300;protein_source_id=PF3D7_0101300.1-p1
Pf3D7_01_v3	VEuPathDB	CDS	74728	75366	.	+	0	ID=PF3D7_0101300.1-p1-CDS2;Parent=PF3D7_0101300.1;gene_id=PF3D7_0101300;protein_source_id=PF3D7_0101300.1-p1
Pf3D7_01_v3	VEuPathDB	five_prime_UTR	74295	74562	.	+	.	ID=utr_PF3D7_0101300.1_1;Parent=PF3D7_0101300.1
Pf3D7_01_v3	VEuPathDB	three_prime_UTR	75367	75622	.	+	.	ID=utr_PF3D7_0101300.1_2;Parent=PF3D7_0101300.1
```

We need to decide what our standard internal representation will be before we can work out how to standardize the data on import.
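To make the "UTRs can be calculated" idea concrete, here is a minimal Python sketch (not Apollo code; `derive_utrs` and `Interval` are hypothetical names used only for illustration). It assumes a transcript is described by its exon and CDS intervals in GFF3 coordinates (1-based, inclusive) and treats every exonic base outside the CDS span as UTR. The usage example uses the coordinates from the PlasmoDB transcript above and reproduces the UTR records in that file.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive, as in GFF3


def derive_utrs(
    exons: List[Interval], cds: List[Interval], strand: str
) -> Tuple[List[Interval], List[Interval]]:
    """Return (five_prime_utrs, three_prime_utrs) for one transcript.

    Hypothetical helper: exonic bases left of the first CDS base or right of
    the last CDS base are reported as UTR segments.
    """
    if not cds:
        # Non-coding transcript: nothing to derive in this sketch.
        return [], []
    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)
    left, right = [], []
    for start, end in sorted(exons):
        if start < cds_start:
            left.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            right.append((max(start, cds_end + 1), end))
    # On the minus strand the transcript reads right to left, so the roles swap.
    if strand == "-":
        return right, left
    return left, right


# Exon and CDS coordinates of PF3D7_0101300.1 from the PlasmoDB example.
exons = [(74295, 74631), (74728, 75622)]
cds = [(74563, 74631), (74728, 75366)]
print(derive_utrs(exons, cds, "+"))
# -> ([(74295, 74562)], [(75367, 75622)])
```

If something like this ran on import, UTR coordinates would not need to be stored, though any attributes attached to the original UTR records (such as their IDs) would still be lost unless kept elsewhere.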

dariober commented 2 months ago

Here are some other examples:

The Arabidopsis Information Resource

Source: https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes_transposons.gff

See GFF3 ``` Chr1 TAIR10 gene 5928 8737 . - . ID=AT1G01020;Note=protein_coding_gene;Name=AT1G01020 Chr1 TAIR10 mRNA 5928 8737 . - . ID=AT1G01020.1;Parent=AT1G01020;Name=AT1G01020.1;Index=1 Chr1 TAIR10 protein 6915 8666 . - . ID=AT1G01020.1-Protein;Name=AT1G01020.1;Derives_from=AT1G01020.1 Chr1 TAIR10 five_prime_UTR 8667 8737 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 8571 8666 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 8571 8737 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 8417 8464 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 8417 8464 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 8236 8325 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 8236 8325 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 7942 7987 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 7942 7987 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 7762 7835 . - 2 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 7762 7835 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 7564 7649 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 7564 7649 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 7384 7450 . - 1 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 7384 7450 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 7157 7232 . - 0 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 exon 7157 7232 . - . Parent=AT1G01020.1 Chr1 TAIR10 CDS 6915 7069 . - 2 Parent=AT1G01020.1,AT1G01020.1-Protein; Chr1 TAIR10 three_prime_UTR 6437 6914 . - . Parent=AT1G01020.1 Chr1 TAIR10 exon 6437 7069 . - . Parent=AT1G01020.1 Chr1 TAIR10 three_prime_UTR 5928 6263 . - . Parent=AT1G01020.1 Chr1 TAIR10 exon 5928 6263 . - . Parent=AT1G01020.1 Chr1 TAIR10 mRNA 6790 8737 . - . ID=AT1G01020.2;Parent=AT1G01020;Name=AT1G01020.2;Index=1 Chr1 TAIR10 protein 7315 8666 . - . ID=AT1G01020.2-Protein;Name=AT1G01020.2;Derives_from=AT1G01020.2 Chr1 TAIR10 five_prime_UTR 8667 8737 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 8571 8666 . - 0 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 8571 8737 . 
- . Parent=AT1G01020.2 Chr1 TAIR10 CDS 8417 8464 . - 0 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 8417 8464 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 8236 8325 . - 0 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 8236 8325 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 7942 7987 . - 0 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 7942 7987 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 7762 7835 . - 2 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 7762 7835 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 7564 7649 . - 0 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 exon 7564 7649 . - . Parent=AT1G01020.2 Chr1 TAIR10 CDS 7315 7450 . - 1 Parent=AT1G01020.2,AT1G01020.2-Protein; Chr1 TAIR10 three_prime_UTR 7157 7314 . - . Parent=AT1G01020.2 Chr1 TAIR10 exon 7157 7450 . - . Parent=AT1G01020.2 Chr1 TAIR10 three_prime_UTR 6790 7069 . - . Parent=AT1G01020.2 Chr1 TAIR10 exon 6790 7069 . - . Parent=AT1G01020.2 ```

Braker

Braker is a popular genome annotation program.

Its output depends on the settings. For one of our GFF3 files from Braker 2, we get these feature types:

cut -f 3 output/ME49/braker/augustus.hints.gff3 | sort | uniq -c
  47527 CDS
  47527 exon
   6724 gene
  40227 intron
   7302 mRNA
   7300 start_codon
   7302 stop_codon

Note that it includes intron, start_codon, and stop_codon features.

See GFF3

```
CM033580.1	AUGUSTUS	gene	15529	16566	0.92	-	.	ID=g1;
CM033580.1	AUGUSTUS	mRNA	15529	16566	0.92	-	.	ID=g1.t1;Parent=g1;
CM033580.1	AUGUSTUS	stop_codon	15529	15531	.	-	0	ID=g1.t1.stop1;Parent=g1.t1;
CM033580.1	AUGUSTUS	CDS	15529	15659	0.92	-	2	ID=g1.t1.CDS1;Parent=g1.t1;
CM033580.1	AUGUSTUS	exon	15529	15659	.	-	.	ID=g1.t1.exon1;Parent=g1.t1;
CM033580.1	AUGUSTUS	intron	15660	16112	0.96	-	.	ID=g1.t1.intron1;Parent=g1.t1;
CM033580.1	AUGUSTUS	CDS	16113	16314	0.96	-	0	ID=g1.t1.CDS2;Parent=g1.t1;
CM033580.1	AUGUSTUS	exon	16113	16314	.	-	.	ID=g1.t1.exon2;Parent=g1.t1;
CM033580.1	AUGUSTUS	intron	16315	16536	0.96	-	.	ID=g1.t1.intron2;Parent=g1.t1;
CM033580.1	AUGUSTUS	CDS	16537	16566	0.99	-	0	ID=g1.t1.CDS3;Parent=g1.t1;
CM033580.1	AUGUSTUS	exon	16537	16566	.	-	.	ID=g1.t1.exon3;Parent=g1.t1;
CM033580.1	AUGUSTUS	start_codon	16564	16566	.	-	0	ID=g1.t1.start1;Parent=g1.t1;
CM033580.1	AUGUSTUS	gene	19185	21532	0.28	-	.	ID=g2;
CM033580.1	AUGUSTUS	mRNA	19185	21532	0.28	-	.	ID=g2.t1;Parent=g2;
CM033580.1	AUGUSTUS	stop_codon	19185	19187	.	-	0	ID=g2.t1.stop1;Parent=g2.t1;
CM033580.1	AUGUSTUS	CDS	19185	19234	0.5	-	2	ID=g2.t1.CDS1;Parent=g2.t1;
CM033580.1	AUGUSTUS	exon	19185	19234	.	-	.	ID=g2.t1.exon1;Parent=g2.t1;
CM033580.1	AUGUSTUS	intron	19235	19334	0.5	-	.	ID=g2.t1.intron1;Parent=g2.t1;
CM033580.1	AUGUSTUS	CDS	19335	19412	0.79	-	2	ID=g2.t1.CDS2;Parent=g2.t1;
CM033580.1	AUGUSTUS	exon	19335	19412	.	-	.	ID=g2.t1.exon2;Parent=g2.t1;
CM033580.1	AUGUSTUS	intron	19413	20696	0.74	-	.	ID=g2.t1.intron2;Parent=g2.t1;
CM033580.1	AUGUSTUS	CDS	20697	20802	0.71	-	0	ID=g2.t1.CDS3;Parent=g2.t1;
CM033580.1	AUGUSTUS	exon	20697	20802	.	-	.	ID=g2.t1.exon3;Parent=g2.t1;
CM033580.1	AUGUSTUS	intron	20803	21349	0.83	-	.	ID=g2.t1.intron3;Parent=g2.t1;
CM033580.1	AUGUSTUS	CDS	21350	21532	0.72	-	0	ID=g2.t1.CDS4;Parent=g2.t1;
CM033580.1	AUGUSTUS	exon	21350	21532	.	-	.	ID=g2.t1.exon4;Parent=g2.t1;
CM033580.1	AUGUSTUS	start_codon	21530	21532	.	-	0	ID=g2.t1.start1;Parent=g2.t1;
CM033580.1	AUGUSTUS	gene	21699	25646	0.16	-	.	ID=g3;
CM033580.1	AUGUSTUS	mRNA	21699	25646	0.16	-	.	ID=g3.t1;Parent=g3;
CM033580.1	AUGUSTUS	stop_codon	21699	21701	.	-	0	ID=g3.t1.stop1;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	21699	21735	0.51	-	1	ID=g3.t1.CDS1;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	21699	21735	.	-	.	ID=g3.t1.exon1;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	21736	22067	0.5	-	.	ID=g3.t1.intron1;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	22068	22152	0.56	-	2	ID=g3.t1.CDS2;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	22068	22152	.	-	.	ID=g3.t1.exon2;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	22153	22645	1	-	.	ID=g3.t1.intron2;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	22646	22687	1	-	2	ID=g3.t1.CDS3;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	22646	22687	.	-	.	ID=g3.t1.exon3;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	22688	23081	0.93	-	.	ID=g3.t1.intron3;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	23082	23107	0.93	-	1	ID=g3.t1.CDS4;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	23082	23107	.	-	.	ID=g3.t1.exon4;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	23108	23332	0.89	-	.	ID=g3.t1.intron4;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	23333	23374	0.89	-	1	ID=g3.t1.CDS5;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	23333	23374	.	-	.	ID=g3.t1.exon5;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	23375	23746	0.88	-	.	ID=g3.t1.intron5;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	23747	23793	0.76	-	0	ID=g3.t1.CDS6;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	23747	23793	.	-	.	ID=g3.t1.exon6;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	23794	24077	0.74	-	.	ID=g3.t1.intron6;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	24078	24250	0.83	-	2	ID=g3.t1.CDS7;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	24078	24250	.	-	.	ID=g3.t1.exon7;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	24251	24669	0.99	-	.	ID=g3.t1.intron7;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	24670	24742	0.67	-	0	ID=g3.t1.CDS8;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	24670	24742	.	-	.	ID=g3.t1.exon8;Parent=g3.t1;
CM033580.1	AUGUSTUS	intron	24743	25466	0.59	-	.	ID=g3.t1.intron8;Parent=g3.t1;
CM033580.1	AUGUSTUS	CDS	25467	25646	0.34	-	0	ID=g3.t1.CDS9;Parent=g3.t1;
CM033580.1	AUGUSTUS	exon	25467	25646	.	-	.	ID=g3.t1.exon9;Parent=g3.t1;
CM033580.1	AUGUSTUS	start_codon	25644	25646	.	-	0	ID=g3.t1.start1;Parent=g3.t1;
```
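As a rough illustration of why intron, start_codon, and stop_codon features could probably be dropped when standardizing, here is a minimal sketch that recomputes them from exon and CDS locations. The `Span` shape and both function names are hypothetical, not part of Apollo's code. The codon helper assumes the stop codon is included in the CDS (as it is in the AUGUSTUS output above) and that neither codon is split across an exon boundary.

```typescript
// Hypothetical minimal shape for illustration; not Apollo's actual feature model.
interface Span {
  start: number // 1-based, inclusive (GFF3 convention)
  end: number // 1-based, inclusive
}

// Introns are the gaps between consecutive exons of a single transcript.
function deriveIntrons(exons: Span[]): Span[] {
  const sorted = [...exons].sort((a, b) => a.start - b.start)
  const introns: Span[] = []
  let previousEnd: number | undefined
  for (const exon of sorted) {
    if (previousEnd !== undefined && exon.start > previousEnd + 1) {
      introns.push({ start: previousEnd + 1, end: exon.start - 1 })
    }
    previousEnd =
      previousEnd === undefined ? exon.end : Math.max(previousEnd, exon.end)
  }
  return introns
}

// Assumes at least one CDS segment, a stop codon that is part of the CDS,
// and codons that are not split across an exon boundary.
function deriveCodons(
  cds: Span[],
  strand: '+' | '-',
): { startCodon: Span; stopCodon: Span } {
  const min = Math.min(...cds.map((c) => c.start))
  const max = Math.max(...cds.map((c) => c.end))
  const lowest: Span = { start: min, end: min + 2 }
  const highest: Span = { start: max - 2, end: max }
  return strand === '+'
    ? { startCodon: lowest, stopCodon: highest }
    : { startCodon: highest, stopCodon: lowest }
}

// Exons of g1.t1 from the example above (its CDS segments are identical)
const g1t1: Span[] = [
  { start: 15529, end: 15659 },
  { start: 16113, end: 16314 },
  { start: 16537, end: 16566 },
]
const introns = deriveIntrons(g1t1)
const codons = deriveCodons(g1t1, '-')
console.log(introns, codons)
```

For g1.t1 this reproduces the two intron rows (15660-16112 and 16315-16536) and the start_codon and stop_codon rows from the file, so nothing positional would be lost by dropping them. Their IDs and scores would still be lost.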

Tomato

Source: https://solgenomics.net/ftp/tomato_genome/annotation/ITAG4.0_release/ITAG4.0_gene_models.gff

There is nothing unusual here. Every feature has a unique identifier. Feature types present: CDS, exon, five_prime_UTR, gene, mRNA, three_prime_UTR

See GFF3

```
##gff-version 3
##sequence-regionSL4.0ch00 1 9643250
##sequence-regionSL4.0ch01 1 90863682
##sequence-regionSL4.0ch02 1 53473368
##sequence-regionSL4.0ch03 1 65298490
##sequence-regionSL4.0ch04 1 64459972
##sequence-regionSL4.0ch05 1 65269487
##sequence-regionSL4.0ch06 1 47258699
##sequence-regionSL4.0ch07 1 67883646
##sequence-regionSL4.0ch08 1 63995357
##sequence-regionSL4.0ch09 1 68513564
##sequence-regionSL4.0ch10 1 64792705
##sequence-regionSL4.0ch11 1 54379777
##sequence-regionSL4.0ch12 1 66688036
SL4.0ch00	maker_ITAG	gene	93750	94430	.	+	.	ID=gene:Solyc00g500001.1;Alias=Solyc00g500001;Name=Solyc00g500001.1;length=680
SL4.0ch00	maker_ITAG	mRNA	93750	94430	.	+	.	ID=mRNA:Solyc00g500001.1.1;Parent=gene:Solyc00g500001.1;Name=Solyc00g500001.1.1;Note=Retrovirus-related Pol polyprotein from transposon TNT 1-94 (AHRD V3.3 *-* A0A2I0VJ33_9ASPA);_AED=0.01;_QI=0|-1|0|1|-1|0|1|0|227;_eAED=0.01
SL4.0ch00	maker_ITAG	exon	93750	94430	.	+	.	ID=exon:Solyc00g500001.1.1.1;Parent=mRNA:Solyc00g500001.1.1
SL4.0ch00	maker_ITAG	CDS	93750	94430	.	+	0	ID=CDS:Solyc00g500001.1.1.1;Parent=mRNA:Solyc00g500001.1.1
###
SL4.0ch00	maker_ITAG	gene	305442	306257	.	-	.	ID=gene:Solyc00g500002.1;Alias=Solyc00g500002;Name=Solyc00g500002.1;length=815
SL4.0ch00	maker_ITAG	mRNA	305442	306257	.	-	.	ID=mRNA:Solyc00g500002.1.1;Parent=gene:Solyc00g500002.1;Name=Solyc00g500002.1.1;Note=Retrovirus-related Pol polyprotein from transposon TNT 1-94 (AHRD V3.3 *-* A0A2I0VBY8_9ASPA);_AED=0.10;_QI=384|-1|0|1|-1|0|1|0|144;_eAED=0.41
SL4.0ch00	maker_ITAG	CDS	305442	305873	.	-	0	ID=CDS:Solyc00g500002.1.1.1;Parent=mRNA:Solyc00g500002.1.1
SL4.0ch00	maker_ITAG	exon	305442	306257	.	-	.	ID=exon:Solyc00g500002.1.1.1;Parent=mRNA:Solyc00g500002.1.1
SL4.0ch00	maker_ITAG	five_prime_UTR	305874	306257	.	-	.	ID=five_prime_UTR:Solyc00g500002.1.1.0;Parent=mRNA:Solyc00g500002.1.1
###
SL4.0ch00	maker_ITAG	gene	311496	382066	.	-	.	ID=gene:Solyc00g500003.1;Alias=Solyc00g500003;Name=Solyc00g500003.1;length=70570
SL4.0ch00	maker_ITAG	mRNA	311496	382066	.	-	.	ID=mRNA:Solyc00g500003.1.1;Parent=gene:Solyc00g500003.1;Name=Solyc00g500003.1.1;Note=MP domain-containing protein (AHRD V3.3 *-* A0A1Q3D0H5_CEPFO);_AED=0.30;_QI=0|0|0|0.16|0|0|6|0|554;_eAED=0.30
SL4.0ch00	maker_ITAG	exon	311496	311570	.	-	.	ID=exon:Solyc00g500003.1.1.1;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	311496	311570	.	-	0	ID=CDS:Solyc00g500003.1.1.1;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	exon	330270	330628	.	-	.	ID=exon:Solyc00g500003.1.1.2;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	330270	330628	.	-	2	ID=CDS:Solyc00g500003.1.1.2;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	exon	344080	344133	.	-	.	ID=exon:Solyc00g500003.1.1.3;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	344080	344133	.	-	2	ID=CDS:Solyc00g500003.1.1.3;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	exon	347298	347428	.	-	.	ID=exon:Solyc00g500003.1.1.4;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	347298	347428	.	-	1	ID=CDS:Solyc00g500003.1.1.4;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	exon	351799	352644	.	-	.	ID=exon:Solyc00g500003.1.1.5;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	351799	352644	.	-	1	ID=CDS:Solyc00g500003.1.1.5;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	exon	381867	382066	.	-	.	ID=exon:Solyc00g500003.1.1.6;Parent=mRNA:Solyc00g500003.1.1
SL4.0ch00	maker_ITAG	CDS	381867	382066	.	-	0	ID=CDS:Solyc00g500003.1.1.6;Parent=mRNA:Solyc00g500003.1.1
```
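To make the point about UTRs being calculable more concrete, here is a minimal sketch that derives them from the exons and CDS segments of a single transcript. The `Span` and `Utrs` shapes and the `deriveUtrs` name are made up for illustration and do not reflect Apollo's actual feature model.

```typescript
// Hypothetical shapes for illustration; not Apollo's actual feature model.
interface Span {
  start: number // 1-based, inclusive (GFF3 convention)
  end: number // 1-based, inclusive
}
interface Utrs {
  fivePrime: Span[]
  threePrime: Span[]
}

// Any exonic sequence outside the CDS extent is UTR; the strand decides which
// side is 5'. Expects the exons and CDS segments of a single transcript.
function deriveUtrs(exons: Span[], cds: Span[], strand: '+' | '-'): Utrs {
  const utrs: Utrs = { fivePrime: [], threePrime: [] }
  if (cds.length === 0) {
    return utrs // non-coding transcript: nothing to derive
  }
  const cdsMin = Math.min(...cds.map((c) => c.start))
  const cdsMax = Math.max(...cds.map((c) => c.end))
  const lowSide = strand === '+' ? utrs.fivePrime : utrs.threePrime
  const highSide = strand === '+' ? utrs.threePrime : utrs.fivePrime
  for (const exon of exons) {
    if (exon.start < cdsMin) {
      lowSide.push({ start: exon.start, end: Math.min(exon.end, cdsMin - 1) })
    }
    if (exon.end > cdsMax) {
      highSide.push({ start: Math.max(exon.start, cdsMax + 1), end: exon.end })
    }
  }
  return utrs
}

// Solyc00g500002.1.1 from the example above (minus strand)
const derived = deriveUtrs(
  [{ start: 305442, end: 306257 }],
  [{ start: 305442, end: 305873 }],
  '-',
)
console.log(derived)
```

For Solyc00g500002.1.1 this gives a single five_prime_UTR at 305874-306257, matching the row in the file. The UTR's own ID is the only thing that could not be recovered this way.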
garrettjstevens commented 3 weeks ago

@kyostiebi Attached here is a GFF3 that has genes in several different formats. Currently, the changes that load data into the new feature model are only guaranteed to work with the first gene format in this file.

Could you update the importing code in the new feature model branch you've been working on so that it handles all the cases in the attached GFF3? All cases in this file should end up with the same gene model (just with the position offset by 10000 bases).

gene_representations.gff3.gz
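
One possible way to check the "same gene model" expectation is sketched below: rebase each imported gene to its own start coordinate, put children in a stable order, and compare the results. The `Feature` shape and the function names are hypothetical and are not taken from the new feature model branch. The sketch also assumes the cases differ only by a constant coordinate shift.

```typescript
// Hypothetical feature shape for illustration; not Apollo's actual model.
interface Feature {
  type: string
  start: number
  end: number
  strand: '+' | '-'
  children: Feature[]
}

function byPosition(a: Feature, b: Feature): number {
  return a.start - b.start || a.end - b.end || a.type.localeCompare(b.type)
}

// Rebuild the feature with coordinates relative to `origin` and children in a
// stable order, so that equal models serialize to equal strings.
function normalize(feature: Feature, origin: number): Feature {
  return {
    type: feature.type,
    start: feature.start - origin,
    end: feature.end - origin,
    strand: feature.strand,
    children: feature.children
      .map((child) => normalize(child, origin))
      .sort(byPosition),
  }
}

function sameGeneModel(a: Feature, b: Feature): boolean {
  return (
    JSON.stringify(normalize(a, a.start)) ===
    JSON.stringify(normalize(b, b.start))
  )
}

// Small helpers to build made-up test features
function makeExon(start: number, end: number): Feature {
  return { type: 'exon', start, end, strand: '+', children: [] }
}
function makeGene(start: number, end: number, exons: Feature[]): Feature {
  return { type: 'gene', start, end, strand: '+', children: exons }
}

const firstGene = makeGene(1000, 2000, [
  makeExon(1000, 1200),
  makeExon(1800, 2000),
])
// Same structure shifted by 10000 bases, with children listed in another order
const shiftedGene = makeGene(11000, 12000, [
  makeExon(11800, 12000),
  makeExon(11000, 11200),
])
console.log(sameGeneModel(firstGene, shiftedGene)) // true
```

Comparing serialized, rebased copies ignores IDs and attributes on purpose, since those are expected to differ between the cases in the file.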