genometools / genometools

GenomeTools genome analysis system.

gff3validator "Sequence Ontology" out of date? #1019

Closed rzelle-lallemand closed 1 year ago

rzelle-lallemand commented 1 year ago

Problem description

The support team of the Saccharomyces Genome Database just wrote to me: "Our current GFF3 file can be found here:".

I tried to validate this GFF with the online gff3validator, but am getting the validation error:

Validation unsuccessful!

GenomeTools error: type "uORF" on line 164 in file "/var/www/servers/" is not a valid one

However, it looks like this is a valid Sequence Ontology term that was added in 2014:

The gff3validator webpage also says "Last update: 2015-01-25", while the Sequence Ontology has received updates since then (although they don't seem to have (many) versioned releases). Could the online gff3validator be updated with a current copy of the Sequence Ontology?

Exact command line call triggering the problem


Example minimal input triggering the problem

What GenomeTools version are you reporting an issue for (as output by gt -version)?

GFF3 online validator Last update: 2015-01-25

Did you compile GenomeTools from source? If so, please state the make parameters used.


What operating system (e.g. Ubuntu, Mac OS X), OS version (e.g. 15.10, 10.11) and platform (e.g. x86_64) are you using?


satta commented 1 year ago

You're right, the OBO file on the webserver hasn't been updated in a while; it doesn't have this type yet:

$ fgrep uORF genometools_for_web/gtdata/obo_files/so.obo

Should be easy to update though, since the current GenomeTools distribution from git already has a newer version of so.obo that includes this type:

$ fgrep uORF gtdata/obo_files/so.obo
name: uORF
synonym: "regulatory uORF" EXACT []
name: AUG_initiated_uORF
def: "A uORF beginning with the canonical start codon AUG." [PMID:26684391, PMID:27313038]
synonym: "AUG initiated uORF" EXACT []
is_a: SO:0002027 ! uORF
name: non_AUG_initiated_uORF
def: "A uORF beginning with a codon other than AUG." [PMID:26684391, PMID:27313038]
synonym: "non AUG initiated uORF" EXACT []
is_a: SO:0002027 ! uORF
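Note that fgrep matches substrings, so the output above also lists AUG_initiated_uORF and non_AUG_initiated_uORF. For an exact check, a minimal Python sketch could collect the `name:` values from the `[Term]` stanzas of an OBO file and test membership. The function name `obo_term_names` and the trimmed sample text are assumptions made for illustration; this is not how GenomeTools itself reads so.obo.

```python
def obo_term_names(obo_text):
    """Return the set of `name:` values found inside [Term] stanzas."""
    names = set()
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza header; only [Term] stanzas are collected here.
            in_term = (line == "[Term]")
        elif in_term and line.startswith("name:"):
            names.add(line[len("name:"):].strip())
    return names


# Hypothetical, heavily trimmed excerpt modelled on the fgrep output above.
SAMPLE = """\
[Term]
id: SO:0002027
name: uORF
synonym: "regulatory uORF" EXACT []

[Term]
name: AUG_initiated_uORF
is_a: SO:0002027 ! uORF

[Typedef]
id: part_of
name: part_of
"""

names = obo_term_names(SAMPLE)
print("uORF" in names)      # True
print("part_of" in names)   # False: it is defined in a [Typedef] stanza
```

To run this against a real file, the contents of so.obo would be read with `open(path).read()` and passed to `obo_term_names`.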

When trying to validate your file against this more recent version (with the standalone validator), I get a new issue:

$ ./bin/gt gff3validator -typecheck gtdata/obo_files/so.obo ~/Downloads/saccharomyces_cerevisiae.20230315.gff
./bin/gt gff3validator: error: the child feature with type 'transposable_element' on line 401 in file "/home/satta/Downloads/saccharomyces_cerevisiae.20230315.gff" is not part-of parent feature with type 'transposable_element_gene' given on line 399 (according to type checker 'OBO file gtdata/obo_files/so.obo')

which is correct since the part-of relationship is swapped in that situation:

chrI    SGD     transposable_element_gene       160597  164187  .       -       .       ID=YAR009C;Name=YAR009C;Alias=YARCTyB1-1,truncated%20gag-pol%20fusion%20protein;Ontology_term=GO:0000943,GO:0003723,GO:0003887,GO:0003964,GO:0004540,GO:0005634,GO:0005737,GO:0008233,GO:0032197,SO:0000704;Note=Retrotransposon%20TYA%20Gag%20and%20TYB%20Pol%20genes%3B%20Gag%20processing%20produces%20capsid%20proteins%2C%20Pol%20is%20cleaved%20to%20produce%20protease%2C%20reverse%20transcriptase%20and%20integrase%20activities%3B%20in%20YARCTy1-1%20TYB%20is%20mutant%20and%20probably%20non-functional%3B%20protein%20product%20forms%20cytoplasmic%20foci%20upon%20DNA%20replication%20stress;display=Retrotransposon%20TYA%20Gag%20and%20TYB%20Pol%20genes;dbxref=SGD:S000000067;curie=SGD:S000000067
chrI    SGD     transposable_element    160597  164187  .       -       .       ID=YAR009C_transposable_element;Name=YAR009C_transposable_element;Parent=YAR009C
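To get an overview of which parent/child type combinations a GFF3 file contains, such as the pair in the two lines above, a rough Python sketch could resolve each `Parent` attribute to the type of the feature with that `ID`. The function name `parent_child_type_pairs` is hypothetical, and the sample lines are shortened versions of the two records above. The sketch only lists the pairs; it does not check them against the Sequence Ontology, which is what `gt gff3validator -typecheck` does.

```python
def parent_child_type_pairs(gff3_lines):
    """Return (parent_type, child_type, child_line_number) in file order."""
    type_by_id = {}
    children = []  # (parent_id, child_type, line_number)
    for lineno, raw in enumerate(gff3_lines, start=1):
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = {}
        for field in cols[8].split(";"):
            key, sep, value = field.partition("=")
            if sep:
                attrs[key.strip()] = value
        if "ID" in attrs:
            type_by_id[attrs["ID"]] = cols[2]
        # A feature may name several parents, separated by commas.
        for parent_id in attrs.get("Parent", "").split(","):
            if parent_id:
                children.append((parent_id, cols[2], lineno))
    # Parents whose ID never appears in the input are silently skipped.
    return [(type_by_id[p], t, n) for p, t, n in children if p in type_by_id]


# Shortened versions of the two records above (attributes trimmed).
SAMPLE = [
    "\t".join(["chrI", "SGD", "transposable_element_gene", "160597",
               "164187", ".", "-", ".", "ID=YAR009C;Name=YAR009C"]),
    "\t".join(["chrI", "SGD", "transposable_element", "160597",
               "164187", ".", "-", ".",
               "ID=YAR009C_transposable_element;Parent=YAR009C"]),
]

for parent_type, child_type, lineno in parent_child_type_pairs(SAMPLE):
    print(f"line {lineno}: {child_type} is a child of {parent_type}")
# prints: line 2: transposable_element is a child of transposable_element_gene
```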

The transposable_element_gene should be part-of the transposable_element, not the other way around. See

I will update the so.obo file on the webserver soon and would ask you to please use the standalone validator in the meantime. Thanks for letting us know!

rzelle-lallemand commented 1 year ago

> I will update the so.obo file on the webserver soon

Thanks, also for the additional sleuthing into this GFF!

satta commented 1 year ago

This is done now; the validator on the website runs with a more recent SO version.