I use GFF3 as the primary export format for hundreds of prokaryotic and eukaryotic genomes. While the specification defines the structure well for coding genes, it would be great to have some clarifications, and even best practices, standardized in a future release. Considerations include:
Non-coding gene encoding in GFF3. This should include examples of tRNAs, rRNAs, etc. What does the gene graph look like for these?
Functional annotation standards. In the 9th column, can we decide on some standardized keys for things like gene product names and gene symbols? Others, such as GO terms and EC numbers, are already well described using Dbxrefs, but even these could be expanded to allow attribution of the sources of these terms, as well as GO evidence codes.
Unless some of these are formally part of the specification, competing standards can emerge from the large organizations, such as EMBL and now NCBI with its support for GFF3.
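To make the two questions concrete, here is a rough Python sketch of one possible gene graph for a tRNA (gene, then tRNA, then exon) and of how the 9th column might be read. Everything in it is an assumption for illustration: the coordinates, IDs, and source name are made up, the three-level graph is just one candidate layout rather than something the specification mandates, and `product` is a hypothetical key that is not among the reserved attribute tags.

```python
from urllib.parse import unquote

# Hypothetical records: one candidate gene graph for a tRNA.
# "product" is an illustrative key, not a reserved GFF3 attribute tag.
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\texample\tgene\t1000\t1072\t.\t+\t.\tID=gene0001;Name=tRNA-Ala\n"
    "chr1\texample\ttRNA\t1000\t1072\t.\t+\t.\t"
    "ID=trna0001;Parent=gene0001;product=tRNA-Ala\n"
    "chr1\texample\texon\t1000\t1072\t.\t+\t.\tID=exon0001;Parent=trna0001\n"
)


def parse_attributes(column9):
    """Split column 9 into {tag: [values]}, decoding URL escapes."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        # Multiple values for one tag are comma-separated.
        attrs[unquote(key)] = [unquote(v) for v in value.split(",")]
    return attrs


def build_graph(gff_text):
    """Map each feature ID to (feature type, list of parent IDs)."""
    graph = {}
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        attrs = parse_attributes(cols[8])
        graph[attrs["ID"][0]] = (cols[2], attrs.get("Parent", []))
    return graph


if __name__ == "__main__":
    for feature_id, (ftype, parents) in build_graph(EXAMPLE_GFF3).items():
        print(feature_id, ftype, parents)
```

Whether the exon level is needed for an unspliced tRNA, and whether the product name belongs on the gene or on the tRNA feature, are exactly the kinds of choices that exporters currently make differently.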