Open photocyte opened 8 months ago
GA4GH may be a better forum for this discussion and for organizing sponsorship. They're currently managing BED, CRAM, VCF, and SAM/BAM. While the GFF3 specification has been published on SO's GitHub for a while now, it would make sense for it to be formalized and published on GA4GH's site along with the other bioinformatics standards. GA4GH has the resources to facilitate discussion and provide long-term stewardship of file format specifications.
I represent SO in the Sequence Annotation study group (part of the GA4GH Genomic Knowledge Work Stream). If you like, I can ping someone at the GA4GH secretariat to see if this is something that would fall within GA4GH's scope. If GA4GH isn't vetted as a standards organization at IANA yet, we may want to take care of that first.
Thanks @egchristensen! It’d be great if you could ping the GA4GH secretariat. I have some IANA templates filled out but not submitted, if that would help. I don’t strongly want to be the submitter; I was just surprised it hadn’t been done yet.
edit: thanks Eric for sending that email. I'm making a note here to link out to the email thread (the link keeps the thread confidential; it's just a way for me to easily open it in my own email client): https://hookmark.net/hm/hook/email/BYAPR11MB2854070C78C7440AA0C73ED5FEFA2%40BYAPR11MB2854.namprd11.prod.outlook.com
Continuation of this email thread with keilbeck&genetics.utah.edu & evan.christensen&utah.edu, from tfallon&ucsd.edu. That thread resolved to attempt an SO-sponsored submission of bioinformatics file formats to the IANA Media Types ("MIME") registry.
This thread can be used for public comment on the SO-sponsored submission of bioinformatics file formats. Private comments can be sent to the above email addresses.
This issue comment is currently a stub listing the file formats targeted for submission. I will keep editing the comment to expand the table, and will ping folks once it is done.
To my understanding, SO can act on behalf of others, i.e. submit file formats that it does not "own" to the IANA Media Types registry. I took a look through the IANA procedures, most recently spelled out in RFC 6838: https://www.rfc-editor.org/rfc/rfc6838.html
Per that RFC, for a Media Type (the current name for MIME types) to be accepted into the IANA registry in the “Standards Tree”, and thus without an additional prefix on the media type name (vnd. for the “Vendor Tree”, prs. for the “Personal Tree”), it has to be submitted by a “Standards Organization”.
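To make the prefix rule concrete, here is a minimal sketch of how the registration tree can be read off a media type name. The function name `registration_tree` and the example names in the usage note are hypothetical, chosen for illustration only; they are not claims about what is in the IANA registry. The `x.` prefix (the unregistered tree in RFC 6838) is not discussed above and is included only so such names are not mislabeled.

```python
def registration_tree(media_type: str) -> str:
    """Guess the RFC 6838 registration tree from a media type name.

    This only inspects the subtype prefix; it does not check whether
    the name is actually registered with IANA.
    """
    top_level, sep, subtype = media_type.strip().lower().partition("/")
    if not sep or not top_level or not subtype:
        raise ValueError(f"not a 'type/subtype' media type: {media_type!r}")

    if subtype.startswith("vnd."):
        return "vendor"        # vendor tree, e.g. application/vnd.<producer>.<format>
    if subtype.startswith("prs."):
        return "personal"      # personal (vanity) tree
    if subtype.startswith("x."):
        return "unregistered"  # unregistered tree (assumption: not covered above)
    return "standards"         # no tree prefix on the subtype
```

For example, `registration_tree("text/gff3")` returns `"standards"`, while a made-up name such as `application/vnd.example.fmt` returns `"vendor"`.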
Seeing as gff3 was previously submitted by SO and accepted, I assume that means SO is an accepted standards organization in IANA’s eyes. Since gff3 is the only bioinformatics file format that has been submitted to IANA, SO is currently the only vetted bioinformatics-related standards organization for this process. The RFC prefers that the submitting standards organization “owns” the file format, but does not require it. There are procedures in the RFC to resolve ownership in case of a dispute.