Create option to export GFF3 to comply with requirements at NCBI

monicacecilia commented 9 years ago

These are the GFF3 formatting requirements provided by Terence Murphy from NCBI. Before submitting the official gene set (OGS), that is, the integrated GFF of predicted and manually curated models, some attributes need to be added:

Add locus_tag attribute to top-level features such as gene or pseudogene (e.g., locus_tag=W904_OFAS000001; where W904 is the species accession number used in the NCBI submission system). The locus_tag prefix is generated when a BioProject is created, as shown here: http://www.ncbi.nlm.nih.gov/bioproject/230921
Add transcript_id and protein_id attributes to both mRNA and CDS features (e.g., transcript_id=OFAS000001-RA;protein_id=OFAS000001-PA). Note: add only transcript_id to transcripts that are not from coding genes (e.g., pseudogenic_transcript, rRNA)
Add a product attribute to CDS features (e.g., product=prophenoloxidase); this is usually the mRNA name, when the name differs from the ID.
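The three requirements above could be batch-applied with something like the following. This is only a minimal sketch, not Apollo's exporter. It assumes that the gene ID is the part of the locus_tag after the prefix, that a protein_id can be derived from the transcript ID by swapping `-R` for `-P` (as in the OFAS000001-RA / OFAS000001-PA example), that each CDS has a single Parent pointing to its mRNA, and that attribute values need no extra escaping. The function names and the `NONCODING` set are hypothetical.

```python
TOP_LEVEL = {"gene", "pseudogene"}
CODING = {"mRNA", "CDS"}
# Assumed set of non-coding transcript types; extend as needed.
NONCODING = {"pseudogenic_transcript", "rRNA", "tRNA", "ncRNA"}


def parse_attrs(column9):
    """Split 'key=value;key=value' into a dict that keeps the original order."""
    return dict(p.split("=", 1) for p in column9.strip().split(";") if "=" in p)


def protein_id_for(transcript_id):
    """Assumed naming rule: OFAS000001-RA -> OFAS000001-PA."""
    base, sep, suffix = transcript_id.rpartition("-R")
    return f"{base}-P{suffix}" if sep else f"{transcript_id}-P"


def is_feature(row):
    """A feature row is not a comment/directive and has nine tab-separated columns."""
    return not row.startswith("#") and row.count("\t") == 8


def add_ncbi_attributes(lines, locus_tag_prefix):
    """Return GFF3 lines with locus_tag, transcript_id, protein_id and product added."""
    rows = [line.rstrip("\n") for line in lines]

    # First pass: remember each mRNA's Name so its CDS rows can reuse it as product.
    mrna_names = {}
    for row in rows:
        if is_feature(row):
            cols = row.split("\t")
            if cols[2] == "mRNA":
                attrs = parse_attrs(cols[8])
                if "ID" in attrs:
                    mrna_names[attrs["ID"]] = attrs.get("Name")

    # Second pass: add the attributes that apply to each feature type.
    out = []
    for row in rows:
        if not is_feature(row):
            out.append(row)  # directives, comments and blank lines pass through
            continue
        cols = row.split("\t")
        ftype, attrs = cols[2], parse_attrs(cols[8])
        if ftype in TOP_LEVEL and "ID" in attrs:
            attrs["locus_tag"] = f"{locus_tag_prefix}_{attrs['ID']}"
        elif ftype in CODING:
            # An mRNA is identified by its own ID, a CDS by its Parent mRNA.
            tid = attrs.get("ID") if ftype == "mRNA" else attrs.get("Parent")
            if tid:
                attrs["transcript_id"] = tid
                attrs["protein_id"] = protein_id_for(tid)
                name = mrna_names.get(tid)
                if ftype == "CDS" and name and name != tid:
                    attrs["product"] = name
        elif ftype in NONCODING and "ID" in attrs:
            attrs["transcript_id"] = attrs["ID"]  # no protein_id for non-coding
        cols[8] = ";".join(f"{k}={v}" for k, v in attrs.items())
        out.append("\t".join(cols))
    return out


if __name__ == "__main__":
    demo = [
        "##gff-version 3",
        "scaffold1\tApollo\tgene\t1\t900\t.\t+\t.\tID=OFAS000001",
        "scaffold1\tApollo\tmRNA\t1\t900\t.\t+\t.\t"
        "ID=OFAS000001-RA;Parent=OFAS000001;Name=prophenoloxidase",
        "scaffold1\tApollo\tCDS\t10\t300\t.\t+\t0\tID=cds1;Parent=OFAS000001-RA",
    ]
    print("\n".join(add_ncbi_attributes(demo, "W904")))
```

Doing this in two passes means a CDS can pick up its product even if it appears before its mRNA in the file. Existing attributes are left in place and the new ones are appended at the end of column 9.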

Adapted from email sent by Mei-Ju Chen at USDA/NAL. Mei-Ju's request: "It will be great if WA team could help to batch processing some of the attributes. Let me know if you have questions."

monicacecilia commented 9 years ago

@nathandunn, you or @deepakunni3? I assigned this one to you to bring it back into the spotlight. Cheers,

nathandunn commented 9 years ago

Should this be the default for exporting GFF3, or should this be another option?

monicacecilia commented 9 years ago

For now, I think it should be another option called "GFF3 for NCBI" or something similar. We may want to make this the permanent default later on, but I don't know how many people are using the current output in their pipelines, so we should make an announcement before changing it for good.

deepakunni3 commented 9 years ago

This is interesting. Maybe I can look into this.

nathandunn commented 9 years ago

@deepakunni3 Sure. @monicacecilia Okay, an option makes the most sense for now.

monicacecilia commented 9 years ago

Yes, @deepakunni3! Cheers,

deepakunni3 commented 9 years ago

:+1:

GMOD / Apollo

Create option to export GFF3 to comply with requirements at NCBI #565