legumeinfo / datastore-specifications

Specifications for directory naming, file naming, file contents in the LIS datastore

RFO: annotation GFFs must contain Note: attribute with GO terms to be loaded into mines #42

Closed (sammyjava closed 1 year ago)

sammyjava commented 1 year ago

So Rex discovered that Fiskeby lacked gene descriptions. We then found that the current Datastore GFF does contain descriptions, but they're not AHRD-generated with GO terms. This means the mine will not associate GO terms with any of those genes. This is Bad.

I propose that the MINE LOADER (not the YAML validator) require that GO terms be present in an annotation GFF. A failed load will alert me to ask @adf-ncgr to generate a new GFF, and we won't end up with genes in the mines that lack functional information.

This seemed like the appropriate repo to post this in, since it is a requirement on Datastore content if you want to get an annotation into a mine. You probably won't be alerted until the next mine build, but that is fine, since presumably the annotation is newer than the previous mine build.
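For illustration only, here is a minimal sketch of what such a check might look like. It is not the actual mine loader code. It assumes GFF3 input, that GO terms appear as `GO:` plus seven digits inside the `Note` attribute of `gene` records, and that finding at least one such gene is enough to pass. The function names are hypothetical.

```python
import re
from urllib.parse import unquote

# Assumption: GO identifiers look like GO:0004672 (seven digits).
GO_ID = re.compile(r"GO:\d{7}")


def parse_attributes(column9):
    """Parse GFF3 column 9 (key=value pairs separated by ';') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters inside values.
            attrs[key.strip()] = unquote(value)
    return attrs


def count_genes_with_go_terms(gff_lines):
    """Return (total gene records, gene records whose Note holds a GO term)."""
    total = 0
    with_go = 0
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only gene records are inspected in this sketch
        total += 1
        note = parse_attributes(cols[8]).get("Note", "")
        if GO_ID.search(note):
            with_go += 1
    return total, with_go


def require_go_terms(gff_lines):
    """Raise ValueError if no gene record carries a GO term in its Note."""
    total, with_go = count_genes_with_go_terms(gff_lines)
    if with_go == 0:
        raise ValueError(
            f"no GO terms found in Note attributes of {total} gene records"
        )
    return total, with_go
```

A loader could call `require_go_terms(open(path))` before loading and abort on the exception. Whether the threshold should be "at least one gene" or something stricter is a design choice this sketch leaves open.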

sammyjava commented 1 year ago

This is now enforced by the validation that I do when loading annotations into the mines.