legumeinfo / mine-issues

Report ALL issues on LIS mines here! Regardless of which mine you found it on!

Require genes exist and gene descriptions exist and contain GO terms #118

Closed sammyjava closed 1 year ago

sammyjava commented 1 year ago

Update AnnotationCollectionValidator to impose these rules.

sammyjava commented 1 year ago

Done.

sammyjava commented 1 year ago

Example collection that now fails validation:

## Validating cerca collection ISC453364.gnm1.ann1.HZJM
 - cerca.ISC453364.gnm1.ann1.HZJM.gene_models_main.gff3.gz
10:39:49 [main] INFO  org.biojava.nbio.genome.parsers.gff.GFF3Reader - Reading: /tmp/temp.gff3
## INVALID: cerca.ISC453364.gnm1.ann1.HZJM.gene_models_main.gff3.gz gene records do not contain the Note attribute.
 - cerca.ISC453364.gnm1.ann1.HZJM.protein.faa.gz
 - cerca.ISC453364.gnm1.ann1.HZJM.cds.fna.gz
 - cerca.ISC453364.gnm1.ann1.HZJM.mrna.fna.gz
 x optional iprscan.gff3.gz file is not present.
 - cerca.ISC453364.gnm1.ann1.HZJM.legfed_v1_0.M65K.gfa.tsv.gz
 x optional pathway.tsv.gz file is not present.
 x optional phytozome_10_2.HFNR.gfa.tsv.gz file is not present.
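
For readers unfamiliar with the check behind the INVALID line above, here is a minimal sketch of the three rules from the issue title, applied to the lines of a GFF3 file. It is not the actual AnnotationCollectionValidator code. The class name `GeneNoteCheck`, the method `validate`, and all messages except the Note one (copied from the log above) are hypothetical. It also assumes that every gene record must carry a Note attribute and that at least one Note must contain a GO term; the real validator may be stricter or looser.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the real validator.
final class GeneNoteCheck {
    // GO identifiers are "GO:" followed by seven digits.
    private static final Pattern GO_TERM = Pattern.compile("GO:\\d{7}");

    /** Returns a problem description, or null if all three rules pass. */
    static String validate(List<String> gff3Lines) {
        int geneCount = 0;
        boolean sawGoTerm = false;
        for (String line : gff3Lines) {
            // Skip comments, directives, and blank lines.
            if (line.startsWith("#") || line.trim().isEmpty()) continue;
            String[] cols = line.split("\t");
            // GFF3 has nine tab-separated columns; column 3 is the feature type.
            if (cols.length < 9 || !cols[2].equals("gene")) continue;
            geneCount++;
            // Column 9 holds semicolon-separated key=value attributes.
            String note = null;
            for (String attr : cols[8].split(";")) {
                if (attr.startsWith("Note=")) note = attr.substring("Note=".length());
            }
            if (note == null || note.isEmpty()) {
                return "gene records do not contain the Note attribute.";
            }
            if (GO_TERM.matcher(note).find()) sawGoTerm = true;
        }
        if (geneCount == 0) return "no gene records found.";                 // assumed wording
        if (!sawGoTerm) return "gene Note attributes contain no GO terms."; // assumed wording
        return null;
    }
}
```

Under these assumptions, a file like the cerca one above, whose gene records have no Note attribute, would be rejected at the first gene record.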