Fixes a swathe of issues around gene and accession identifiers.
Changes extra_gene annotations to require a location.
Changes fields that expect gene identifiers to use the same pattern as gene identifiers themselves.
Changes NCBI accessions to require a version number.
Removes extra_gene annotations that duplicated genes already present in the referenced accessions (may not be exhaustive).
Any extra genes that were no longer valid were removed, except where the extra genes were clearly in a different contig/record, in which case the entry was retired.
Gene identifiers in gene lists that did not match a gene in the record were removed (though some may remain), as were annotations in those lists that were not gene identifiers.
Any class annotation whose gene list was a required field but was completely invalid was removed entirely.
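
As a rough illustration of two of the checks described above, here is a minimal Python sketch. It is not the schema or tooling used in this change: the regular expression and the function names `has_version` and `prune_gene_list` are assumptions made for the example, and the actual patterns in the schema may well differ.

```python
import re

# Hypothetical pattern (an assumption, not the project's schema): an
# accession prefix of letters, an optional underscore, digits, and then
# a mandatory ".<version>" suffix.
VERSIONED_ACCESSION = re.compile(r"[A-Z]{1,6}_?\d+\.\d+")


def has_version(accession: str) -> bool:
    """Return True if the accession carries an explicit version number."""
    return VERSIONED_ACCESSION.fullmatch(accession) is not None


def prune_gene_list(listed: list[str], record_genes: set[str]) -> list[str]:
    """Keep only listed identifiers that match a gene in the record, in order."""
    return [gene for gene in listed if gene in record_genes]


if __name__ == "__main__":
    print(has_version("NC_000913.3"))  # True
    print(has_version("NC_000913"))    # False
    # Illustrative identifiers only; real gene names come from the record.
    print(prune_gene_list(["geneA", "not a gene", "geneC"], {"geneA", "geneB"}))
    # ['geneA']
```

A list that comes back empty from `prune_gene_list` corresponds to the "completely invalid gene listing" case, where the annotation was removed if the gene list was a required field.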