Open djow2019 opened 6 years ago
Also, as we discussed, you'll be loading new taxa whose proteins have no UniProt IDs. I think this would only affect two bots: the GO annotation bot (https://github.com/SuLab/scheduled-bots/blob/master/scheduled_bots/geneprotein/GOBot.py) and the InterPro bot (https://github.com/SuLab/scheduled-bots/blob/master/scheduled_bots/interpro/ProteinBot.py). For the GO bot, it won't matter if those proteins have no annotations in QuickGO. For the InterPro bot, the annotations are keyed by UniProt ID, so a protein with no UniProt ID has no annotations either. I'll look into this once the new taxon is loaded.
Due to NCBI's RefSeq reannotation project, many RefSeq protein IDs are no longer unique to a specific organism. NCBI is combining proteins with identical structure under a single reference ID, so it will no longer be possible to look up a protein using only the RefSeq protein ID. Two workarounds: 1) use UniProt IDs (which are still unique), or 2) combine the RefSeq protein ID with the tax ID. For more information, see https://www.wikidata.org/wiki/Property_talk:P637#Remove_Distinct_Value_Constraint.
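
For workaround 2, here is a rough sketch of what a combined lookup could look like as a Wikidata SPARQL query built in Python. This is not taken from the bots; `build_protein_query` is a hypothetical helper, the IDs in the usage line are placeholders, and I'm assuming the usual Wikidata modelling where the protein item links to its taxon via P703 (found in taxon) and the taxon item carries the NCBI taxonomy ID in P685. P637 is the RefSeq protein ID property discussed in the link above.

```python
def build_protein_query(refseq_id: str, tax_id: str) -> str:
    """Return a SPARQL query matching proteins by RefSeq protein ID AND NCBI tax ID.

    Hypothetical helper for illustration only. Assumed property usage:
    P637 = RefSeq protein ID, P703 = found in taxon, P685 = NCBI taxonomy ID.
    """
    # Refuse characters that could break out of the quoted SPARQL literal.
    for value in (refseq_id, tax_id):
        if '"' in value or "\\" in value:
            raise ValueError(f"unexpected character in identifier: {value!r}")
    return (
        "SELECT ?protein WHERE {\n"
        f'  ?protein wdt:P637 "{refseq_id}" ;\n'
        "           wdt:P703 ?taxon .\n"
        f'  ?taxon wdt:P685 "{tax_id}" .\n'
        "}"
    )


# Placeholder identifiers, not real lookups.
query = build_protein_query("WP_000000001.1", "511145")
print(query)
```

Filtering on the taxon in the same query means a RefSeq ID shared across organisms should resolve to the item for the intended organism only, provided each organism has its own protein item.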