This PR doesn't change any of the resource files, but it adds user-configurable support for attaching extra organism labels to protein entries. For instance, passing `10239` (the taxonomy ID for "Viruses") to the `update_uniprot_proteins.py` script adds `Viruses` as an extra organism label for all viral proteins. `Viruses` can then be added to `ner_kb.config` to include all viral proteins in NER.
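To illustrate the idea, here is a minimal sketch of how an extra label could be attached to every protein whose organism falls under a given taxon. This is not the code in `update_uniprot_proteins.py`: the function names, the `PARENTS` map, and its truncated toy lineage are all assumptions made for illustration.

```python
# Hypothetical sketch, NOT the actual implementation in this PR.
# Toy child -> parent taxonomy map; the lineage is heavily truncated
# (assumption: real lineages have many more intermediate ranks).
PARENTS = {
    "2697049": "694009",  # SARS-CoV-2 -> SARS-related coronavirus
    "694009": "10239",    # simplified: placed directly under Viruses
    "9606": "9605",       # Homo sapiens -> Homo
}


def is_descendant(taxon_id, ancestor_id, parents):
    """Return True if taxon_id equals ancestor_id or lies below it."""
    seen = set()
    current = taxon_id
    while current is not None and current not in seen:
        if current == ancestor_id:
            return True
        seen.add(current)  # guard against cycles in a malformed map
        current = parents.get(current)
    return False


def organism_labels(taxon_id, base_labels, extra_labels, parents):
    """Append each extra label whose taxon is an ancestor of taxon_id."""
    labels = list(base_labels)
    for ancestor_id, label in extra_labels.items():
        if is_descendant(taxon_id, ancestor_id, parents) and label not in labels:
            labels.append(label)
    return labels


# Extra labels requested by the user: taxonomy ID -> label.
extra = {"10239": "Viruses"}
viral = organism_labels("2697049", ["SARS-CoV-2"], extra, PARENTS)
human = organism_labels("9606", ["Human"], extra, PARENTS)
```

In this sketch only the viral entry gains the `Viruses` label; the human entry keeps its original labels unchanged.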
I am not changing the actual resource files because the inclusion of organism-specific synonyms is use-case specific, so I don't think the "official" release of bioresources should make any specific additions. These features are, however, useful for custom local builds.