Description
For some lower species, fetching UniProt IDs for proteins containing a specific InterPro ID now fails. The problem appears to arise in special cases where UniProt records the organism under a specific strain rather than the species alone. One example is the filasterean Capsaspora owczarzaki: UniProt/InterPro use its scientific name to find records, but the UniProt entries themselves carry the strain name, so no entries are fetched. In previous versions of the InterPro module, the missing entries could be worked around by using the second return value, uniprot_dict. That value is now returned with empty, default values (second screenshot).
Screenshots
Message regarding how many records were fetched
The returned uniprot_dict value for this specific species
Files
The issue is confined to the fetch_uniprotids function in the Uniprot.py file.
To Reproduce
Steps to reproduce the behavior:
import CoDIAC
species = 'Capsaspora owczarzaki'
uniprot_IDs, uniprot_dict = CoDIAC.InterPro.fetch_uniprotids(Interpro_ID, REVIEWED=False, species=species)
Expected behavior
Even when no UniProt IDs pass the species filter, uniprot_dict should still contain the strain information for the fetched records.
Tasks
[ ] Review the changes made to the if statement that performs the species name check in fetch_uniprotids
[ ] Decide whether strain information can be incorporated into the species name check, or revert to the older version of the if statement
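One possible direction for the second task is a species check that tolerates strain qualifiers. The sketch below is only an illustration and is not the current CoDIAC code: the helper names (normalize_organism, species_matches, extract_strain) are hypothetical, and the organism string format "Genus species (strain XYZ)" is an assumption about how the strain appears in the UniProt entry.

```python
import re
from typing import Optional


def normalize_organism(name: str) -> str:
    """Lower-case a name, drop parenthetical qualifiers such as
    '(strain ATCC 30864)', and collapse repeated whitespace."""
    without_parens = re.sub(r"\s*\([^)]*\)", "", name)
    return " ".join(without_parens.lower().split())


def species_matches(query_species: str, entry_organism: str) -> bool:
    """Hypothetical replacement for a strict equality check.

    Returns True when the entry's organism is the queried species,
    optionally followed by a strain or other qualifier.
    """
    query = normalize_organism(query_species)
    entry = normalize_organism(entry_organism)
    return entry == query or entry.startswith(query + " ")


def extract_strain(entry_organism: str) -> Optional[str]:
    """Return the strain label if the organism string carries one
    in the assumed '(strain ...)' form, else None."""
    match = re.search(r"\(strain\s+([^)]+)\)", entry_organism, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


# Illustrative organism string in the assumed format
organism = "Capsaspora owczarzaki (strain ATCC 30864)"
print(species_matches("Capsaspora owczarzaki", organism))
print(extract_strain(organism))
```

Note that the prefix comparison also accepts names with extra trailing words (for example a subspecies), so it is more permissive than an exact match. Whether that trade-off is acceptable is part of the decision in the task above.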