cokelaer / bioservices

Access to Biological Web Services from Python.
http://bioservices.readthedocs.io

UniProt quick example is not working #224

Closed by widdowquinn 1 year ago

widdowquinn commented 2 years ago

While investigating an error in ncfp I came across a problem stemming from bioservices. Following the "Quick Example" in README.md, I prepared a short Python script called test_bioservices.py, containing:

from bioservices import UniProt
u = UniProt(verbose=False)
data = u.search("zap70+and+taxonomy:9606", frmt="tab", limit=3, columns="entry name,length,id,genes")
print(f"{data=}")

Executing this gives the error:

% python test_bioservices.py
Traceback (most recent call last):
  File "/Users/lpritc/Documents/Development/GitHub/ncfp/issue_35/test_bioservices.py", line 3, in <module>
    data = u.search("zap70+and+taxonomy:9606", frmt="tab", limit=3, columns="entry name,length,id,genes")
  File "/opt/anaconda3/envs/ncfp_py39/lib/python3.9/site-packages/bioservices/uniprot.py", line 664, in search
    self.services.devtools.check_param_in_list(frmt, _valid_formats)
  File "/opt/anaconda3/envs/ncfp_py39/lib/python3.9/site-packages/easydev/tools.py", line 335, in check_param_in_list
    check_param_in_list(name, list(valid_values))
  File "/opt/anaconda3/envs/ncfp_py39/lib/python3.9/site-packages/easydev/tools.py", line 112, in check_param_in_list
    raise ValueError(msg)
ValueError: Incorrect value provided (tab)    Correct values are ['xls', 'fasta', 'gff', 'txt', 'tsv', 'xml', 'rss', 'list', 'rss', 'html']

Fixing the frmt argument (tsv in place of tab) gives the following error:

% python test_bioservices.py
WARNING [bioservices.UniProt:596]:  status is not ok with Bad Request
data=400

matching the errors I was seeing in ncfp.
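
Note that the failed call above did not raise: `data` came back as the bare integer 400, with only a logged warning. A minimal sketch of a guard that makes this failure explicit is below. `require_text` is a hypothetical helper, not part of bioservices, and it assumes only what the output above shows, namely that a failed search returns the HTTP status as an `int`.

```python
def require_text(result):
    """Return the result unchanged, or raise if it is a bare status code.

    Hypothetical helper (not part of bioservices). It assumes a failed
    search returns the HTTP status as an int, as in the output above.
    """
    if isinstance(result, int):
        raise RuntimeError(f"UniProt request failed with HTTP status {result}")
    return result


# Intended use (needs bioservices and network access, so left commented out):
# data = require_text(u.search("zap70+and+taxonomy_id:9606", frmt="tsv", limit=3))
```

Wrapping the call this way turns a silent `data=400` into an exception at the point of failure.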

Version information is below:

python --version
Python 3.9.13
>>> bioservices.version
'1.10.0'

I think the taxonomy field in the example query needs to be changed to taxonomy_id, as per https://www.uniprot.org/help/query-fields. The same appears to hold for organism, which needs to be organism_id:

from bioservices.uniprot import UniProt
u = UniProt(verbose=False)

print("Trying u.mapping")
u.mapping("UniProtKB_AC-ID", "KEGG", "P43403,P123456")

print("Trying u.search with queries")
queries = ["P43403", "ZAP70", "ZAP70_HUMAN", "zap70+AND+organism:9606", "human AND antigen",
           "human+AND+antigen", "zap70 and human", "organism:9606",
           "accession:P62988", "organism_id:9606", "zap70+and+organism_id:9606"]
for query in queries:
    print(f"\t{query=}")
    u.search(query, columns="id")

gives

% python test_bioservices.py
Trying u.mapping
Trying u.search with queries
    query='P43403'
    query='ZAP70'
    query='ZAP70_HUMAN'
    query='zap70+AND+organism:9606'
WARNING [bioservices.UniProt:596]:  status is not ok with Bad Request
    query='human AND antigen'
    query='human+AND+antigen'
    query='zap70 and human'
    query='organism:9606'
WARNING [bioservices.UniProt:596]:  status is not ok with Bad Request
    query='accession:P62988'
    query='organism_id:9606'
    query='zap70+and+organism_id:9606'

cokelaer commented 2 years ago

@widdowquinn UniProt changed its API recently. I had updated bioservices but not the documentation.

The documentation is now updated on bioservices.readthedocs.io.

The UniProt API has changed a lot, and for the better (JSON support, for instance). Unfortunately, this means that existing code will need to be changed accordingly. I tried to keep the changes to a minimum.

As for your example, taxonomy should indeed be taxonomy_id, and organism should be organism_id.
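
For existing scripts, the renames discussed in this thread can be applied mechanically. The sketch below is illustrative only: `migrate_query` and `migrate_format` are hypothetical helpers, not part of bioservices, and they cover just the three changes mentioned here (taxonomy to taxonomy_id, organism to organism_id, and the tab format to tsv). Column names and other query fields may also have changed and are not handled.

```python
import re

# Legacy numeric-ID fields that gained an "_id" suffix, per this thread.
# The lookahead restricts the rewrite to numeric values such as 9606.
_LEGACY_FIELD = re.compile(r"\b(taxonomy|organism):(?=\d)")

# Legacy format names and their replacements, per the error message above.
_LEGACY_FORMATS = {"tab": "tsv"}


def migrate_query(query):
    """Rewrite taxonomy:<id> and organism:<id> to their *_id forms."""
    return _LEGACY_FIELD.sub(r"\1_id:", query)


def migrate_format(frmt):
    """Map a legacy format name to its replacement; pass others through."""
    return _LEGACY_FORMATS.get(frmt, frmt)


print(migrate_query("zap70+and+taxonomy:9606"))
print(migrate_format("tab"))
```

Queries that already use the new field names pass through unchanged, so the helpers are safe to apply more than once.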

cokelaer commented 1 year ago

@widdowquinn I have now fixed the documentation, so it should be up to date.