cokelaer / bioservices

Access to Biological Web Services from Python.
http://bioservices.readthedocs.io

Fix issue with compress in uniprot search. #249

Closed ArnaudBelcour closed 1 year ago

ArnaudBelcour commented 1 year ago

Hi,

I was trying to download a compressed FASTA file from UniProt using the bioservices search function. However, the function does not return a compressed result; instead, it returns an uncompressed FASTA string.

An example:

import bioservices
uniprot_bioservices = bioservices.UniProt(verbose=False)
data = uniprot_bioservices.search('P57263', database='uniprot', frmt='fasta', compress=True)

Returns:

0it [00:00, ?it/s]
>sp|P57263|NUOM_BUCAI NADH-quinone oxidoreductase subunit M OS=Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS) OX=107806 GN=nuoM PE=3 SV=1
MLLSLLIIIPFLSSFFSFFSPRLHNNFPRWIALSGIIATLLVVIQIFFQENYHIFQIRHY
PNWNCQLIVPWISRFGIEFNIALDGLSIIMLIFSSFLSIIAIICSWNEIKKNEGFFYFNF
MLVFTGIIGVFISCDLFLFFCFWEIMLIPMYFLIALWSDKTEKKKNFLAANKFFLYSQTS
GLILLSSILLLVFSHYYSTNILTFNYNLLINKPINIYVEYIVMIGFFLSFAIKMPIVPFH
GWLPDIHSRSLSCGSVEIIGVLLKTAPYALLRYNLVLFPDSTKSFSLIAVFWGIISIFYG
AWIAFSQTNIKRLIAYSSVSHMGLILIGIYSNNERALQGVVIQMLSNSLTVAALCILSGQ
IYKRFKTQDMSKMGGLWSCIYWIPGFSLFFSLANLGVPGTGNFIGEFLILSGVFEVFPLV
SILATIGIVFSSIYSLNVIQKIFYGPCKQNIKVFFINKQEVWTIIALVFTLVFLGLNPQK
IIDVSYNSIHNIQKEFNNSILKIRS

It is caused by a wrong parameter name and value in the search function. This pull request should fix the issue.
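A quick way to tell whether a returned payload is really compressed is to look at its first two bytes. The sketch below assumes the compressed response is a gzip stream delivered as bytes; the helper names `is_gzip` and `as_fasta_text` are hypothetical and are not part of bioservices.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def is_gzip(payload):
    """Return True if payload is a bytes-like object holding gzip data."""
    return isinstance(payload, (bytes, bytearray)) and bytes(payload[:2]) == GZIP_MAGIC


def as_fasta_text(payload):
    """Return FASTA text from gzip bytes, plain bytes, or an already decoded str."""
    if isinstance(payload, str):
        return payload  # a str cannot be a compressed payload
    if is_gzip(payload):
        payload = gzip.decompress(payload)
    return bytes(payload).decode("utf-8")
```

With helpers like these, a caller could pass the value returned by `search(..., compress=True)` to `is_gzip` to confirm that compression took effect, and to `as_fasta_text` to get the sequence text either way.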

Do you think the compress argument of the search function should be renamed to compressed, to match UniProt's parameter? I would say no, since renaming it would break compatibility with scripts that use this function, but as the other parameters seem to match the UniProt ones, I prefer to ask.

cokelaer commented 1 year ago

@ArnaudBelcour thanks for the fix. I have merged your PR. As for your question: indeed, it could be renamed compressed, but for backward compatibility, as you mentioned, I will keep 'compress' for now. The uniprot variable is indeed called compress as well. I tend to keep the names used by the services, except when they are too esoteric or not meaningful.
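If the argument were ever renamed, one common pattern is to accept both names for a transition period and warn on the old one. This is only a sketch of that idea: `build_params` is a hypothetical helper, not the bioservices implementation, and the query-string keys `query`, `format` and `compressed` are assumed here for illustration.

```python
import warnings


def build_params(query, frmt="fasta", compressed=False, compress=None):
    """Build request parameters, accepting the old 'compress' name as an alias."""
    if compress is not None:
        # Old keyword still works, but callers are told about the new name.
        warnings.warn(
            "'compress' is deprecated; use 'compressed' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        compressed = compress
    params = {"query": query, "format": frmt}
    if compressed:
        params["compressed"] = "true"  # assumed key and value
    return params
```

Existing scripts that pass `compress=True` would keep working and only see a `DeprecationWarning`, while new code could use `compressed=True`.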