kexinhuang12345 / DeepPurpose

A Deep Learning Toolkit for DTI, Drug Property, PPI, DDI, Protein Function Prediction (Bioinformatics)
https://doi.org/10.1093/bioinformatics/btaa1005
BSD 3-Clause "New" or "Revised" License
962 stars 272 forks

Fix download of Binding DB database (#160) #164

Closed jeanpaulrsoucy closed 1 year ago

jeanpaulrsoucy commented 1 year ago

This pull request fixes issue #160, reported by @parthnatekar, who pointed out problems with the function dataset.download_BindingDB.

Here is an outline of the fix, which should continue working when BindingDB is updated, provided they do not change the URL format too much:

If you run the following code using the current version of the package:

from DeepPurpose import dataset
dataset.download_BindingDB('./data/')

You will get the following error:

Beginning to download dataset...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jprs/Desktop/DeepPurpose/DeepPurpose/dataset.py", line 175, in download_BindingDB
    saved_path = wget.download(url, path)
  File "/home/jprs/Desktop/DeepPurpose/venv/lib/python3.10/site-packages/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/usr/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 503, in open
    req = Request(fullurl, data)
  File "/usr/lib/python3.10/urllib/request.py", line 322, in __init__
    self.full_url = url
  File "/usr/lib/python3.10/urllib/request.py", line 348, in full_url
    self._parse()
  File "/usr/lib/python3.10/urllib/request.py", line 377, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: '%20https%3A//www.bindingdb.org/rwd/bind/chemsearch/marvin/SDFdownload.jsp?download_file=/bind/downloads/BindingDB_All_2022m7.tsv.zip'
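
The `%20` and `%3A` at the start of the rejected URL suggest that the URL string began with a space and was percent-encoded before being opened. This is an inference from the traceback, not something stated in the issue. The sketch below reproduces that symptom with the standard library only; `raw_url` and `clean_url` are illustrative names, and the URL prefix is copied from the error message with a leading space added on purpose.

```python
from urllib.parse import quote, urlsplit

# URL prefix from the error message, with a deliberate leading space
# (assumption: the original URL string carried stray whitespace).
raw_url = " https://www.bindingdb.org/rwd/bind/chemsearch/marvin/SDFdownload.jsp"

# Percent-encoding the whole string turns the space into %20 and the
# colon into %3A, so the result no longer starts with a valid scheme.
quoted = quote(raw_url)
print(quoted)

# Stripping whitespace before any further processing keeps the scheme intact.
clean_url = raw_url.strip()
print(urlsplit(clean_url).scheme)
```

If this reading is right, stripping whitespace from the URL before handing it to the downloader would be enough to avoid the `unknown url type` error.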

If you install the patched version and re-run the above code, the BindingDB database will download and unzip as expected.
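
For readers wondering how a download URL could keep working across BindingDB releases, here is a minimal sketch. It is not the code from this pull request: `bindingdb_url` and `BASE` are hypothetical names, and the file-name pattern `BindingDB_All_<year>m<month>.tsv.zip` is assumed from the single URL shown in the traceback above.

```python
from urllib.parse import urlsplit

# Endpoint copied from the URL in the traceback.
BASE = "https://www.bindingdb.org/rwd/bind/chemsearch/marvin/SDFdownload.jsp"


def bindingdb_url(year: int, month: int) -> str:
    """Hypothetical helper: build a download URL for a given release.

    Assumes the release file is named BindingDB_All_<year>m<month>.tsv.zip,
    as in the one example visible in the error message.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    filename = f"BindingDB_All_{year}m{month}.tsv.zip"
    url = f"{BASE}?download_file=/bind/downloads/{filename}"
    # Guard against stray whitespace or a missing scheme before downloading.
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError(f"unexpected URL: {url!r}")
    return url


print(bindingdb_url(2022, 7))
```

A helper like this only stays valid while BindingDB keeps the same URL format, which is the same caveat given in the pull request description.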

kexinhuang12345 commented 1 year ago

Thank you! Merging now