EBI-Metagenomics / emg-toolkit

MGnify API toolkit
https://www.ebi.ac.uk/metagenomics
Apache License 2.0

Data not being found #11

Open JahanRahman opened 4 years ago

JahanRahman commented 4 years ago

I get the following error when running a bulk download for certain accession IDs:

0%| | 0/9 [00:00<?, ?it/sERROR: HTTP Error 404: Not Found | 0/9 [00:00<?, ?it/s]
0%| | 0/9 [00:00<?, ?it/s]
0%| | 0/9 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/bin/mg-toolkit", line 8, in <module>
    sys.exit(main())
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/__init__.py", line 198, in main
    return getattr(mg_toolkit, args.tool)(args)
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/bulk_download.py", line 44, in bulk_download
    program.run()
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/bulk_download.py", line 213, in run
    num_results_processed = self._process_page(res, progress_bar)
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/bulk_download.py", line 253, in _process_page
    self.download_file(
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/bulk_download.py", line 163, in download_file
    BulkDownloader.download_resource_by_url(
  File "/nfs/sw/ebi-metagenomics/ebi-metagenomics-0.6.5/python/lib/python3.8/site-packages/mg_toolkit/bulk_download.py", line 125, in download_resource_by_url
    urlretrieve(url, output_file_name)
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/nfs/sw/python/python-3.8.3/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

mberacochea commented 4 years ago

Thank you for letting us know about the problem. A new version of the toolkit (0.7.0) is available; it no longer stops when a file is missing.

I can confirm that some of the files for that analysis are missing, even though they are listed by the API. I'll look into this and get back to you.
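
For anyone stuck on an older version, the "don't stop on a missing file" behaviour can be approximated roughly as below. This is only a sketch and not the toolkit's actual code: `download_all` and its `retrieve` parameter are hypothetical names introduced for illustration, and the only real library calls are `urllib.request.urlretrieve` and `urllib.error.HTTPError`, both of which appear in the traceback above.

```python
import logging
from urllib.error import HTTPError
from urllib.request import urlretrieve


def download_all(urls_to_paths, retrieve=urlretrieve):
    """Attempt every download and return the URLs that raised an HTTP error.

    `download_all` and `retrieve` are hypothetical names for this sketch;
    `retrieve` defaults to the standard-library urlretrieve.
    """
    failed = []
    for url, output_file_name in urls_to_paths.items():
        try:
            retrieve(url, output_file_name)
        except HTTPError as err:
            # Record the failure and continue instead of aborting the whole run.
            logging.warning("Skipping %s: HTTP %s %s", url, err.code, err.reason)
            failed.append(url)
    return failed
```

Calling it with a mapping of download URLs to local file names returns the list of URLs that could not be fetched, so missing files (such as the 404 above) can be reported at the end rather than ending the run.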