Open ywangbioinfo opened 3 years ago
I had a similar problem with step 9 until I used Aspera. It appears that the Aspera download sorts the step 9 downloads into folders (bacteria and viral), while the wget download (--no_aspera) does not. I'm not sure whether this is what causes the problem.
I had the same problem so I'm trying something else. I hope it helps.
Basically, you can edit the scripts and comment out the steps giving you problems / that you don't need. So for step 9:
1) find the source file
whereis microbeannotator_db_builder
>> /usr/local/bin/microbeannotator_db_builder
gedit /usr/local/bin/microbeannotator_db_builder
2) comment out the refseq step at line 171
# Download RefSeq Proteins
if step == 9:
    logger.info(f"Step 9")
    # if aspera:
    #     refseq_prot = refseq.refseq_fasta_downloader(database_directory)
    # else:
    #     refseq_prot = refseq.refseq_fasta_downloader_wget(
    #         database_directory, threads)
    # database_files['RefSeq_Fasta'] = str(refseq_prot)
    if single_step:
        step = 15
    else:
        step += 1
3) Save the following code as a short .py script and run it separately:
from microbeannotator.database import refseq_data_downloader as r
# Database directory: the same path you pass to microbeannotator_db_builder with -d
db = "MicrobeAnnotator_DB"
r.refseq_fasta_downloader_wget(output_file_folder=db, threads=1)
If that doesn't work either, you can download the files manually and store a merged "refseq_protein.fasta" file in
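For the manual route, here is a minimal sketch of the merge itself. It assumes the RefSeq protein files were already downloaded by hand into one folder and are gzip-compressed FASTA files ending in ".faa.gz". The function name `merge_fasta` and the file pattern are my own choices, not part of MicrobeAnnotator, and where the merged file has to go is not something I can confirm from this thread.

```python
import gzip
import shutil
from pathlib import Path


def merge_fasta(input_dir, output_file, pattern="*.faa.gz"):
    """Concatenate every file matching `pattern` in input_dir into output_file.

    Hypothetical helper, not a MicrobeAnnotator function. It assumes each
    input file ends with a newline, so records from different files do not
    run together.
    """
    input_dir = Path(input_dir)
    output_file = Path(output_file)
    # Sort for a reproducible order and never read the output file itself.
    files = sorted(p for p in input_dir.glob(pattern) if p != output_file)
    with open(output_file, "wb") as merged:
        for path in files:
            # Decompress .gz files on the fly; copy plain files unchanged.
            opener = gzip.open if path.suffix == ".gz" else open
            with opener(path, "rb") as handle:
                shutil.copyfileobj(handle, merged)
    return len(files)
```

You would call it as `merge_fasta("downloads", "refseq_protein.fasta")`, where "downloads" is a placeholder for wherever you saved the files. It returns the number of files merged, which is a quick way to check that nothing was missed.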
Dear MicrobeAnnotator developer,
Today, while building the MicrobeAnnotator database, I ran into trouble at step 9. The download always stops with the error below. I would appreciate your advice.
$ microbeannotator_db_builder -d MicrobeAnnotator_DB -m diamond --step 9 --no_aspera
2021-08-05 16:05:18,015 [INFO]: This is MicrobeAnnotator v2.0.4
2021-08-05 16:05:18,016 [INFO]: I will download and format the databases I use.
2021-08-05 16:05:18,016 [INFO]: Creating database folders
2021-08-05 16:05:18,016 [INFO]: Step 9
2021-08-05 16:05:18,016 [INFO]: Downloading protein fasta files using wget.
100% [........................................................] 18619784 / 18619784
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 1573, in ftp_open
    fp, retrlen = fw.retrfile(file, type)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 2437, in retrfile
    conn, retrlen = self.ftp.ntransfercmd(cmd)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/ftplib.py", line 361, in ntransfercmd
    source_address=self.source_address)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/socket.py", line 728, in create_connection
    raise err
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/site-packages/microbeannotator/database/refseq_data_downloader.py", line 267, in refseq_multiprocess_downloader
    wget.download(file_url, out=output)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/site-packages/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 1584, in ftp_open
    raise exc.with_traceback(sys.exc_info()[2])
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 1573, in ftp_open
    fp, retrlen = fw.retrfile(file, type)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/urllib/request.py", line 2437, in retrfile
    conn, retrlen = self.ftp.ntransfercmd(cmd)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/ftplib.py", line 361, in ntransfercmd
    source_address=self.source_address)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/socket.py", line 728, in create_connection
    raise err
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
urllib.error.URLError: <urlopen error ftp error: TimeoutError(110, 'Connection timed out')>
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/bin/microbeannotator_db_builder", line 445, in <module>
    main()
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/bin/microbeannotator_db_builder", line 437, in main
    single_step, aspera, keep_temp, bin_path)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/bin/microbeannotator_db_builder", line 178, in database_duilder
    database_directory, threads)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/site-packages/microbeannotator/database/refseq_data_downloader.py", line 151, in refseq_fasta_downloader_wget
    pool.map(refseq_multiprocess_downloader, file_list)
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/ubuntu20/miniconda3/envs/microbeannotator/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
urllib.error.URLError: <urlopen error ftp error: TimeoutError(110, 'Connection timed out')>
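The traceback ends in an FTP connection timeout, which is often transient, so retrying the failed file may be enough. Below is a minimal sketch of a retry wrapper around a single download. `download_with_retry` and its parameters are hypothetical names of mine; MicrobeAnnotator does not ship this, and I have not checked whether its downloader can be pointed at such a wrapper without editing the package.

```python
import time
import urllib.request


def download_with_retry(url, out, retries=5, wait=10.0,
                        fetch=urllib.request.urlretrieve):
    """Call fetch(url, out), retrying on network errors with a growing pause.

    Hypothetical helper: `fetch` defaults to urllib's urlretrieve, but any
    callable taking (url, out) can be passed in.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch(url, out)
        except OSError as err:
            # urllib.error.URLError and TimeoutError are both OSError subclasses.
            if attempt == retries:
                raise  # out of attempts: let the caller see the last error
            pause = wait * attempt
            print(f"Attempt {attempt} failed ({err}); retrying in {pause:.0f} s")
            time.sleep(pause)
```

The pause grows linearly (10 s, 20 s, 30 s, ...) so a briefly overloaded server gets some breathing room. If every attempt fails, the last exception is raised unchanged, so you still see the same error as in the log above.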