czbiohub-sf / MIDAS

Metagenomic Intra-Species Diversity Analysis (MIDAS)
MIT License

UHGG database download error #112

Closed (clairedubin closed 1 year ago)

clairedubin commented 1 year ago

Hi! I am trying to download the UHGG database for all species and I am receiving an error. I already ran midas2 database --init --midasdb_name uhgg --midasdb_dir /wynton/protected/scratch/clairedubin/midasdb_uhgg with no errors.

midas2 database --download --midasdb_name uhgg --midasdb_dir /wynton/protected/scratch/clairedubin/midasdb_uhgg --species all
1689099642.8:    Downloading MIDAS database for sliced species 3 with 12 cores in total::start
1689099642.8:    Downloading MIDAS database for sliced species 10 with 12 cores in total::start
1689099643.0:    Downloading MIDAS database for sliced species 4 with 12 cores in total::start
1689099643.1:    Downloading MIDAS database for sliced species 2 with 12 cores in total::start
1689099643.2:    Downloading MIDAS database for sliced species 1 with 12 cores in total::start
1689099643.5:    Downloading MIDAS database for sliced species 11 with 12 cores in total::start
1689099643.5:    Downloading MIDAS database for sliced species 8 with 12 cores in total::start
1689099643.6:    Downloading MIDAS database for sliced species 0 with 12 cores in total::start
1689099643.6:    Downloading MIDAS database for sliced species 7 with 12 cores in total::start
1689099643.7:    Downloading MIDAS database for sliced species 9 with 12 cores in total::start
1689099643.7:    Downloading MIDAS database for sliced species 5 with 12 cores in total::start
1689099643.7:    Downloading MIDAS database for sliced species 6 with 12 cores in total::start

Traceback (most recent call last):
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 28, in <module>
    main()
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/__main__.py", line 24, in main
    return subcommand_main(subcommand_args)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 148, in main
    download_midasdb(args)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 37, in download_midasdb
    download_midasdb_worker(args)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/database.py", line 91, in download_midasdb_worker
    midasdb.fetch_files("pangenome", species_id_list)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 167, in fetch_files
    return self.fetch_tarball(filename, list_of_species)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 192, in fetch_tarball
    md5_fetched = file_md5sum(_fetched_file)
  File "/wynton/protected/home/lynchlab/clairedubin/anaconda3/envs/midas2/lib/python3.7/site-packages/midas2/models/midasdb.py", line 341, in file_md5sum
    return md5(open(local_file, "rb").read()).hexdigest()
FileNotFoundError: [Errno 2] No such file or directory: '/wynton/protected/scratch/clairedubin/midasdb_uhgg/pangenomes_filtered/100007/centroids.ffn'
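
The last frame of the traceback opens the fetched file and reads it in full to compute the checksum, so a file that was never written surfaces as a bare FileNotFoundError. The following is a minimal sketch, not MIDAS2's actual code, of a checksum helper that reports a missing file explicitly and hashes in chunks. The name `safe_file_md5sum`, the `chunk_size` parameter, and the choice to return None for a missing file are assumptions made for illustration.

```python
import hashlib
import os


def safe_file_md5sum(local_file, chunk_size=1 << 20):
    """Return the MD5 hex digest of local_file, or None if it does not exist.

    Hypothetical helper for illustration; not part of MIDAS2.
    """
    if not os.path.isfile(local_file):
        # The caller can turn this into a clearer "download failed" message.
        return None
    digest = hashlib.md5()
    with open(local_file, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A caller could compare the return value against the expected checksum and treat None as "the file was never fetched" rather than letting the exception escape.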

Here is the output of ls /wynton/protected/scratch/clairedubin/midasdb_uhgg/:

chunks        
genomes.tsv 
markers_models 
metadata.tsv
gene_annotations  
markers      
md5sum.json     
pangenomes

So there is no pangenomes_filtered directory, but there is a pangenomes directory. I didn't have this error with an older version of MIDAS2, so I'm wondering if a recent update is having an issue with directory creation or naming. I also receive the same error when attempting to download for select species instead of all species.
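
The check described above (is the directory that the traceback refers to actually present?) can be sketched in a few lines of Python. This is only an illustration: the function name `missing_subdirs` is hypothetical, and the list of expected directory names is an assumption taken from the traceback and the ls output, not from the MIDAS2 source.

```python
from pathlib import Path


def missing_subdirs(midasdb_dir, expected=("pangenomes_filtered",)):
    """Return the names in `expected` that are not directories under midasdb_dir.

    Hypothetical helper; the expected names are assumptions, not MIDAS2's layout spec.
    """
    root = Path(midasdb_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, calling `missing_subdirs("/wynton/protected/scratch/clairedubin/midasdb_uhgg", ("pangenomes", "pangenomes_filtered"))` on the layout listed above would be expected to report only `pangenomes_filtered`, assuming `pangenomes` is a directory.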

zhaoc1 commented 1 year ago

Hi,

Yes, I recently updated the MIDAS2 repository to support the updated pangenome. However, I haven't officially updated the MIDASDB yet, so the MIDASDB download is out of date with respect to the new code. The rest of the MIDAS2 subcommands are unchanged. For now, please use the older version of MIDAS2 (https://github.com/czbiohub-sf/MIDAS2/releases/tag/v1.0.2). Thanks.

Chunyu

zhaoc1 commented 1 year ago

This issue is fixed in v1.0.7.