WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0

DRAM database setup fails even after skipping Uniref #221

Closed chahatupreti closed 1 year ago

chahatupreti commented 1 year ago

Hello,

I recently attended the DRAM talk given as part of the OSU microbiome seminar series and really wanted to try it out for my set of 127 bacterial isolate assemblies (not metagenomic).

I tried to set up the database, and it completed without an error, but running annotation failed with the error KeyError: 'MER0285325'.

I looked through this GitHub page, and the best solution offered seemed to be to redo the setup and annotation. I have been trying to do that, but every time the setup fails with errors like subprocess.CalledProcessError: Command '['wget', '-O', 'DRAM_data/database_files/viral.1.protein.faa.gz', 'ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz']' returned non-zero exit status 4. and FileNotFoundError: [Errno 2] No such file or directory: '/media/chahatupreti/Raw HT Sequencing Data Backup/DRAM/DRAM_data/uniref90.20220914.mmsdb_h'

Another solution offered was to try with fewer threads and to skip UniRef. My system has 48 threads (I have been using 46 threads for preparing the databases until this point) and 100 GB of memory. So I next ran the command DRAM-setup.py prepare_databases --output_dir DRAM_data --threads 4 --skip_uniref but still got the error FileNotFoundError: [Errno 2] No such file or directory: '/media/chahatupreti/Raw HT Sequencing Data Backup/DRAM/DRAM_data/uniref90.20220914.mmsdb_h'

Please suggest what I could do to resolve this. Thanks! Chahat

rmFlynn commented 1 year ago

Sorry, your problem is that you can't use FTP, nothing else. What version of DRAM are you on that UniRef is giving this error?

chahatupreti commented 1 year ago

Thanks, Rory. What command can I use to find the DRAM version? I looked a bit but couldn't find it. I installed DRAM using conda last week, BTW.

yugen-miyahara commented 1 year ago

In your conda environment you should be able to type "conda list", and all the packages in your environment and their versions should be listed.
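For a long package list, the conda list output can be filtered down to the one line of interest. The following is a minimal sketch, not an official DRAM command: the helper name dram_version_from_conda_list is made up for illustration, and it assumes the package is listed under the exact name "dram" with the version in the second column.

```shell
# Hypothetical helper: read `conda list`-style output on stdin and print
# the version (second column) of the package named exactly "dram".
dram_version_from_conda_list() {
    awk '$1 == "dram" { print $2 }'
}

# Typical use inside the activated environment:
#   conda list | dram_version_from_conda_list

# Demonstration on a sample line (values as reported later in this thread;
# the column spacing is illustrative).
sample='dram    1.3.5    pyhdfd78af_0    bioconda'
printf '%s\n' "$sample" | dram_version_from_conda_list
```

Running conda list dram should give a similar result without a helper, since conda list accepts a pattern to match package names against.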

chahatupreti commented 1 year ago

Thank you! Here is my DRAM version: Name: dram, Version: 1.3.5, Build: pyhdfd78af_0, Channel: bioconda

rmFlynn commented 1 year ago

Okay, there are a few things going on here. For starters, because you set up with UniRef initially, the config still has its location. This is a known issue, hopefully fixed in the next release. The easiest way to fix this is to reinstall DRAM. The second problem is that you may have an FTP block. A new release of DRAM is coming with the ability to get around FTP blocks, but in the meantime you may have to download that database, or several, yourself. Just follow the link and download the database. Typing DRAM-setup.py prepare_databases --help will show you the arguments to specify its location. You're looking for something like viral_loc as the flag. I'm away from the office at the moment and can't check the exact command. Here's the link location: ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz
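If FTP is blocked on the network, one possible workaround is to fetch the same file over HTTPS and hand it to the setup script. This is only a sketch under two assumptions: that NCBI serves the same path over https:// on the same host, and that the flag is spelled --viral_loc as suggested above (check the --help output first). The helper name ftp_to_https is made up for illustration. The download and setup commands are left as comments because they need network access and a DRAM install.

```shell
# Hypothetical helper: rewrite an ftp:// URL to https:// on the same host
# and path. Assumes the server exposes the same files over HTTPS.
ftp_to_https() {
    printf 'https://%s\n' "${1#ftp://}"
}

viral_ftp='ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz'
viral_https="$(ftp_to_https "$viral_ftp")"
echo "$viral_https"

# Then, on a machine with network access (not run here):
#   wget -O viral.1.protein.faa.gz "$viral_https"
#   DRAM-setup.py prepare_databases --help    # confirm the exact flag name
#   DRAM-setup.py prepare_databases --output_dir DRAM_data --threads 4 \
#       --skip_uniref --viral_loc viral.1.protein.faa.gz   # flag name assumed
```

The wget exit status 4 in the original error is documented as a network failure, which is consistent with a blocked FTP connection and not with a problem in the file itself.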

rmFlynn commented 1 year ago

You can also use DRAM-setup.py version to check the version.

rmFlynn commented 1 year ago

I am closing this because of inactivity, although you may just be occupied by other projects. You can re-open it if you have more problems and I should respond quickly!