Closed yulong0827 closed 1 year ago
This is failing because DRAM-setup.py hit an error when you ran it. That may have happened because the NCBI RefSeq FTP server was down, or because the server you are running on lost its internet connection during the run. I would recommend running the first command again to see if it finishes without an error this time. If it does, annotate should work.
Thank you very much for your reply. However, I tried again and got the same error: subprocess.CalledProcessError: Command '['wget', '-O', 'DRAM_data/database_files/viral.1.protein.faa.gz', 'ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz']' returned non-zero exit status 4. So I wonder, is there an alternative way to prepare the databases? Could this be a network-related error?
Can you try running that command directly and see if it works? The command is:
wget -O DRAM_data/database_files/viral.1.protein.faa.gz ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz
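For context on "exit status 4": the GNU Wget manual lists 4 as a network failure, which fits the FTP or connectivity explanation above. Below is a minimal sketch, not part of DRAM, that reruns the same wget command a few times and prints what each exit status means. The helper names (`describe_wget_status`, `download_with_retries`) and the retry settings are assumptions made for illustration.

```python
import subprocess
import time

# Exit statuses as documented in the GNU Wget manual ("Exit Status" section).
WGET_EXIT_STATUS = {
    0: "no problems occurred",
    1: "generic error",
    2: "parse error (command line or .wgetrc)",
    3: "file I/O error",
    4: "network failure",
    5: "SSL verification failure",
    6: "username/password authentication failure",
    7: "protocol error",
    8: "server issued an error response",
}


def describe_wget_status(code):
    """Translate a wget exit status into a short description."""
    return WGET_EXIT_STATUS.get(code, "unknown exit status")


def download_with_retries(url, output_file, attempts=3, wait_seconds=30,
                          runner=subprocess.run):
    """Hypothetical helper: rerun wget until it succeeds or attempts run out.

    Returns the exit status of the last attempt. `runner` exists only so the
    function can be exercised without network access.
    """
    code = None
    for attempt in range(1, attempts + 1):
        code = runner(["wget", "-O", output_file, url]).returncode
        print(f"attempt {attempt}: exit status {code} ({describe_wget_status(code)})")
        if code == 0:
            break
        if attempt < attempts:
            time.sleep(wait_seconds)  # give a flaky connection time to recover
    return code
```

Calling `download_with_retries` with the URL and output path from the error message would repeat the failing step on its own. If every attempt still ends in status 4, the cause is more likely the network or the FTP server than DRAM.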
Hi all, I'm having a similar issue that hasn't been resolved. I've run DRAM-setup.py twice with this batch script:
#!/bin/bash
##### Constructed by HPC everywhere #####
#SBATCH --mail-user=********@********.***
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --time=12:00:00
#SBATCH --mem=500gb
#SBATCH --partition=largememory
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --job-name=DRAM-database
#SBATCH --output=DRAM-database.out
#SBATCH --error=DRAM-database.err
###### Module commands #####
module load anaconda/python3.8/2020.07
###### Job commands go below this line #####
cd /N/slate/jakosmo
source activate DRAM
DRAM-setup.py prepare_databases --output_dir /N/slate/jakosmo/DRAM_data
DRAM-setup.py print_config
And the output text file displays this:
2022-01-23 10:34:07.057045: Database preparation started
Processed search databases
KEGG db: None
KOfam db: /N/slate/jakosmo/DRAM_data/kofam_profiles.hmm
KOfam KO list: /N/slate/jakosmo/DRAM_data/kofam_ko_list.tsv
UniRef db: /N/slate/jakosmo/DRAM_data/uniref90.20220122.mmsdb
Pfam db: /N/slate/jakosmo/DRAM_data/pfam.mmspro
dbCAN db: /N/slate/jakosmo/DRAM_data/dbCAN-HMMdb-V9.txt
RefSeq Viral db: /N/slate/jakosmo/DRAM_data/refseq_viral.20220123.mmsdb
MEROPS peptidase db: /N/slate/jakosmo/DRAM_data/peptidases.20220123.mmsdb
VOGDB db: /N/slate/jakosmo/DRAM_data/vog_latest_hmms.txt
Descriptions of search database entries
Pfam hmm dat: /N/slate/jakosmo/DRAM_data/Pfam-A.hmm.dat.gz
dbCAN family activities: /N/slate/jakosmo/DRAM_data/CAZyDB.07302020.fam-activities.txt
VOG annotations: /N/slate/jakosmo/DRAM_data/vog_annotations_latest.tsv.gz
Description db: /N/slate/jakosmo/DRAM_data/description_db.sqlite
DRAM distillation sheets
Genome summary form: /N/slate/jakosmo/DRAM_data/genome_summary_form.20220123.tsv
Module step form: /N/slate/jakosmo/DRAM_data/module_step_form.20220123.tsv
ETC module database: /N/slate/jakosmo/DRAM_data/etc_mdoule_database.20220123.tsv
Function heatmap form: /N/slate/jakosmo/DRAM_data/function_heatmap_form.20220123.tsv
AMG database: /N/slate/jakosmo/DRAM_data/amg_database.20220123.tsv
However, the error file contains this traceback, and it is preventing me from running DRAM-v.py:
Traceback (most recent call last):
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/bin/DRAM-setup.py", line 157, in <module>
args.func(**args_dict)
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 289, in prepare_databases
output_dbs['uniref_db_loc'] = download_and_process_uniref(uniref_loc, temporary, uniref_version=uniref_version,
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/lib/python3.9/site-packages/mag_annotator/database_processing.py", line 91, in download_and_process_uniref
make_mmseqs_db(uniref_fasta_zipped, uniref_mmseqs_db, create_index=True, threads=threads, verbose=verbose)
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/lib/python3.9/site-packages/mag_annotator/utils.py", line 38, in make_mmseqs_db
run_process(['mmseqs', 'createindex', output_loc, tmp_dir, '--threads', str(threads)], verbose=verbose)
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/lib/python3.9/site-packages/mag_annotator/utils.py", line 27, in run_process
return subprocess.run(command, check=check, shell=shell, stdout=subprocess.PIPE,
File "/N/u/jakosmo/Carbonate/.conda/envs/DRAM/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mmseqs', 'createindex', '/N/slate/jakosmo/DRAM_data/database_files/uniref90.20220123.mmsdb', '/N/slate/jakosmo/DRAM_data/database_files/tmp', '--threads', '10']' returned non-zero exit status 1.
I don't know whether manually downloading the viral protein database from RefSeq, as mentioned above, helped @yulong0827, but would trying the same thing with UniRef help? How would I go about that? Or is there perhaps another solution? Thanks in advance, please advise.
@jamesck2 From the same directory where you are running DRAM, try running that command directly:
mmseqs createindex /N/slate/jakosmo/DRAM_data/database_files/uniref90.20220123.mmsdb /N/slate/jakosmo/DRAM_data/database_files/tmp --threads 10
Then see what errors it prints.
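The traceback above reports only the exit status, not what mmseqs itself printed, which is why running the command by hand is useful. The same check can be done from Python. This is a minimal sketch: `run_and_report` is a hypothetical helper, not a DRAM function.

```python
import subprocess
import sys


def run_and_report(command):
    """Run a command and return (exit status, stderr text).

    On failure, the captured stderr is echoed so the tool's own error
    message is visible instead of just a CalledProcessError.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stderr_text = result.stderr.decode(errors="ignore")
    if result.returncode != 0:
        print(f"command failed with exit status {result.returncode}", file=sys.stderr)
        print(stderr_text, file=sys.stderr)
    return result.returncode, stderr_text
```

Passing the `mmseqs createindex ...` argument list from the error message to `run_and_report` should show the underlying message from mmseqs.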
Closing because of inactivity
Thanks first. This is the database setup. So, are UniRef, Pfam and dbCAN OK to use?
DRAM-setup.py prepare_databases --output_dir DRAM_data --threads 28
2021-04-27 21:32:56.852851: Database preparation started
9:55:06.911831: UniRef database processed
11:26:04.742593: PFAM database processed
11:26:24.184667: dbCAN database processed
Traceback (most recent call last):
File "/home/liuyulong/miniconda3/envs/dram/bin/DRAM-setup.py", line 146, in <module>
args.func(**args_dict)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/database_processing.py", line 467, in prepare_databases
verbose=verbose)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/database_processing.py", line 214, in download_and_process_viral_refseq
download_file(refseq_url, refseq_faa, verbose=verbose)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/utils.py", line 27, in download_file
run_process(['wget', '-O', output_file, url], verbose=verbose)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/utils.py", line 39, in run_process
stderr=stderr).stdout.decode(errors='ignore')
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['wget', '-O', 'DRAM_data/database_files/viral.1.protein.faa.gz', 'ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.1.protein.faa.gz']' returned non-zero exit status 4.
When annotating:
DRAM.py annotate -i 1.fa -o 1dram --threads 24
1 fastas found
2021-04-28 09:57:16.859237: Annotation started
Traceback (most recent call last):
File "/home/liuyulong/miniconda3/envs/dram/bin/DRAM.py", line 153, in <module>
args.func(**args_dict)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/annotate_bins.py", line 969, in annotate_bins_cmd
checkm_quality, rename_bins, keep_tmp_dir, low_mem_mode, threads, verbose)
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/annotate_bins.py", line 1000, in annotate_bins
db_handler = DatabaseHandler(db_locs['description_db'])
File "/home/liuyulong/miniconda3/envs/dram/lib/python3.6/site-packages/mag_annotator/database_handler.py", line 17, in __init__
raise ValueError('Database does not exist at path %s' % database_loc)
ValueError: Database does not exist at path None
Does this mean the database location is not set correctly? Thank you.
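The `None` in "Database does not exist at path None" suggests that no location was ever recorded for the description database, which fits a setup run that stopped before it finished. As a rough illustration, the sketch below scans a dictionary of database locations for entries that are unset or point to a missing path. The dictionary layout and the name `find_unusable_entries` are assumptions made for illustration, not DRAM's actual config format. `DRAM-setup.py print_config` shows the locations DRAM itself has recorded.

```python
import os


def find_unusable_entries(db_locs):
    """Return {name: reason} for entries that are unset or point at a missing path."""
    problems = {}
    for name, path in db_locs.items():
        if path is None:
            problems[name] = "not set"
        elif not os.path.exists(path):
            problems[name] = "path does not exist"
    return problems


# Hypothetical example mirroring the situation in the traceback:
# the description database location was never filled in.
example_locs = {"description_db": None}
print(find_unusable_entries(example_locs))
```

Any entry reported as "not set" points at a setup step that did not complete, so rerunning the database preparation is the natural next step.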