WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0
errors when prepare database #349

Open mengyuan-JI opened 1 month ago

mengyuan-JI commented 1 month ago

2024-05-24 23:31:45,375 - The subcommand ['hmmpress', '-f', 'DRAM_data/vog_latest_hmms.txt'] experienced an error: Error: File format problem in trying to open HMM file DRAM_data/vog_latest_hmms.txt. File exists, but appears to be empty?

Traceback (most recent call last):
  File "/home/bioinfo/bin/anaconda3/envs/DRAM/bin/DRAM-setup.py", line 186, in <module>
    args.func(**args_dict)
  File "/home/bioinfo/bin/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_processing.py", line 615, in prepare_databases
    processed_locs = process_functions[i](locs[i], output_dir, LOGGER,
  File "/home/bioinfo/bin/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/database_processing.py", line 377, in process_vogdb
    run_process(['hmmpress', '-f', vog_hmms], logger, verbose=verbose)
  File "/home/bioinfo/bin/anaconda3/envs/DRAM/lib/python3.10/site-packages/mag_annotator/utils.py", line 71, in run_process
    raise subprocess.SubprocessError(f"The subcommand {' '.join(command)} experienced an error, see the log for more info.")
subprocess.SubprocessError: The subcommand hmmpress -f DRAM_data/vog_latest_hmms.txt experienced an error, see the log for more info.

Susanl99 commented 1 month ago

I am facing the same problem! Does anybody know what's wrong?

quliping commented 4 weeks ago

Hi, I got the same error. I looked through the DRAM scripts and found the problem in 'database_processing.py', which 'DRAM-setup.py' relies on. As the first screenshot shows, the bug is in the function 'process_vogdb'. DRAM is meant to decompress 'vog.hmm.tar.gz', concatenate all VOG*.hmm files into one text file, 'vog_{version}_hmms.txt' ('version' is 'latest' in my run), and then run hmmpress -f vog_hmms, where 'vog_hmms' is that merged text file.

However, note the lines hmm_dir = path.join(output_dir, 'vogdb_hmms') and merge_files(glob(path.join(hmm_dir, 'VOG*.hmm')), vog_hmms). They mean DRAM expects the HMM files to sit directly inside hmm_dir, i.e. 'output_dir/vogdb_hmms/VOG00001.hmm', 'output_dir/vogdb_hmms/VOG00002.hmm', and so on. That matched older VOG releases, where the HMM files were packed into the 'tar.gz' directly. Starting with some VOG release (I don't know which one), the database maintainers first put all HMM files in a folder named 'hmm' and then compress that folder (see the second screenshot). For recent VOG releases the files therefore end up at 'output_dir/vogdb_hmms/hmm/VOG00001.hmm', 'output_dir/vogdb_hmms/hmm/VOG00002.hmm', and so on. DRAM's glob matches nothing, the merged file is left empty, and hmmpress reports the error above.

[image: the process_vogdb function in database_processing.py] [image: contents of the VOG HMM archive, with the files inside an 'hmm' folder]
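To confirm which layout your downloaded archive has before editing anything, you can list where its HMM members live. This is only a minimal sketch using Python's standard tarfile module; the function name hmm_parent_dirs is made up for illustration and is not part of DRAM.

```python
import tarfile
from pathlib import PurePosixPath


def hmm_parent_dirs(archive_path):
    """Return the set of folders (inside the archive) that hold VOG*.hmm files.

    Hypothetical helper, not part of DRAM. A result of {'.'} means the old
    flat layout; {'hmm'} means the newer layout with an 'hmm' subfolder.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        dirs = set()
        for name in tar.getnames():
            member = PurePosixPath(name)
            if member.name.startswith("VOG") and member.suffix == ".hmm":
                dirs.add(str(member.parent))
        return dirs
```

For example, calling hmm_parent_dirs('vog.hmm.tar.gz') on the archive DRAM downloaded should tell you whether the one-line change described in this comment applies to your copy.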

This problem can be solved by editing the script locally:
1. If you use a recent VOG release, change merge_files(glob(path.join(hmm_dir, 'VOG*.hmm')), vog_hmms) to merge_files(glob(path.join(hmm_dir, 'hmm', 'VOG*.hmm')), vog_hmms).
2. If you are not sure which VOG release you have, add a conditional check that handles both layouts, like this: [image]
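Since the screenshot is not readable as text, here is one way such a check could look. It is a sketch under assumptions, not the code from the screenshot and not an official DRAM patch: find_vog_hmms is a hypothetical helper name, and it only relies on glob and os.path, which the quoted DRAM lines already use.

```python
from glob import glob
from os import path


def find_vog_hmms(hmm_dir):
    """Return the VOG*.hmm files under hmm_dir, whichever layout was extracted.

    Hypothetical helper, not part of DRAM.
    """
    # Old layout: HMM files sit directly in hmm_dir.
    flat = glob(path.join(hmm_dir, 'VOG*.hmm'))
    if flat:
        return sorted(flat)
    # New layout: HMM files sit in an 'hmm' subfolder.
    nested = glob(path.join(hmm_dir, 'hmm', 'VOG*.hmm'))
    if not nested:
        # Fail early instead of writing an empty merged file for hmmpress.
        raise FileNotFoundError(f"No VOG*.hmm files found under {hmm_dir}")
    return sorted(nested)
```

Inside process_vogdb, the original call would then become merge_files(find_vog_hmms(hmm_dir), vog_hmms). Raising an error when nothing is found is a design choice of this sketch, so the failure points at the missing files rather than at hmmpress.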

PLEASE NOTE: I am not a DRAM developer, so I cannot guarantee that this modification will work, nor can I publish a fixed version on GitHub for you to install directly. You will have to modify the script yourself.

By the way, on my computer the file to edit is in '/public/apps/anaconda/envs/DRAM1.5.0/lib/python3.10/site-packages/mag_annotator'. '/public/apps/anaconda/envs/DRAM1.5.0' is my DRAM conda environment; replace it with the path of your own environment.
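If you are unsure where that folder is in your own environment, Python can report it. The snippet below is a small sketch using the standard importlib module; package_dir is a name chosen here for illustration, and it has to be run with the Python interpreter of your DRAM conda environment.

```python
import importlib.util
from pathlib import Path


def package_dir(package_name):
    """Return the install folder of a package, or None if it is not importable.

    Hypothetical helper; it does not import the package itself.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    # For a regular package, spec.origin is the path of its __init__.py.
    return Path(spec.origin).parent
```

With DRAM installed, package_dir('mag_annotator') / 'database_processing.py' should then point at the file to edit.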