linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

Aborted run following CGC-Finder start ? #102

Closed harfeorth closed 2 years ago

harfeorth commented 2 years ago

Hi, I've been using dbCAN for some years and I'm delighted you are continually improving an already excellent resource and tool.

I am running dbcan with the latest installation, 3.0.6, with CGC (the error is reproducible with the E. coli example provided; see the call below). Hmm, eCAMI and signalp appear to execute as expected, but the problem seems to come with the CGC-Finder routine. The error I get is the following:

------nohup output start-------
EscheriaColiK12MG1655.gff
*0. SIGNALP start***
*2. HMMER start***
*2. HMMER end*****
*3. eCAMI start*****
Using CAZyme db in eCAMI
total time:343.944249s
*3. eCAMI end*****
*****CGC-Finder start****
Traceback (most recent call last):
  File "/home/sa01dg/miniconda/envs/dbcan/bin/run_dbcan", line 10, in <module>
    sys.exit(cli_main())
  File "/home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/run_dbcan.py", line 675, in cli_main
    run(inputFile=args.inputFile, inputType=args.inputType, cluster=args.cluster, dbCANFile=args.dbCANFile,
  File "/home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/run_dbcan.py", line 236, in run
    runHmmScan(outPath, str(tf_cpu), dbDir, str(tf_eval), str(tf_cov), "tf-1")
  File "/home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/run_dbcan.py", line 38, in runHmmScan
    parsed_hmm_output = hmmscan_parser.run(input_file=f"{outPath}h{db_name}.out", eval_num=hmm_eval, coverage=hmm_cov)
  File "/home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/hmmscan_parser.py", line 43, in run
    if float(row[4]) <= eval_num and float(row[-1]) >= coverage:
TypeError: '<=' not supported between instances of 'float' and 'str'
[1]+ Exit 1 nohup run_dbcan EscheriaColiK12MG1655.faa protein --out_dir test --db_dir /media/data/CCAP/DB/dbcan_db --tools {hmmer,eCAMI} -c EscheriaColiK12MG1655.gff --use_signalP=TRUE --gram n > cz.log
---- nohup output end ------

The files produced by this aborted run are: eCAMI.out, hmmer.out, htf-1.out, signalp.neg, uniInput. However, with CGC-Finder and signalP turned off, the E. coli example completes and produces 'overview.txt' and the other expected output. Any thoughts would be most gratefully received.

Thank you for both your work with dbCAN and any help for the above. David

linnabrown commented 2 years ago

Hi David. Sorry for the problem. We will take a look at it ASAP.

1996xjm commented 2 years ago

It works after making minor modifications in file /home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/hmmscan_parser.py.

# line 43 in "/home/sa01dg/miniconda/envs/dbcan/lib/python3.9/site-packages/dbcan_cli/hmmscan_parser.py"
if float(row[4]) <= eval_num and float(row[-1]) >= coverage:
# modify the line above as below 
if float(row[4]) <= float(eval_num) and float(row[-1]) >= float(coverage):
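
For anyone wondering why the cast helps: the traceback shows run_dbcan.py calling runHmmScan with str(tf_eval) and str(tf_cov), so the thresholds seem to reach hmmscan_parser.run as strings, and Python 3 refuses to compare a float with a str. Below is a minimal standalone sketch of the same comparison, not the actual dbCAN code. The helper name passes_filter and the sample row are made up for illustration; only the column positions (E-value at index 4, coverage last) are taken from line 43 above.

```python
def passes_filter(row, eval_num, coverage):
    """Return True when a parsed hmmscan row meets both cutoffs.

    Hypothetical helper: `row` mimics one parsed hmmscan line, with the
    E-value at index 4 and the coverage in the last column.
    """
    # Cast the thresholds as well as the row values, because the caller
    # may hand them over as str (for example via str(tf_eval)).
    return float(row[4]) <= float(eval_num) and float(row[-1]) >= float(coverage)


# Illustrative row only; these values are invented for the sketch.
row = ["TF.hmm", "100", "gene_1", "250", "1e-20", "1", "95", "10", "200", "0.94"]

# The unpatched comparison raises TypeError when the threshold is a string.
try:
    float(row[4]) <= "1e-4"
    raised = False
except TypeError:
    raised = True

print(raised)                               # True
print(passes_filter(row, "1e-4", "0.35"))   # True: string thresholds now work
print(passes_filter(row, 1e-4, 0.35))       # True: numeric thresholds still work
print(passes_filter(row, "1e-30", "0.35"))  # False: E-value 1e-20 is above the cutoff
```

Casting inside the parser keeps it working whether the caller passes numbers or strings, which is presumably why the one-line change is enough.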

linnabrown commented 2 years ago

Let me take a look right now

linnabrown commented 2 years ago

Thanks @1996xjm, this solved the issue. I will fix this and release version 3.0.7 later. @harfeorth Thanks for your patience.

harfeorth commented 2 years ago

Thank you @linnabrown and @1996xjm, the 'float' solution above and v3.0.7 are working perfectly! Regards, David