linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

UnboundLocalError: cannot access local variable 'gene' where it is not associated with a value #155

Open tgurbich opened 10 months ago

tgurbich commented 10 months ago

Hello!

First of all, thank you very much for this great tool.

I have just updated run_dbcan to version 4.1.3 (I am using the biocontainers/bioconda version) and I am running into an error that appears to be caused by this bit of code: https://github.com/linnabrown/run_dbcan/blob/master/dbcan/cli/run_dbcan.py#L749 The code in the container differs from the code in the repo, but the gene variable seems to be undefined in both.

Here is the command I am running:

run_dbcan --dia_cpu 4 --hmm_cpu 4 --tf_cpu 4 --db_dir 4.1.3-V12 --out_dir results_4.1.3 --cgc_substrate --cluster BU_ATCC8492_noseq.gff BU_ATCC8492.faa protein

Here is the output:

BU_ATCC8492_noseq.gff

***************************1. DIAMOND start*************************************************

diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 4
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: results_4.1.3
#Target sequences to report alignments for: 1
Opening the database...  [0.128s]
Database: 4.1.3-V12/CAZy (type: Diamond database, sequences: 2816770, letters: 1352352375)
Block size = 2000000000
Building query seed set...  [0.174s]
Algorithm: Query-indexed
Building query histograms...  [0.031s]
Seeking in database...  [0s]
Loading reference sequences...  [5.205s]
Initializing temporary storage...  [0.121s]
Building reference histograms...  [9.286s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/1, shape 1/2.
Building reference seed array...  [5.59s]
Building query seed array...  [0.021s]
Computing hash join...  [0.682s]
Searching alignments...  [0.669s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2.
Building reference seed array...  [5.145s]
Building query seed array...  [0.018s]
Computing hash join...  [0.647s]
Searching alignments...  [0.67s]
Deallocating memory...  [0s]
Deallocating buffers...  [0.114s]
Clearing query masking...  [0s]
Computing alignments... Loading trace points...  [0.283s]
Sorting trace points...  [0.034s]
Computing alignments...  [8.996s]
Deallocating buffers...  [0.007s]
Loading trace points...  [0s]
 [9.363s]
Deallocating reference...  [0.1s]
Loading reference sequences...  [0s]
Deallocating buffers...  [0s]
Deallocating queries...  [0s]
Total time = 38.045s
Reported 396 pairwise alignments, 396 HSPs.
396 queries aligned.

***************************1. DIAMOND end***************************************************

***************************2. HMMER start*************************************************

***************************2. HMMER end***************************************************

***************************3. dbCAN_sub start***************************************************

total time: 622.9222900867462

***************************3. dbCAN_sub end***************************************************

No substrate for it ('GH3', '-')
No substrate for it ('GH2', '-')
No substrate for it ('CBM57', '3.2.1.31')
No substrate for it ('GH172', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH144', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH63', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH63', '-')
No substrate for it ('GH26', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GT28', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH144', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH144', '-')
No substrate for it ('GH43', '-')
No substrate for it ('CE20', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GH29', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH42', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH5', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH43', '-')
No substrate for it ('CBM48', '3.1.1.73')
No substrate for it ('CBM48', '3.2.1.55')
No substrate for it ('CBM48', '3.2.1.8')
No substrate for it ('CE1', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH171', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GH171', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH94', '-')
No substrate for it ('GH2', '-')
No substrate for it ('CE7', '-')
No substrate for it ('GH5', '-')
No substrate for it ('GH5', '-')
No substrate for it ('GH26', '-')
No substrate for it ('GH26', '3.2.1.151')
No substrate for it ('GH26', '3.2.1.4')
No substrate for it ('GH26', '3.2.1.8')
No substrate for it ('GH26', '-')
No substrate for it ('GH26', '-')
No substrate for it ('GH130', '-')
No substrate for it ('GH30', '-')
No substrate for it ('GH76', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH109', '-')
No substrate for it ('GH109', '-')
No substrate for it ('GH74', '-')
No substrate for it ('GH74', '-')
No substrate for it ('GH74', '-')
No substrate for it ('GH74', '-')
No substrate for it ('GH76', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH127', '-')
No substrate for it ('GH109', '-')
No substrate for it ('GH105', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH97', '-')
No substrate for it ('CE7', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH92', '3.2.1.114')
No substrate for it ('GH63', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH20', '-')
No substrate for it ('GH154', '-')
No substrate for it ('PL42', '-')
No substrate for it ('GH105', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH97', '3.2.1.88')
No substrate for it ('GH51', '-')
No substrate for it ('GH127', '-')
No substrate for it ('GH51', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH35', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GH24', '-')
No substrate for it ('CBM20', '2.4.1.25')
No substrate for it ('CBM20', '2.4.1.25')
No substrate for it ('GH77', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT83', '-')
No substrate for it ('GH18', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH84', '-')
No substrate for it ('GH20', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GH55', '-')
No substrate for it ('GH55', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH23', '-')
No substrate for it ('GH109', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '-')
No substrate for it ('PL27', '-')
No substrate for it ('PL27', '4.2.2.-')
No substrate for it ('CE12', '-')
No substrate for it ('CE4', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH28', '-')
No substrate for it ('GH28', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH30', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH115', '-')
No substrate for it ('GH18', '-')
No substrate for it ('GT2', '-')
No substrate for it ('CE4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GH73', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT3', '-')
No substrate for it ('GT35', '-')
No substrate for it ('GT35', '2.4.1.1')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GH66', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GH31', '3.2.1.11')
No substrate for it ('GH31', '2.4.1.-')
No substrate for it ('GH97', '-')
No substrate for it ('GT20', '-')
No substrate for it ('GT20', '2.4.1.15')
No substrate for it ('GH15', '-')
No substrate for it ('CBM48', '2.4.1.18')
No substrate for it ('GH108', '-')
No substrate for it ('GT51', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT1', '-')
No substrate for it ('GH33', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GH36', '-')
No substrate for it ('GT51', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT32', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT26', '-')
No substrate for it ('GH51', '-')
No substrate for it ('GH25', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH158', '-')
No substrate for it ('GH16', '-')
No substrate for it ('GT2', '-')
No substrate for it ('CE1', '-')
No substrate for it ('CE3', '-')
No substrate for it ('GH23', '-')
No substrate for it ('CBM48', '3.2.1.41')
No substrate for it ('CBM48', '3.2.1.1')
No substrate for it ('CBM48', '3.2.1.68')
No substrate for it ('CBM48', '3.2.1.-')
No substrate for it ('GH2', '-')
No substrate for it ('GH53', '-')
No substrate for it ('GH65', '-')
No substrate for it ('GH20', '-')
No substrate for it ('GH32', '-')
No substrate for it ('GH32', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH32', '-')
No substrate for it ('GH32', '-')
No substrate for it ('GH9', '-')
No substrate for it ('CE4', '-')
No substrate for it ('GH29', '-')
No substrate for it ('GH24', '-')
No substrate for it ('GT26', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH130', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH92', '3.2.1.114')
No substrate for it ('GH18', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH92', '3.2.1.114')
No substrate for it ('GT5', '-')
No substrate for it ('GH57', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GH133', '-')
No substrate for it ('GH116', '-')
No substrate for it ('GH20', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH33', '-')
No substrate for it ('GH116', '-')
No substrate for it ('CE11', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '3.2.1.6')
No substrate for it ('GT19', '-')
No substrate for it ('GH23', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH30', '-')
No substrate for it ('GH20', '-')
No substrate for it ('GT30', '-')
No substrate for it ('GT35', '-')
No substrate for it ('GT35', '2.4.1.1')
No substrate for it ('GH95', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GH5', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '-')
No substrate for it ('CBM48', '3.1.1.73')
No substrate for it ('CBM48', '3.2.1.55')
No substrate for it ('CBM48', '3.2.1.8')
No substrate for it ('CE1', '-')
No substrate for it ('CBM48', '3.1.1.73')
No substrate for it ('CBM48', '3.2.1.55')
No substrate for it ('CBM48', '3.2.1.8')
No substrate for it ('CE1', '-')
No substrate for it ('CE1', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GT11', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT9', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GH88', '-')
No substrate for it ('GH2', '-')
No substrate for it ('PL30', '-')
No substrate for it ('GH117', '-')
No substrate for it ('GH117', '-')
No substrate for it ('PL8', '-')
No substrate for it ('PL38', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH5', '-')
No substrate for it ('CBM67', '3.2.1.40')
No substrate for it ('GH78', '-')
No substrate for it ('GH78', '3.2.1.-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH43', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '3.2.1.6')
No substrate for it ('GH78', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH76', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH76', '-')
No substrate for it ('GH127', '-')
No substrate for it ('GH38', '-')
No substrate for it ('CBM32', '3.2.1.113')
No substrate for it ('CBM32', '3.2.1.-')
No substrate for it ('GH97', '-')
No substrate for it ('GH2', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH38', '-')
No substrate for it ('CBM32', '3.2.1.113')
No substrate for it ('CBM32', '3.2.1.-')
No substrate for it ('GH76', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH125', '-')
No substrate for it ('GH97', '-')
No substrate for it ('GH116', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH97', '-')
No substrate for it ('CE4', '-')
No substrate for it ('CE20', '-')
No substrate for it ('GH9', '-')
No substrate for it ('GH31', '-')
No substrate for it ('GT2', '-')
No substrate for it ('GT4', '-')
No substrate for it ('GT14', '-')
No substrate for it ('GH38', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH92', '-')
No substrate for it ('GH172', '-')
No substrate for it ('GH78', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH5', '-')
No substrate for it ('GH3', '-')
No substrate for it ('GH3', '-')
*****************************CGC-Finder start************************************
diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: results_4.1.3
#Target sequences to report alignments for: 1
Opening the database...  [0.016s]
Database: 4.1.3-V12/tcdb.dmnd (type: Diamond database, sequences: 14465, letters: 6343638)
Block size = 2000000000
Algorithm: Double-indexed
Building query histograms...  [0.115s]
Seeking in database...  [0s]
Loading reference sequences...  [0.02s]
Masking reference...  [1.467s]
Initializing temporary storage...  [0.102s]
Building reference histograms...  [0.512s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 1/4.
Building reference seed array...  [0.185s]
Building query seed array...  [0.042s]
Computing hash join...  [0.037s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.006s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 2/4.
Building reference seed array...  [0.202s]
Building query seed array...  [0.046s]
Computing hash join...  [0.038s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.005s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 3/4.
Building reference seed array...  [0.216s]
Building query seed array...  [0.049s]
Computing hash join...  [0.039s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.005s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 1/2, index chunk 4/4.
Building reference seed array...  [0.168s]
Building query seed array...  [0.038s]
Computing hash join...  [0.037s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.005s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 1/4.
Building reference seed array...  [0.174s]
Building query seed array...  [0.041s]
Computing hash join...  [0.038s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.004s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 2/4.
Building reference seed array...  [0.21s]
Building query seed array...  [0.045s]
Computing hash join...  [0.037s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.004s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 3/4.
Building reference seed array...  [0.217s]
Building query seed array...  [0.049s]
Computing hash join...  [0.037s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.004s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2, index chunk 4/4.
Building reference seed array...  [0.167s]
Building query seed array...  [0.038s]
Computing hash join...  [0.038s]
Masking low complexity seeds...  [0.001s]
Searching alignments...  [0.004s]
Deallocating memory...  [0s]
Deallocating buffers...  [0.001s]
Clearing query masking...  [0s]
Computing alignments... Loading trace points...  [0.031s]
Sorting trace points...  [0s]
Computing alignments...  [0.671s]
Deallocating buffers...  [0s]
Loading trace points...  [0s]
 [0.709s]
Deallocating reference...  [0s]
Loading reference sequences...  [0s]
Deallocating buffers...  [0s]
Deallocating queries...  [0s]
Total time = 5.361s
Reported 457 pairwise alignments, 457 HSPs.
457 queries aligned.
Traceback (most recent call last):
  File "/usr/local/bin/run_dbcan", line 10, in <module>
    sys.exit(cli_main())
             ^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dbcan/cli/run_dbcan.py", line 1105, in cli_main
    run_dbCAN(
  File "/usr/local/lib/python3.12/site-packages/dbcan/cli/run_dbcan.py", line 703, in run_dbCAN
    if gene in cazyme:
       ^^^^
UnboundLocalError: cannot access local variable 'gene' where it is not associated with a value

Would you mind taking a look please?
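
For readers unfamiliar with this exception, the following is a minimal sketch of how an UnboundLocalError like the one in the traceback can arise. It is not the run_dbcan source: the function names, the `ID=` row format, and the sample data are all hypothetical and chosen only to illustrate the pattern of a local variable that is assigned on one branch and read unconditionally afterwards.

```python
def buggy_lookup(rows, cazyme):
    """Hypothetical reduction of the failing pattern; not the run_dbcan code."""
    hits = []
    for row in rows:
        if row.startswith("ID="):
            gene = row[len("ID="):]  # `gene` is only bound on this branch
        if gene in cazyme:           # read even when the branch above never ran
            hits.append(gene)
    return hits


def fixed_lookup(rows, cazyme):
    """Same loop, but rows without an ID are skipped before `gene` is read."""
    hits = []
    for row in rows:
        if not row.startswith("ID="):
            continue
        gene = row[len("ID="):]
        if gene in cazyme:
            hits.append(gene)
    return hits


# Made-up input: the first row has no ID, so `gene` is unbound on first use.
rows = ["Name=foo", "ID=gene_1", "ID=gene_2"]
cazyme = {"gene_1"}

try:
    buggy_lookup(rows, cazyme)
except UnboundLocalError as exc:
    print("buggy:", type(exc).__name__)

print("fixed:", fixed_lookup(rows, cazyme))
```

Skipping the row early, or binding `gene` to a default before the branch, are two common ways to avoid this class of error. Which one the maintainers chose is not stated in this thread.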

linnabrown commented 10 months ago

Thank you for pointing this out. We found this issue yesterday and will update the package to 4.1.4 within the next two days.

In the meantime, you can use 4.1.2, which does not have this issue, or change the command to replace xxx.faa protein with xxx.fna meta.

Sorry for the inconvenience.
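
As a quick way to tell whether an environment has the release discussed here, the sketch below reads the installed version with the standard library's `importlib.metadata`. It assumes the distribution is registered under the name `dbcan`, which this thread does not confirm, and it treats only 4.1.3 as affected because that is the only release reported as failing here. The helper names are hypothetical.

```python
from importlib import metadata

# Only 4.1.3 is reported as affected in this thread; 4.1.2 is said to be fine.
AFFECTED_VERSIONS = {"4.1.3"}


def is_affected(version):
    """Return True if the version string is one the thread reports as failing."""
    return version in AFFECTED_VERSIONS


def installed_version(dist_name="dbcan"):
    """Return the installed version, or None if the distribution is absent.

    The default name "dbcan" is an assumption, not taken from the thread.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("distribution not found in this environment")
elif is_affected(version):
    print(f"{version} is the release reported as failing in this issue")
else:
    print(f"{version} is not reported as affected in this issue")
```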

tgurbich commented 10 months ago

Perfect! Thanks a lot!

linnabrown commented 10 months ago

Hi @tgurbich, we recently updated the package from 4.1.3 to 4.1.4. Thank you very much!