linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

Inquiry Regarding KeyError 'TC' in dbcan_utils CGC_abund Function #181

Open jiangys30 opened 4 months ago

jiangys30 commented 4 months ago

Report

Hello, I am a new user of dbcan. I encountered the error below while running the following command:

    dbcan_utils CGC_abund -bt Bv_WXH_abund/Bv_WXH.depth.txt -i dbcan_result -a TPM

Traceback (most recent call last):
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/bin/dbcan_utils", line 10, in <module>
    sys.exit(main())
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 619, in main
    PUL_abundance(args)
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 493, in PUL_abundance
    PUL_abund.output_cgc_abund()
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 462, in output_cgc_abund
    cgc_standard_records.append(self.cgcid2cgc_standard[cgcid])
KeyError: 'TC'

This error indicates that the self.cgcid2cgc_standard dictionary does not contain the key 'TC' when executing the output_cgc_abund function.
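
To illustrate the failure mode described above, here is a minimal sketch of a lookup that collects absent keys instead of stopping at the first `KeyError`. It is not dbcan code: the function name `split_known_and_missing` and the sample dictionary are hypothetical, and only the key `'TC'` is taken from the traceback.

```python
def split_known_and_missing(table, keys):
    """Look up each key in table; return (records, missing) instead of raising KeyError."""
    records, missing = [], []
    for key in keys:
        if key in table:
            records.append(table[key])
        else:
            missing.append(key)
    return records, missing


# Hypothetical data for illustration only; not the real contents of cgc_standard.out.
cgcid2record = {"contigA|CGC1": ["contigA", "CGC1"]}
records, missing = split_known_and_missing(cgcid2record, ["contigA|CGC1", "TC"])
print("missing keys:", missing)
```

Listing every missing key at once can show whether a single stray value (such as `'TC'`) or the whole set of IDs fails to match.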

Additionally, I encountered a similar error while running:

    dbcan_utils CGC_substrate_abund -bt Bv_WXH_abund/Bv_WXH.depth.txt -i dbcan_result -a TPM

Traceback (most recent call last):
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/bin/dbcan_utils", line 10, in <module>
    sys.exit(main())
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 622, in main
    PUL_Substrate_abundance(args)
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 500, in PUL_Substrate_abundance
    PUL_abund.Cal_PUL_Substrate_Abundance()
  File "/lustre1/g/aos_shihuang/tools/anaconda3/envs/dbcan/lib/python3.8/site-packages/dbcan/utils/utils.py", line 371, in Cal_PUL_Substrate_Abundance
    cgc_abunds = self.cgcid2seqabund[cgcid]  ### list constis of sequence abundance
KeyError: 'NODE_4_length_250273_cov_247.905346|CGC1'
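
The missing key here has the form `contig|CGCn`. One possible check, sketched below, is whether the contig part of each such ID appears among the sequence IDs in the abundance input. This is an assumption for illustration: the helper names are hypothetical, the shortened IDs are invented, and how dbcan_utils actually keys its abundance table is not stated in this report.

```python
def contig_of(cgcid):
    """Return the part of 'contig|CGCn' before the last '|', or the whole ID if there is no '|'."""
    contig, sep, _ = cgcid.rpartition("|")
    return contig if sep else cgcid


def cgcs_without_abundance(cgc_ids, abundance_ids):
    """List the CGC IDs whose contig part is absent from abundance_ids."""
    known = set(abundance_ids)
    return [cgcid for cgcid in cgc_ids if contig_of(cgcid) not in known]


# The CGC ID is copied from the traceback; the abundance IDs are hypothetical
# shortened names, as some assemblers or mappers might produce.
cgc_ids = ["NODE_4_length_250273_cov_247.905346|CGC1"]
abundance_ids = ["NODE_4", "NODE_7"]
print(cgcs_without_abundance(cgc_ids, abundance_ids))
```

If the list is non-empty, the two inputs were probably produced with different sequence names, which is plausible when some steps of the protocol are replaced with other software.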

Can anyone please guide me on how to resolve these issues? Attached is a screenshot of my input file, cgc_standard.out. I saw in a previous issue that the input file for dbcan_utils CGC_abund is cgc_standard.out, but I am unsure what the input should be for dbcan_utils CGC_substrate_abund.

[image: screenshot of cgc_standard.out]

Here is some additional information that might be helpful: I followed the protocol outlined in "Run from Raw Reads: Automated CAZyme and Glycan Substrate Annotation in Microbiomes: A Step-by-Step Protocol". However, I did not use all the software recommended in the tutorial. Because I had already developed parts of the workflow, I used alternative software for some steps. My workflow diagram is as follows:

[image: workflow diagram]

Version information

No response

ZhengJinfang1220 commented 4 months ago

Hi, the error message is weird. I have not encountered this bug before. Could you share the data with me (zhengjinfang1220@gmail.com)?