OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

KeyError when running ConFindr v0.8.1 on test dataset #49

Closed: pcrxn closed this issue 1 year ago

pcrxn commented 1 year ago
$ confindr -i test_samples/ -o test_out -fid '_1' -rid '_2'
  2023-05-31 13:39:33  Welcome to ConFindr 0.8.1! Beginning analysis of your samples... 
  2023-05-31 13:39:33  Did not find rMLST databases, if you want to use ConFindr on genera other than Listeria, Salmonella, and Escherichia, you'll need to download them. Instructions are available at https://olc-bioinformatics.github.io/ConFindr/install/#downloading-confindr-databases

  2023-05-31 13:39:33  Beginning analysis of sample SRX5084910_SRR8268082... 
  2023-05-31 13:39:33  Checking for cross-species contamination... 
  2023-05-31 13:39:35  Extracting conserved core genes... 
  2023-05-31 13:39:36  Quality trimming... 
  2023-05-31 13:39:36  Detecting contamination... 
  2023-05-31 13:39:36  Encountered error when attempting to run ConFindr on sample SRX5084910_SRR8268082. Skipping... 
  2023-05-31 13:39:36  Error encountered was:
Traceback (most recent call last):
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/confindr.py", line 103, in confindr
    debug=args.verbosity)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 1353, in find_contamination
    out, err = run_cmd(cmd)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 85, in run_cmd
    raise subprocess.CalledProcessError(p.returncode, cmd=cmd)
subprocess.CalledProcessError: Command 'kma -ipe test_out/SRX5084910_SRR8268082/SRX5084910_SRR8268082_baited_trimmed_R1.fastq.gz test_out/SRX5084910_SRR8268082/SRX5084910_SRR8268082_baited_trimmed_R2.fastq.gz -t_db /home/liam/.confindr_db/Escherichia_db_cgderived_kma -o test_out/SRX5084910_SRR8268082/SRX5084910_SRR8268082_kma -t 20' returned non-zero exit status 1

  2023-05-31 13:39:36  Beginning analysis of sample SRX5084911_SRR8268081... 
  2023-05-31 13:39:36  Checking for cross-species contamination... 
  2023-05-31 13:39:38  Extracting conserved core genes... 
  2023-05-31 13:39:39  Quality trimming... 
  2023-05-31 13:39:39  Detecting contamination... 
  2023-05-31 13:39:39  Encountered error when attempting to run ConFindr on sample SRX5084911_SRR8268081. Skipping... 
  2023-05-31 13:39:39  Error encountered was:
Traceback (most recent call last):
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/confindr.py", line 103, in confindr
    debug=args.verbosity)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 1353, in find_contamination
    out, err = run_cmd(cmd)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 85, in run_cmd
    raise subprocess.CalledProcessError(p.returncode, cmd=cmd)
subprocess.CalledProcessError: Command 'kma -ipe test_out/SRX5084911_SRR8268081/SRX5084911_SRR8268081_baited_trimmed_R1.fastq.gz test_out/SRX5084911_SRR8268081/SRX5084911_SRR8268081_baited_trimmed_R2.fastq.gz -t_db /home/liam/.confindr_db/Escherichia_db_cgderived_kma -o test_out/SRX5084911_SRR8268081/SRX5084911_SRR8268081_kma -t 20' returned non-zero exit status 1

  2023-05-31 13:39:39  Beginning analysis of sample SRX5084914_SRR8268078... 
  2023-05-31 13:39:39  Checking for cross-species contamination... 
  2023-05-31 13:39:41  Extracting conserved core genes... 
  2023-05-31 13:39:42  Quality trimming... 
  2023-05-31 13:39:42  Detecting contamination... 
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/liam/miniconda3/envs/confindr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/liam/miniconda3/envs/confindr/lib/python3.5/multiprocessing/pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 872, in read_contig
    base_fraction_cutoff=base_fraction_cutoff)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 695, in find_multibase_positions
    passing_snv_dict['congruent'][base] += count
KeyError: 'T'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/liam/miniconda3/envs/confindr/bin/confindr", line 11, in <module>
    load_entry_point('confindr', 'console_scripts', 'confindr')()
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/confindr.py", line 243, in main
    confindr(args)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/confindr.py", line 103, in confindr
    debug=args.verbosity)
  File "/home/liam/Desktop/confindr_test/bin/ConFindr/confindr_src/methods.py", line 1516, in find_contamination
    chunksize=1):
  File "/home/liam/miniconda3/envs/confindr/lib/python3.5/multiprocessing/pool.py", line 274, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/liam/miniconda3/envs/confindr/lib/python3.5/multiprocessing/pool.py", line 644, in get
    raise self._value
KeyError: 'T'
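
For readers following the last traceback: the `KeyError: 'T'` is raised by `passing_snv_dict['congruent'][base] += count`, and Python raises it whenever `+=` is applied to a dict key that has not been set yet. The sketch below reproduces that failure mode with made-up data. The variable names and counts are hypothetical and are not taken from ConFindr's source; the `defaultdict` variant is only one possible defensive pattern, not a change that was made in this thread.

```python
from collections import defaultdict

# Hypothetical base counts at a single position (illustrative only).
base_counts = {"A": 12, "T": 3}

# A plain dict that was initialised for only some of the bases.
congruent = {"A": 0}

missing_key = None
try:
    for base, count in base_counts.items():
        # "+=" reads the existing value first, so an absent key raises KeyError.
        congruent[base] += count
except KeyError as err:
    missing_key = err.args[0]

# Defensive alternative (an assumption, not ConFindr's code):
# defaultdict(int) treats any missing key as 0.
safe_congruent = defaultdict(int)
for base, count in base_counts.items():
    safe_congruent[base] += count
```

Here `missing_key` ends up holding the key that was never initialised, while the `defaultdict` version accumulates every base without raising.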

pcrxn commented 1 year ago

The Bioconda recipe for ConFindr v0.8.1 doesn't pin the Biopython version, and it requires Python>=3.0 instead of Python>=3.9.5 (the tracebacks above show the environment running Python 3.5). Other package versions are older, too.
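
A quick way to check whether an installed environment meets the Python requirement mentioned above is sketched here. The helper names `python_ok` and `biopython_version` are hypothetical, and because this thread does not state which Biopython version is required, the sketch only reports the installed version rather than validating it. It avoids f-strings so that it still runs on an old interpreter such as Python 3.5.

```python
import sys
from importlib import import_module

# Minimum Python version, taken from the comment above.
MIN_PYTHON = (3, 9, 5)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter is at least the minimum version."""
    return tuple(version_info[:3]) >= minimum


def biopython_version():
    """Return the installed Biopython version string, or None if absent."""
    try:
        return import_module("Bio").__version__
    except ImportError:
        return None


if __name__ == "__main__":
    print("Python {}.{}.{} meets minimum: {}".format(
        sys.version_info[0], sys.version_info[1], sys.version_info[2],
        python_ok()))
    print("Biopython version: {}".format(biopython_version()))
```

Run inside the activated conda environment, this prints the interpreter version and whether it satisfies the minimum, followed by the Biopython version (or `None` if it is not installed).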

pcrxn commented 1 year ago

Resolved by https://github.com/bioconda/bioconda-recipes/pull/41438.