OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

Error when attempting to run ConFindr on sample #21

Closed connerse closed 1 year ago

connerse commented 3 years ago

Hi there

When trying to run Illumina reads through ConFindr, the following error occurs:

Error encounted was: Traceback (most recent call last):
  File "/Users/boselab/miniconda3/lib/python3.8/site-packages/confindr_src/confindr.py", line 962, in confindr
    find_contamination(pair=pair,
  File "/Users/boselab/miniconda3/lib/python3.8/site-packages/confindr_src/confindr.py", line 587, in find_contamination
    out, err, cmd = bbtools.bbduk_bait(reference=sample_database,
  File "/Users/boselab/miniconda3/lib/python3.8/site-packages/confindr_src/wrappers/bbtools.py", line 258, in bbduk_bait
    out, err = run_subprocess(cmd)
  File "/Users/boselab/miniconda3/lib/python3.8/site-packages/confindr_src/wrappers/bbtools.py", line 16, in run_subprocess
    raise subprocess.CalledProcessError(x.returncode, cmd=command)
subprocess.CalledProcessError: Command 'bbduk.sh in=AB38/AB38_HYTN5DSXX_AGACCGAATG-CTGGTCGTTC_L002_R1.fastq.gz in2=AB38/AB38_HYTN5DSXX_AGACCGAATG-CTGGTCGTTC_L002_R2.fastq.gz outm=AB38_results/AB38_HYTN5DSXX_AGACCGAATG-CTGGTCGTTC_L002/rmlst_R1.fastq.gz outm2=AB38_results/AB38_HYTN5DSXX_AGACCGAATG-CTGGTCGTTC_L002/rmlst_R2.fastq.gz ref=/Users/boselab/.confindr_db/Rhodomicrobium_db.fasta threads=4' returned non-zero exit status 1.

I am using v. 0.7.0 on MacOSX.

Thank you!

adamkoziol commented 3 years ago

Hi Eric,

Sorry for the delay in responding - I've been on leave.

Are you still having this issue? If so, are you able to upgrade to 0.7.4? There are a few fixes in there that might help.

Have you been able to successfully run ConFindr on the example dataset (details can be found here: https://github.com/OLC-Bioinformatics/ConFindr#quickstart)?

If you're still having issues, downgrading Python to 3.7 might help.

In the meantime, are you willing to share your FASTQ files and database, so that I might try some fixes on my end?

SithWijesinghe commented 2 years ago

Hi @adamkoziol, I think I'm getting a similar error, but on version 0.7.4. I'm using an HPC running Ubuntu 20.04.2 with Python 3.7.10, so I'm unsure whether this belongs in this issue or should be a new one. What do you reckon?

Command:

confindr -i $READDIR -o $OUTDIR -t $THREADS --Xmx 32g

When I tried the example dataset with the same command, it worked (214 SNVs detected).

When I tried it on a paired-end sample (MS14811_R1.fastq.gz and MS14811_R2.fastq.gz) with the same command, I got the error below. I tried with the -fid/-rid flags and without the -Xmx flag, but it made no difference.

2021-07-16 17:22:58 Beginning analysis of sample MS14811...
2021-07-16 17:22:58 Checking for cross-species contamination...
2021-07-16 17:23:27 Extracting conserved core genes...
2021-07-16 17:23:30 Encountered error when attempting to run ConFindr on sample MS14811. Skipping...
2021-07-16 17:23:30 Error encounted was: Traceback (most recent call last):
  File "/home.roaming/s4187725/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 1067, in confindr
    fasta=args.fasta)
  File "/home.roaming/s4187725/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/confindr.py", line 638, in find_contamination
    returncmd=True)
  File "/home.roaming/s4187725/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/wrappers/bbtools.py", line 258, in bbduk_bait
    out, err = run_subprocess(cmd)
  File "/home.roaming/s4187725/.conda/envs/confindr/lib/python3.7/site-packages/confindr_src/wrappers/bbtools.py", line 16, in run_subprocess
    raise subprocess.CalledProcessError(x.returncode, cmd=command)
subprocess.CalledProcessError: Command 'bbduk.sh in=/home.roaming/s4187725/test/confindr/MS14811_R1.fastq.gz in2=/home.roaming/s4187725/test/confindr/MS14811_R2.fastq.gz outm=/home.roaming/s4187725/test/confindr/out/MS14811/rmlst_R1.fastq.gz outm2=/home.roaming/s4187725/test/confindr/out/MS14811/rmlst_R2.fastq.gz ref=/home.roaming/s4187725/.confindr_db/Escherichia_db_cgderived.fasta threads=16 Xmx=32g' returned non-zero exit status 1.
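
The traceback above only reports that bbduk.sh returned exit status 1; it does not include BBDuk's own output, which is usually where the actual reason is stated. A minimal sketch for re-running the failing command outside ConFindr and printing what it wrote, assuming bbduk.sh is on the PATH of the same environment. The helper name `run_and_capture` is hypothetical and not part of ConFindr:

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command given as a list of arguments.

    Returns (exit status, stdout text, stderr text) without raising
    on a non-zero exit status, so the tool's own messages stay visible.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Pass the command from the CalledProcessError message as arguments,
        # e.g. bbduk.sh in=... in2=... outm=... outm2=... ref=... threads=...
        code, out, err = run_and_capture(sys.argv[1:])
        print("exit status:", code)
        print("--- stdout ---")
        print(out)
        print("--- stderr ---")
        print(err)
    else:
        print("usage: pass the failing command and its arguments")
```

Saved under any file name, the script is called with the bbduk.sh command copied from the error message as its arguments. Posting the captured output alongside the traceback should make the failure easier to diagnose.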

SithWijesinghe commented 2 years ago

OK, I have an update. I used a cleaned-up read set from the same sample (trimmed with Trimmomatic, with reads from other species removed using Kraken2), and ConFindr worked on it. ConFindr still threw the same error for the raw FASTQ files, which I had also included in the analysis, but I have a result for the cleaned-up files, which I'm happy with.

Sample,Genus,NumContamSNVs,ContamStatus,PercentContam,PercentContamStandardDeviation,BasesExamined,DatabaseDownloadDate
MS14811.S2_cleaned,Escherichia,0,False,0,0,38310,ND
MS14811.S2_raw,Error processing sample,0,False,ND,ND,0,ND
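
Note that in the report above the failed sample still shows ContamStatus as False, so a failed run can be mistaken for a clean one if only that column is read. A minimal sketch for listing the samples that errored, assuming the failure marker always appears in the Genus column as "Error processing sample", as it does here. The function name `failed_samples` is hypothetical:

```python
import csv
import io


def failed_samples(report_text):
    """Return the names of samples whose Genus column holds the error marker."""
    reader = csv.DictReader(io.StringIO(report_text))
    return [row["Sample"] for row in reader
            if row["Genus"] == "Error processing sample"]


# The report rows quoted in this thread.
report = """Sample,Genus,NumContamSNVs,ContamStatus,PercentContam,PercentContamStandardDeviation,BasesExamined,DatabaseDownloadDate
MS14811.S2_cleaned,Escherichia,0,False,0,0,38310,ND
MS14811.S2_raw,Error processing sample,0,False,ND,ND,0,ND
"""

print(failed_samples(report))  # prints ['MS14811.S2_raw']
```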