OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

input fasta #19

Closed mdabalkey closed 3 years ago

mdabalkey commented 3 years ago

When running ConFindr with --fasta, I get the following error. Running FASTQ files works fine. Is ConFindr designed to work on assembled files? It has the --fasta option, so it seems like it should, but it doesn't work.

--fasta If activated, will look for FASTA files instead of FASTQ for unpaired reads.

Traceback (most recent call last):
  File "/nfs/software/apps/ConFindr/0.7.2/lib/python3.7/site-packages/confindr_src/confindr.py", line 1045, in confindr
    min_matching_hashes=min_matching_hashes)
  File "/nfs/software/apps/ConFindr/0.7.2/lib/python3.7/site-packages/confindr_src/confindr.py", line 767, in find_contamination
    out, err = run_cmd(cmd)
  File "/nfs/software/apps/ConFindr/0.7.2/lib/python3.7/site-packages/confindr_src/confindr.py", line 33, in run_cmd
    raise subprocess.CalledProcessError(p.returncode, cmd=cmd)
subprocess.CalledProcessError: Command 'bbmap.sh ref=output_assembly/SRR2038680/rmlst.fasta in=output_assembly/SRR2038680/trimmed.fastq.gz out=output_assembly/SRR2038680/out_2.bam threads=20 mdtag nodisk' returned non-zero exit status 1.
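The traceback above reports only the exit status, not what bbmap.sh itself printed, which makes the failure hard to diagnose. As a minimal sketch (not ConFindr's actual `run_cmd`; the name `run_cmd_verbose` is hypothetical), a subprocess wrapper can attach the captured stdout and stderr to the exception so the underlying tool's message is available:

```python
import subprocess


def run_cmd_verbose(cmd):
    """Run a shell command string and return (stdout, stderr).

    Hypothetical helper, not part of ConFindr. On a non-zero exit status it
    raises CalledProcessError with the captured output attached, so the
    caller can see what the external tool actually complained about.
    """
    p = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if p.returncode != 0:
        raise subprocess.CalledProcessError(
            p.returncode, cmd=cmd, output=p.stdout, stderr=p.stderr
        )
    return p.stdout, p.stderr
```

Catching the exception and printing `e.stderr` would then show the message from the failing tool rather than just "returned non-zero exit status 1". Re-running the failing bbmap.sh command by hand in a terminal gives the same information without any code changes.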

adamkoziol commented 3 years ago

I've recently updated ConFindr to version 0.7.3 on Bioconda. It contains a few fixes for handling FASTA files. Are you willing to try the new version to see if it addresses your issue?

mdabalkey commented 3 years ago

I tried the new version but I got a very similar error.

Traceback (most recent call last):
  File "/nfs/software/apps/ConFindr/0.7.3/confindr_src/confindr.py", line 1051, in confindr
    find_contamination(pair=fastq,
  File "/nfs/software/apps/ConFindr/0.7.3/confindr_src/confindr.py", line 673, in find_contamination
    out, err, cmd = bbtools.bbduk_trim(forward_in=os.path.join(sample_tmp_dir, 'rmlst.fastq.gz'),
  File "/nfs/software/apps/ConFindr/0.7.3/confindr_src/wrappers/bbtools.py", line 108, in bbduk_trim
    out, err = run_subprocess(cmd)
  File "/nfs/software/apps/ConFindr/0.7.3/confindr_src/wrappers/bbtools.py", line 16, in run_subprocess
    raise subprocess.CalledProcessError(x.returncode, cmd=command)
subprocess.CalledProcessError: Command 'bbduk.sh in=results_assembly/SRR2038680/rmlst.fastq.gz out=results_assembly/SRR2038680/trimmed.fastq.gz qtrim=w trimq=20 k=25 minlength=50 forcetrimleft=15 ref=adapters overwrite hdist=1 tpe tbo threads=40' returned non-zero exit status 1.

adamkoziol commented 3 years ago

Based on the supplied traceback, are you using an assembly of SRR2038680? I'll try to run that through ConFindr and debug it on my end.

mdabalkey commented 3 years ago

Yes, I assembled with SKESA and ran it with: confindr.py -i SRR2038680_assembly -o assembly_results --fasta -d path/to/database

mdabalkey commented 3 years ago

I know it works with fastq files but I want to make sure that the functionality for fasta files works. If it does, I will have to check my environment. Would it be possible to confirm that the tool works with fasta files? Thank you!

adamkoziol commented 3 years ago

It is supposed to work with FASTA files. I've been trying to resolve your issue on my end. Making some progress.

adamkoziol commented 3 years ago

It looks like the issue was caused by a bug: when the -Xmx flag was not included, ConFindr looked for the wrong input file. Version 0.7.4 has been uploaded to Bioconda and includes the fix. Alternatively, if you stay on version 0.7.3, you can include the -Xmx flag with an appropriate memory value to get it to work.

Please let me know if this addresses the issue.
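For anyone who has to stay on 0.7.3, the -Xmx workaround could be scripted roughly as below. This is only a sketch: the helper `build_confindr_cmd` is hypothetical, the paths are the ones from the command posted earlier in this thread, and the memory value `20g` is an assumed example (check `confindr.py --help` for the accepted format).

```python
import shlex


def build_confindr_cmd(input_dir, output_dir, database, fasta=True, xmx=None):
    """Assemble a confindr.py argument list (hypothetical helper)."""
    cmd = ["confindr.py", "-i", input_dir, "-o", output_dir]
    if fasta:
        # Look for FASTA files instead of FASTQ.
        cmd.append("--fasta")
    cmd += ["-d", database]
    if xmx is not None:
        # Explicit memory value for the workaround; "20g" is an assumed format.
        cmd += ["-Xmx", xmx]
    return cmd


cmd = build_confindr_cmd(
    "SRR2038680_assembly", "assembly_results", "path/to/database", xmx="20g"
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list can be passed straight to `subprocess.run(cmd, check=True)` once the printed command looks right.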

mdabalkey commented 3 years ago

Thank you for checking on this issue. The new version works for FASTA files.