benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

ITS filterAndTrim: No reads passed the filter + Denoising Errors dadaFs/Rs <- dada() #1795

Closed: jordan-dwm closed this issue 4 months ago

jordan-dwm commented 1 year ago

Hi there! I hope you are doing well. I apologize if this has already been asked, but I am having trouble proceeding with the processing of my ITS sequences with DADA2.

> packageVersion("dada2")
[1] ‘1.28.0’

Admittedly, the quality scores for my samples are not great, as it was difficult retrieving fungal ITS sequences from leaf samples.

When I get to the filterAndTrim step, I see a warning that for some of my samples, "No reads passed the filter":

> out <- filterAndTrim(cutFs, filtFs, cutRs, filtRs, maxN = 0, maxEE = c(2, 2), truncQ = 2,
+     minLen = 50, rm.phix = TRUE, compress = TRUE, multithread = FALSE)  # on windows, set multithread = FALSE

The filter removed all reads: C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_122_i7---IDT_i5_9.CABO_112_ITS_R1.fastq and C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_122_i7---IDT_i5_9.CABO_112_ITS_R2.fastq not written.
The filter removed all reads: C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_123_i7---IDT_i5_9.CABO_124_ITS_R1.fastq and C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_123_i7---IDT_i5_9.CABO_124_ITS_R2.fastq not written.
The filter removed all reads: C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_130_i7---IDT_i5_10.CABO_113_ITS_R1.fastq and C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_130_i7---IDT_i5_10.CABO_113_ITS_R2.fastq not written.
The filter removed all reads: C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_153_i7---IDT_i5_12.CABO_008_ITS_R1.fastq and C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_153_i7---IDT_i5_12.CABO_008_ITS_R2.fastq not written.
Warning: cannot remove file 'C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_153_i7---IDT_i5_12.CABO_008_ITS_R1.fastq', reason 'No such file or directory'
Warning: cannot remove file 'C:\Users\jwils\OneDrive\Documents\Grad_Studies\Microbial_Leaf_Spec_Project\microbe_data\MiSeq_data\MiSeq_ITS1\Phenology\cutadapt\filtered\MI.M05812_0316.001.FLD_ill_153_i7---IDT_i5_12.CABO_008_ITS_R2.fastq', reason 'No such file or directory'
Some input samples had no reads pass the filter.

When I look at the R1/R2 (forward/reverse) FASTQ files for these samples, I notice that some of them contain very few sequences. For example, sample 112 above has only 4 sequences in both the forward and reverse FASTQ files. My issue is that when I get to the denoising step, I receive an error because some samples have no reads:

> dadaFs <- dada(filtFs, err=errF, multithread=FALSE)
Error in dada(filtFs, err = errF, multithread = FALSE) : 
  Some of the filenames provided do not exist. This may have happened because some samples had zero reads after filtering.
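
For anyone who wants to confirm how few reads a file holds before filtering, here is a minimal base-R sketch. The helper name `count_fastq_reads` and the toy file are hypothetical (not part of dada2), and the sketch assumes a plain FASTQ file with exactly four lines per record:

```r
# Hypothetical helper: count records in a FASTQ file,
# assuming the standard unwrapped layout of 4 lines per read.
count_fastq_reads <- function(path) {
  n_lines <- length(readLines(path))
  stopifnot(n_lines %% 4 == 0)  # a partial record suggests a truncated file
  n_lines / 4
}

# Toy FASTQ with two reads, written to a temporary file for illustration
toy <- tempfile(fileext = ".fastq")
writeLines(c("@read1", "ACGT", "+", "IIII",
             "@read2", "TTGA", "+", "IIII"), toy)

n_reads <- count_fastq_reads(toy)
print(n_reads)
```

Samples with only a handful of input reads like this are the ones most likely to end up with nothing after filtering.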

I have tried adjusting the filterAndTrim parameters to loosen restrictions, but that did not help. Also note that the reason I have 'multithread = FALSE' is because I am using Windows and it doesn't seem to work otherwise.

Are you able to help me find a solution to bypass this error, or do you have any tips? I don't think I can proceed with this error in denoising. Thanks so much for your time, it is highly appreciated! :)

benjjneb commented 1 year ago

If it is just a few samples with very few (bad) reads, then this can simply be worked around. That said, it is worth checking in the out matrix that this is true. Also, you seem to have some extra Warning text at the end of the filterAndTrim output that I don't quite understand.
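
A rough sketch of that check: filterAndTrim returns a matrix with `reads.in` and `reads.out` columns and one row per input file. The matrix below is a mock with made-up file names and counts, used only so the snippet runs without the real data:

```r
# Mock of the matrix returned by filterAndTrim(); names and counts are invented
out <- matrix(c(4,     0,
                15230, 12874,
                6,     0),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("sampleA_R1.fastq", "sampleB_R1.fastq", "sampleC_R1.fastq"),
                              c("reads.in", "reads.out")))

# Samples that lost every read during filtering
lost <- out[, "reads.out"] == 0
lost_samples <- rownames(out)[lost]
print(lost_samples)

# Check that the lost samples really did start with very few reads
print(out[lost, , drop = FALSE])

# Percentage of reads retained per sample, to spot over-aggressive filtering
pct_kept <- round(100 * out[, "reads.out"] / out[, "reads.in"], 1)
print(cbind(out, pct.kept = pct_kept))
```

If the zero-read rows all have tiny `reads.in` values, dropping them is probably harmless; if a sample with many input reads ends up at zero, the filtering parameters deserve a second look.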

The workaround is easy though:

# Keep only the samples whose filtered files were actually written
passed.filtering <- file.exists(filtFs)
filtFs <- filtFs[passed.filtering]
filtRs <- filtRs[passed.filtering]

Then just move forward as normal, with only the samples with >0 filtered reads included.
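
For illustration, here is a self-contained sketch of the same subsetting on throwaway files in a temporary directory. The sample names and file names are invented, and checking the reverse files as well is an extra precaution assumed here, not something the workaround above requires:

```r
# Set up hypothetical filtered-file paths for three samples
filt_dir <- file.path(tempdir(), "filtered_demo")
dir.create(filt_dir, showWarnings = FALSE)
samples <- c("s1", "s2", "s3")
filtFs <- file.path(filt_dir, paste0(samples, "_F_filt.fastq.gz"))
filtRs <- file.path(filt_dir, paste0(samples, "_R_filt.fastq.gz"))
names(filtFs) <- samples
names(filtRs) <- samples

# Simulate s2 losing all its reads: only s1 and s3 get files on disk
file.create(filtFs[c("s1", "s3")], filtRs[c("s1", "s3")])

# Keep a sample only if both its forward and reverse files exist,
# applying the same logical index to both vectors so they stay paired
passed.filtering <- file.exists(filtFs) & file.exists(filtRs)
filtFs <- filtFs[passed.filtering]
filtRs <- filtRs[passed.filtering]

print(names(filtFs))
```

Because one logical vector indexes both `filtFs` and `filtRs`, the forward and reverse files remain matched sample by sample for the later dada and merging steps.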

jordan-dwm commented 1 year ago

Thanks so much for your help! I appreciate that. I figured since it was so few, I could just work around them, but I wanted to be sure from the expert. Take care!