benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

Mismatched forward and reverse sequence files #983

Closed tangmaomao16 closed 4 years ago

tangmaomao16 commented 4 years ago

I am getting the following error:

DADA2 R package version: 1.6.0
1) Filtering
Error in filterAndTrim(unfiltsF, filtsF, unfiltsR, filtsR, truncLen = c(truncLenF,  :
  These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 60683, 50697.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0,  :
  Mismatched forward and reverse sequence files: 50697, 60683.

I find that the relevant raw-data file names are "raw.split.H03.1.fq", "raw.split.H03_1.2.fq", "raw.split.H03_1.1.fq", and "raw.split.H03.2.fq". My sequencing samples are "H03" and "H03_1". Does that matter?

In what situation does DADA2 output this error? How can I avoid it if I want to preserve my samples' original names, such as "H03" and "H03_1"?

benjjneb commented 4 years ago

The fact that you have mirrored mismatched read counts being shown in your error messages -- 60683, 50697 then 50697, 60683 -- suggests that your unfiltsF and unfiltsR are not arranged in the same order. That is, you are mismatching your forward and reverse sequence files by having them in different orders. You can confirm this by inspecting:

cbind(R1=unfiltsF, R2=unfiltsR)

You can potentially fix it by sorting each, i.e. unfiltsF <- sort(unfiltsF).
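As a rough sketch of that check, the base R snippet below pairs files by a derived sample key rather than sorting each vector independently, which is a slightly different technique from the plain sort suggested above. The file names are taken from this thread, but their initial order and the `sample_key` helper are hypothetical, and the `.1.fq` / `.2.fq` suffix pattern is an assumption about the naming scheme.

```r
# Hypothetical, deliberately misordered file vectors using names from this thread.
unfiltsF <- c("raw.split.H03_1.1.fq", "raw.split.H03.1.fq")
unfiltsR <- c("raw.split.H03.2.fq", "raw.split.H03_1.2.fq")

# Hypothetical helper: strip the read-direction suffix (".1.fq" or ".2.fq")
# so forward and reverse files of the same sample share one key.
sample_key <- function(paths) sub("\\.[12]\\.fq$", "", basename(paths))

# Order the forward files by key (radix ordering uses C-locale collation, so the
# result does not depend on the session locale), then line up the reverse files.
unfiltsF <- unfiltsF[order(sample_key(unfiltsF), method = "radix")]
unfiltsR <- unfiltsR[match(sample_key(unfiltsF), sample_key(unfiltsR))]

# Stop early if any forward file has no partner or the keys still disagree.
stopifnot(!anyNA(unfiltsR),
          identical(sample_key(unfiltsF), sample_key(unfiltsR)))

print(cbind(R1 = unfiltsF, R2 = unfiltsR))
```

If the `stopifnot` call fails, at least one forward file has no reverse partner under this naming assumption, which would point to a missing file rather than an ordering problem.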

tangmaomao16 commented 4 years ago

@benjjneb, I use the qiime2 pipeline. The commands are:

source activate qiime2-2018.11

qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trunc-len-f 290 --p-trunc-len-r 256 --p-trim-left-f 26 --p-trim-left-r 26 --o-representative-sequences rep-seqs-dada2.qza --o-table table-dada2.qza --p-n-threads 0 --o-denoising-stats stats-dada2.qza --verbose

Here is the problem: I don't use the DADA2 source code directly. So how can I fix my bug? Where can I find the DADA2 source code? Or is there an API in the qiime2 pipeline that lets me adjust how DADA2 is run?

benjjneb commented 4 years ago

> I use the qiime2 pipeline

Oh you'll need to look for help on the Qiime2 forum then. The file ordering is probably getting mixed up in the plugin code. You can try to fix it yourself by giving your filenames more distinct names, i.e. don't have sample names that are the same up to _X suffixes.
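One way to spot the kind of names described here is sketched below in base R. It flags sample IDs that are identical to another ID up to an underscore suffix. The `sample_ids` vector is hypothetical ("H03" and "H03_1" come from this thread, "H04" is made up), and whether such names actually confuse the plugin is only the suspicion stated above, not something this snippet verifies.

```r
# Hypothetical sample IDs; "H03" and "H03_1" are the names from this thread.
sample_ids <- c("H03", "H03_1", "H04")

# An ID is ambiguous if some other ID starts with it followed by "_".
is_ambiguous <- vapply(sample_ids, function(id) {
  others <- setdiff(sample_ids, id)
  any(startsWith(others, paste0(id, "_")))
}, logical(1))

ambiguous <- sample_ids[is_ambiguous]
print(ambiguous)
```

Any ID reported here is a candidate for renaming before import.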

jjcol commented 4 years ago

Hi,

I am having a similar issue when I run filterAndTrim.

Here is what I am running:

out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(240,160), maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE, compress=TRUE, multithread=TRUE) # On Windows set multithread=FALSE
head(out)

and here is the error message:

Error in filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen = c(240, 160), :
  These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 84260, 100000.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 100000, 84298.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 7353, 100000.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 100000, 7354.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 9355, 100000.
In addition: Warning message:
In mclapply(seq_len(n), do_one, mc.preschedule = mc.preschedule, :
  all scheduled cores encountered errors in user code

Any idea what is going on?

benjjneb commented 4 years ago

@jjcol

The numbers of reads in the forward and reverse sequence files are different. Are the filenames for each in the same order?

head(cbind(fnFs, fnRs))

Did you do any previous filtering on these files that could have caused mismatches between the forward/reverse read files?
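To see whether the paired files really hold different numbers of reads, a quick count per file can help. The sketch below uses base R only and assumes standard FASTQ with exactly four lines per record. The `count_reads` helper and the two tiny temporary files are hypothetical stand-ins for real data.

```r
# Hypothetical helper: count reads in a FASTQ file, assuming 4 lines per record.
count_reads <- function(path) length(readLines(path)) / 4

# Two tiny made-up FASTQ files written to temporary paths for illustration.
fwd_fq <- tempfile(fileext = ".fq")
rev_fq <- tempfile(fileext = ".fq")
writeLines(c("@read1", "ACGT", "+", "IIII",
             "@read2", "TTGA", "+", "IIII"), fwd_fq)
writeLines(c("@read1", "CCGT", "+", "IIII"), rev_fq)

# Compare the forward and reverse read counts for this pair.
counts <- c(forward = count_reads(fwd_fq), reverse = count_reads(rev_fq))
mismatched <- counts[["forward"]] != counts[["reverse"]]

print(counts)
print(mismatched)
```

Running the same count over every forward/reverse pair shows whether the mismatch comes from misordered file vectors (the counts agree once the pairs are lined up correctly) or from files that genuinely differ in read number.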

jjcol commented 4 years ago

They are in a different order, but I did not do any filtering before this.

adrianmu commented 3 years ago

Does anyone have a solution to this problem? I have seen endless queries about mismatched F & R sequences causing DADA2 errors, but no concrete solution. I have run into the same issue a number of times. Can someone please help?

R version 3.5.1 (2018-07-02)
Loading required package: Rcpp
DADA2: 1.10.0 / Rcpp: 1.0.2 / RcppParallel: 4.4.4
1) Filtering
Error in filterAndTrim(unfiltsF, filtsF, unfiltsR, filtsR, truncLen = c(truncLenF, :
  These are the errors (up to 5) encountered in individual cores...
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 24724, 22011.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 1734, 558.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 61919, 60550.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 43411, 42756.
Error in (function (fn, fout, maxN = c(0, 0), truncQ = c(2, 2), truncLen = c(0, :
  Mismatched forward and reverse sequence files: 84320, 83603.
Execution halted
Traceback (most recent call last):
  File "/groups2/muwonge_grp/Environments/Miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 257, in denoise_paired
    run_commands([cmd])
  File "/groups2/muwonge_grp/Environments/Miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_dada2/_denoise.py", line 36, in run_commands
    subprocess.run(cmd, check=True)
  File "/groups2/muwonge_grp/Environments/Miniconda3/envs/qiime2-2019.10/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['run_dada_paired.R', '/tmp/tmp40slwt6j/forward', '/tmp/tmp40slwt6j/reverse', '/tmp/tmp40slwt6j/output.tsv.biom', '/tmp/tmp40slwt6j/track.tsv', '/tmp/tmp40slwt6j/filt_f', '/tmp/tmp40slwt6j/filt_r', '130', '130', '0', '0', '2.0', '2.0', '2', 'consensus', '1.0', '30', '20000']' returned non-zero exit status 1.

jjcol commented 3 years ago

I cannot remember how I fixed this when I ran into it, but if your data is good enough you could try running it as single-end instead. Not ideal, but it's a start.

adrianmu commented 3 years ago

Thanks, @jjcol

MichalOskiera commented 3 years ago

I think your R1 and R2 files do not match (they contain different numbers of reads). Adding the option matchIDs = TRUE should solve it.


benjjneb commented 3 years ago

@adrianmu Are you running dada2 through the denoise-paired command in QIIME2?

Did you do any filtering on the reads prior to denoise-paired? If so, what was it? Some external programs (also depending on parameter choices) will filter the forward and reverse reads independently, resulting in mismatched filtered fastq files.

adrianmu commented 3 years ago

Many thanks


robinjugas commented 1 year ago

I used matchIDs=TRUE and that solved it. In my case, I think the cause was singleton reads left over from previous trimming, alignment, etc.
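To make the singleton idea concrete, here is a conceptual base R sketch of ID matching. It only illustrates what pairing reads by ID means and is not how dada2 implements matchIDs. The read IDs are made up.

```r
# Made-up read IDs; "read2" and "read4" each lost their mate in earlier processing.
ids_fwd <- c("read1", "read2", "read3", "read5")
ids_rev <- c("read1", "read3", "read4", "read5")

# Reads present in both files can be kept as pairs.
shared <- intersect(ids_fwd, ids_rev)

# Singletons appear in only one file and make the read counts disagree.
singletons <- list(forward_only = setdiff(ids_fwd, ids_rev),
                   reverse_only = setdiff(ids_rev, ids_fwd))

print(shared)
print(singletons)
```

With singletons present the two files have equal length here by coincidence, but the reads at the same position no longer belong together, which is why matching on IDs rather than on position matters.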