margePairs problem - Githubissues

MatS792 commented 4 years ago

Hi, I'm trying to use dada2 to analyze 16S sequences, but I have a problem with the mergePairs function: it merges 0 paired reads. I tried different truncation lengths, and different maxEE and truncQ values, to balance overlap and read quality, respectively. Nevertheless, the merging was always 0 for most of the samples. This is the pipeline I used, with some output:

library(dada2)
packageVersion("dada2") → 1.14.1

path1 <- "/home/issue_github"
list.files(path1) # Verify the file list

fnFs1 <- sort(list.files(path1, pattern="_R1_001.fastq", full.names = TRUE))
fnRs1 <- sort(list.files(path1, pattern="_R2_001.fastq", full.names = TRUE))

sample.names1 <- sapply(strsplit(basename(fnFs1), "_"), `[`, 1)

filtFs1 <- file.path(path1, "filtered", paste0(sample.names1, "_F_filt.fastq.gz"))
filtRs1 <- file.path(path1, "filtered", paste0(sample.names1, "_R_filt.fastq.gz"))

names(filtFs1) <- sample.names1
names(filtRs1) <- sample.names1

Forward sequences

plotQualityProfile(fnFs1[3:7])
[Figure: plotQF]

Reverse sequences

plotQualityProfile(fnRs1[3:7])
[Figure: plotQR]

out1 <- filterAndTrim(fnFs1, filtFs1, fnRs1, filtRs1, truncLen=c(221,171), maxN=0, maxEE=c(5,5), truncQ=9, rm.phix=TRUE, compress=TRUE, multithread=TRUE)
out1

	reads.in	reads.out
1_S4_L001_R1_001.fastq.gz	30660	30624
10_S17_L001_R1_001.fastq.gz	21668	21634
11_S29_L001_R1_001.fastq.gz	40759	40726
12_S41_L001_R1_001.fastq.gz	20546	20522
13_S53_L001_R1_001.fastq.gz	35109	35056
14_S65_L001_R1_001.fastq.gz	37469	37430
15_S77_L001_R1_001.fastq.gz	45975	45935
16_S89_L001_R1_001.fastq.gz	28013	27978
17_S6_L001_R1_001.fastq.gz	19992	19963
18_S18_L001_R1_001.fastq.gz	29256	29222
19_S30_L001_R1_001.fastq.gz	33458	33412
2_S16_L001_R1_001.fastq.gz	28956	28935
20_S42_L001_R1_001.fastq.gz	39858	39813
21_S54_L001_R1_001.fastq.gz	31244	31196
3_S28_L001_R1_001.fastq.gz	41611	41574
4_S40_L001_R1_001.fastq.gz	13891	13871
5_S52_L001_R1_001.fastq.gz	42859	42806
6_S64_L001_R1_001.fastq.gz	38051	38021
7_S76_L001_R1_001.fastq.gz	41910	41872
8_S88_L001_R1_001.fastq.gz	27451	27414
9_S5_L001_R1_001.fastq.gz	21935	21906

errF1 <- learnErrors(filtFs1, multithread=TRUE, randomize=TRUE)
plotErrors(errF1, nominalQ=TRUE)
[Figure: errF]

errR1 <- learnErrors(filtRs1, multithread=TRUE, randomize=TRUE)
plotErrors(errR1, nominalQ=TRUE)
[Figure: errR]

derepFs1 <- derepFastq(filtFs1, verbose=TRUE)
derepRs1 <- derepFastq(filtRs1, verbose=TRUE)

names(derepFs1) <- sample.names1
names(derepRs1) <- sample.names1

dadaFs1 <- dada(derepFs1, err=errF1, pool="pseudo", multithread=TRUE)
dadaRs1 <- dada(derepRs1, err=errR1, pool="pseudo", multithread=TRUE)

mergers1 <- mergePairs(dadaFs1, derepFs1, dadaRs1, derepRs1, verbose=TRUE)

128 paired-reads (in 5 unique pairings) successfully merged out of 27732 (in 9091 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 19166 (in 7612 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 37072 (in 13859 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 18236 (in 6702 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 31660 (in 11839 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 34434 (in 10305 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 41381 (in 13940 pairings) input.
16 paired-reads (in 1 unique pairings) successfully merged out of 25080 (in 7844 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 17700 (in 7060 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 26761 (in 9365 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 30259 (in 9612 pairings) input.
6 paired-reads (in 1 unique pairings) successfully merged out of 26275 (in 8955 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 36094 (in 12412 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 28188 (in 10335 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 37389 (in 14522 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 11890 (in 5013 pairings) input.
24 paired-reads (in 1 unique pairings) successfully merged out of 38616 (in 13027 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 34671 (in 12492 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 38302 (in 13731 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 24090 (in 7063 pairings) input.
0 paired-reads (in 0 unique pairings) successfully merged out of 19264 (in 7941 pairings) input.

Thanks for your help.

benjjneb commented 4 years ago

What is your library setup? I.e. what primers are you using, are they included on the reads, and what is the length of the sequenced amplicon?
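The amplicon length matters because mergePairs() can only join a pair when the truncated forward and reverse reads still overlap; by default it requires at least 12 overlapping bases (minOverlap = 12) with no mismatches. A rough way to sanity-check a truncLen choice is sketched below. The amplicon lengths are hypothetical examples, not values from this thread, and the helper name expected_overlap is made up for illustration.

```r
# Rough estimate of how many bases the forward and reverse reads share
# after truncation. Helper name and amplicon lengths are hypothetical.
expected_overlap <- function(trunc_f, trunc_r, amplicon_len) {
  trunc_f + trunc_r - amplicon_len
}

trunc_len   <- c(221, 171)  # truncLen from the pipeline above
min_overlap <- 12           # default minOverlap of mergePairs()

# Assumed example amplicon lengths; substitute the length of your own amplicon
amplicons <- c(short = 253, long = 420)

ov <- expected_overlap(trunc_len[1], trunc_len[2], amplicons)
data.frame(amplicon_len = amplicons,
           overlap      = ov,
           can_merge    = ov >= min_overlap)
```

With these assumed lengths, the short amplicon leaves 139 overlapping bases and the long one leaves -28, i.e. the reads no longer meet. If the primers are still on the reads, they count toward amplicon_len.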

MatS792 commented 4 years ago

Following your suggestions, I checked the sequences and found that I was working on sequences that had already been filtered and trimmed. With the right sequences I have no problems. Thank you for your help, I will be more careful next time.
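One quick way to spot input that was already trimmed is to look at the read-length distribution of the files passed to filterAndTrim(). Raw reads from a single Illumina run typically all share one length, so a spread of shorter lengths is a hint that the files were processed before. The sketch below uses base R only; the helper names read_fastq_seqs and summarise_lengths are made up for illustration, and it assumes the usual four-lines-per-record FASTQ layout.

```r
# Return the sequence lines of the first `n` records of a FASTQ file.
# Assumes 4 lines per record (no wrapped sequences). Hypothetical helper.
read_fastq_seqs <- function(path, n = 5000) {
  con <- gzfile(path, open = "rt")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  lines <- readLines(con, n = 4 * n)
  if (length(lines) < 2) return(character(0))
  lines[seq(2, length(lines), by = 4)]
}

# Summarise read lengths for one file. Hypothetical helper.
summarise_lengths <- function(path, n = 5000) {
  len <- nchar(read_fastq_seqs(path, n))
  c(reads = length(len), min = min(len), median = median(len), max = max(len))
}

# Tiny demo file so the sketch runs on its own
demo <- tempfile(fileext = ".fastq")
writeLines(c("@r1", "ACGTACGT", "+", "IIIIIIII",
             "@r2", "ACGTAC",   "+", "IIIIII"), demo)
summarise_lengths(demo)

# On real data, something like:
# t(sapply(c(fnFs1[1], fnRs1[1]), summarise_lengths))
```

If min, median and max differ, or are all well below the run's nominal read length, the files are probably not the raw reads.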

EJS01 commented 2 years ago

> What is your library setup? I.e. what primers are you using, are they included on the reads, and what is the length of the sequenced amplicon?

Good day, I am new to bioinformatics... how can I check these parameters?
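The primer sequences and the expected amplicon length normally come from the lab protocol or the sequencing provider. Whether the primers are still on the reads can be checked directly, for instance by counting how many reads begin with the forward primer. The sketch below is one possible approach in base R, not an official dada2 procedure. The helper names are made up, and the primer shown (515F, often used for the 16S V4 region) and the three reads are only assumed examples.

```r
# Translate IUPAC degenerate bases into a regular expression.
# Hypothetical helper, not part of dada2.
iupac_to_regex <- function(primer) {
  codes <- c(A = "A", C = "C", G = "G", T = "T",
             R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
             K = "[GT]", M = "[AC]", B = "[CGT]", D = "[AGT]",
             H = "[ACT]", V = "[ACG]", N = "[ACGT]")
  paste(codes[strsplit(toupper(primer), "")[[1]]], collapse = "")
}

# Fraction of reads that begin with the primer. Hypothetical helper.
frac_with_primer <- function(seqs, primer) {
  mean(grepl(paste0("^", iupac_to_regex(primer)), seqs))
}

# Assumed example primer; replace with the primer from your own protocol
fwd_primer <- "GTGYCAGCMGCCGCGGTAA"

# Made-up reads for illustration
reads <- c("GTGCCAGCAGCCGCGGTAATACGTAGG",  # starts with the primer
           "GTGTCAGCCGCCGCGGTAATACGGAGG",  # starts with the primer (Y = T, M = C)
           "TACGTAGGGTGCGAGCGTTAATCGGAA")  # primer not present

frac_with_primer(reads, fwd_primer)
```

On real data, seqs could be the sequences of a raw R1 file, for example as.character(ShortRead::sread(ShortRead::readFastq(fnFs1[1]))). A fraction near 1 suggests the primers are still on the reads, and a fraction near 0 suggests they were removed or that a different primer was used.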

benjjneb / dada2

margePairs problem #1000
