Hello,
I am trying to analyze paired 16S rRNA reads from prokaryotes, which I extracted from metagenome data (sequenced on an Illumina NextSeq) using SortMeRNA. The metagenome reads were trimmed (minimum read length 50 bp, Phred 20) before the extraction. This is also visible in plotQualityProfile().
This is the code I was using:

```r
# Sort samples
fastqs <- fns[grepl(".fastqsanger$", fns)]
fastqs <- sort(fastqs)
fnFs <- fastqs[grepl("_forward.fastqsanger", fastqs)]
fnRs <- fastqs[grepl("_reverse.fastqsanger", fastqs)]
sample.names <- sapply(strsplit(fnFs, "_"), `[`, 1)

# Specify the full path to the fnFs and fnRs
fnFs <- file.path(path, fnFs)
fnRs <- file.path(path, fnRs)

# Quality plot
plotQualityProfile(RH2020F[1:4])

# Make directory and filenames for the filtered fastqs
filt_path <- file.path(path, "filtered")
if(!file_test("-d", filt_path)) dir.create(filt_path)
filtFs <- file.path(filt_path, paste0(sample.names, "_F_filt.fastq.gz"))
filtRs <- file.path(filt_path, paste0(sample.names, "_R_filt.fastq.gz"))

# Filter and dereplicate
for(i in seq_along(fnFs)) {
  fastqPairedFilter(c(fnFs[i], fnRs[i]), c(filtFs[i], filtRs[i]),
                    maxN=0, maxEE=c(2,2), rm.phix=TRUE,
                    compress=TRUE, verbose=TRUE)
}
out <- filterAndTrim(RH2020F, filtFs, RH2020R, filtRs, rm.phix=TRUE,
                     maxEE=c(2,2), minLen=50, compress=TRUE, multithread=TRUE)
head(out)

derepFs <- derepFastq(filtFs, qualityType="FastqQuality", verbose=TRUE)
derepRs <- derepFastq(filtRs, qualityType="FastqQuality", verbose=TRUE)
names(derepFs) <- sample.names
names(derepRs) <- sample.names

# Learn error rates (I tried each of these variants)
errF <- learnErrors(derepFs, randomize=TRUE, multithread=TRUE, MAX_CONSIST=20, nbases=1e12)
# or
errR <- learnErrors(filtRs, multithread=TRUE)
# or
dadaFs.lrn <- dada(derepFs, err=NULL, selfConsist=TRUE, multithread=TRUE)
```
I can run the pipeline up to the learnErrors() step, where I get the following message:

`Error in getErrors(err, enforce = TRUE) : Error matrix is NULL`

The error persists even after changing parameters. Could this be caused by a low error rate of the reads (since they were quality-trimmed before the extraction)? Do you have an idea how I can solve this?
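For completeness, this is how I would expect the error-learning step to behave (a minimal sketch based on the dada2 tutorial; `filtFs` and `derepFs` as defined in my code above, and `getErrors()`/`plotErrors()` as documented in dada2):

```r
library(dada2)

# Learn errors from the filtered forward reads; learnErrors() also
# accepts the character vector of filtered fastq paths directly
errF <- learnErrors(filtFs, multithread = TRUE)

# Inspect the result before passing it on: if learning succeeded,
# getErrors() should return a numeric matrix, not NULL
err.mat <- getErrors(errF)
str(err.mat)
plotErrors(errF, nominalQ = TRUE)

# Only then denoise with the learned error model
dadaFs <- dada(derepFs, err = errF, multithread = TRUE)
```

In my case the matrix appears to be NULL at that point, which is where the pipeline stops.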
Thanks and kind regards!