Closed luigallucci closed 3 months ago
Moreover, I read the latest issues about the pooling method and the chimaera removal method...if I'm using dada2 in R (not Qiime), using pooling for my dataset, should I select a different method for chimaera removal, than consensus?
IF using dada(..., pool=TRUE) THEN use removeBimeraDenovo(..., method="pooled").
IF using dada(..., pool="pseudo") THEN use removeBimeraDenovo(..., method="consensus") (the default).
IF using dada(..., pool=FALSE) (the default) THEN use removeBimeraDenovo(..., method="consensus") (the default).
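The rule above can be written down as a small lookup. This is only a sketch: `chimera_method_for_pool` is a hypothetical helper name, not part of dada2, and the commented-out calls assume `derep` and `err` objects already exist from earlier steps of a single-end workflow.

```r
# Hypothetical helper (not a dada2 function): map the `pool` argument passed
# to dada() onto the `method` argument suggested for removeBimeraDenovo().
chimera_method_for_pool <- function(pool) {
  if (isTRUE(pool)) {
    "pooled"
  } else if (isFALSE(pool) || identical(pool, "pseudo")) {
    "consensus"
  } else {
    stop("pool must be TRUE, FALSE, or \"pseudo\"")
  }
}

pool_setting <- TRUE
chim_method <- chimera_method_for_pool(pool_setting)

# Assumed usage with dada2 loaded; `derep` and `err` are placeholders for
# objects created earlier in the workflow (not shown here):
# dd            <- dada(derep, err = err, pool = pool_setting)
# seqtab        <- makeSequenceTable(dd)
# seqtab.nochim <- removeBimeraDenovo(seqtab, method = chim_method)
```

Keeping the pooling choice in one variable and deriving the chimera method from it avoids the two settings drifting apart when the script is edited later.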
Is it possible to use pool=TRUE in the big data pipeline reliably?
It isn't, because the big data workflow processes each sample independently. To use pool=TRUE, all samples have to be loaded into memory at once.
this was my trial with seqtab obtained through normal pipeline and seqtabA with BigData pipe.
These differences are small and are likely related to low-abundance/low-confidence ASVs. That said, I'm not sure exactly why you would get a different result between the two. I would err on the side of the regular tutorial workflow, as that has been updated more recently.
Hi @benjjneb, thank you for the reply!
I was using the BigData one because it is faster than the regular one. As you can see, the differences are not so big. Anyway, I will stick to the regular one, thank you :)
Hi @benjjneb,
I wondered if the two approaches used in the big data and classic tutorials should lead to differences in data pooling outcomes. Is it possible to use pool=TRUE in the big data pipeline reliably? Could this lead to a less rich ASV output than the normal approach?
Moreover, I read the latest issues about the pooling method and the chimaera removal method. If I'm using dada2 in R (not Qiime) with pooling for my dataset, should I select a chimaera removal method other than consensus?