Closed: claraqin closed this issue 4 years ago
Following @mykophile's advice, I was able to increase the proportion of reads remaining from run B69PP to over 50% (median). This required that I also relax the minLen argument such that minLen=20. Is this too lenient?
Hi Clara - I think minLen = 20 is too low. Some yeasts have very short ITS sequences, but in my experience 20 bp is not enough to be useful in downstream steps like OTU calling or taxonomic assignment. I would be more comfortable with minLen = 50 or 100. If most of the seqs are only 50 or 100 bp, though, that is going to be problematic, and I would rather keep a lower fraction of higher-quality sequences. I am playing around with some other filter and trim approaches on BMI Plate 3 and will let you know how that goes.
On Mar 28, 2020, at 8:02 PM, Clara Qin wrote:
Following @mykophile's advice, I was able to increase the proportion of reads remaining from run B69PP to over 50% (median). This required that I also relax the minLen argument such that minLen=20. Is this too lenient?
[filterAndTrim_params_test] https://user-images.githubusercontent.com/12421420/77839117-e30dfb00-712e-11ea-8057-ac2cbb2e42eb.png
Kabir Peay Associate Professor Dept. of Biology Stanford University (650) 723-0552
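One way to check the concern above before settling on a minLen value is to look at what fraction of reads would survive each candidate threshold. The following is a minimal base-R sketch, not code from this repository: the function name `fraction_passing_minlen` and the toy read lengths are hypothetical and only illustrate the calculation.

```r
# Fraction of reads whose length is at least each candidate minLen threshold.
# `read_lengths` is a numeric vector of read lengths in bp.
fraction_passing_minlen <- function(read_lengths, thresholds = c(20, 50, 100)) {
  fractions <- vapply(
    thresholds,
    function(t) mean(read_lengths >= t),
    numeric(1)
  )
  # Label each fraction with the threshold it belongs to.
  setNames(fractions, paste0("minLen_", thresholds))
}

# Toy read lengths (made up for illustration, not from any NEON run).
toy_lengths <- c(18, 25, 60, 120, 240)
fraction_passing_minlen(toy_lengths)
```

If most reads only pass at the lowest thresholds, that would point toward keeping a smaller fraction of longer, higher-quality reads rather than lowering minLen.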
Closing this issue because we found out during the sensitivity analysis that we can increase the maxEE parameter(s) without negative consequences to read merging, taxonomy assignment, etc. https://people.ucsc.edu/~claraqin/test_dada2_params_plots.html
maxEE can also be varied across sequencing runs.
maxEE = 8 is a reasonable parameter to use for most NEON sequencing runs.
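As a rough sketch of how this could look in practice, the snippet below passes a per-run maxEE to `dada2::filterAndTrim` and reports the median percentage of reads retained. It assumes the dada2 package is installed. The helper names (`maxee_for_run`, `percent_retained`, `filter_run`), the override table, and its example value are hypothetical. Only maxEE = 8 and the other parameter values quoted in this issue come from the thread.

```r
# Hypothetical per-run overrides; runs not listed fall back to maxEE = 8.
# The override value below is purely illustrative, not a recommendation.
maxee_overrides <- list(EXAMPLE_RUN = c(4, 4))

maxee_for_run <- function(run_id, default = 8) {
  override <- maxee_overrides[[run_id]]
  if (is.null(override)) default else override
}

# Median percentage of reads retained per sample, computed from the matrix
# returned by dada2::filterAndTrim (columns "reads.in" and "reads.out").
percent_retained <- function(out) {
  keep <- out[, "reads.in"] > 0  # skip samples with no input reads
  median(100 * out[keep, "reads.out"] / out[keep, "reads.in"])
}

# Filter one sequencing run; the four arguments after run_id are vectors of
# input and output fastq paths supplied by the caller.
filter_run <- function(run_id, fwd, filt_fwd, rev, filt_rev) {
  out <- dada2::filterAndTrim(
    fwd, filt_fwd, rev, filt_rev,
    maxN = 0, maxEE = maxee_for_run(run_id), truncQ = 2, minLen = 50
  )
  percent_retained(out)
}
```

Calling `maxee_for_run("B69PP")` returns the fallback of 8 here because that run has no entry in the hypothetical override table. A single maxEE value is applied to both forward and reverse reads.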
The filterAndTrim step of the DADA2 processing pipeline removes a majority of ITS reads from most samples in most sequencing runs. For example, after filterAndTrim with the default parameters (maxN = 0, maxEE = c(2, 2), truncQ = 2, minLen = 50), the median percentage of reads remaining in a sample from sequencing run B69PP is only 16%.
@mykophile commented: