PacBio 16S read lengths and counts query

benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution

GNU Lesser General Public License v3.0


Hello @benjjneb,

I was just wondering if you would mind advising on the attached histogram created using the script from your DADA2 + PacBio: Fecal Samples tutorial.
[Figure: Read length distribution of r210414_Cell5_Data]

We have sent 40 samples (vaginal and infant stool samples, as well as their controls) to a service provider for 16S rRNA sequencing on the PacBio Sequel II instrument. The instrument is relatively new and we have doubts about their sequencing capability and the quality of the data that is being returned to us. The data used to produce the histogram is only from the 14 samples they’ve managed to return to us.

We have never worked with PacBio 16S data before and we are concerned by the number of reads below 500 bp. Is this normal, or are these primer/adapter dimers that should have been cleaned up prior to sequencing? Secondly, how many reads do you think we should expect from these sample types? We had hoped for roughly 5000 reads, but I’m not entirely sure if this is a realistic expectation.

Any assistance would be greatly appreciated!

Lauren

PacBio 16S read lengths and counts query #1395