benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

Where to start in DADA2 when your sequences are already quality filtered by the sequencing company? #1576

Closed isuvodeep closed 1 month ago

isuvodeep commented 2 years ago

Hi, I was looking at some sequences like this one: https://www.ebi.ac.uk/ena/browser/view/SRR13756225?show=reads

There is a single sequencing file from an Illumina MiSeq run, and the published paper mentions that "The raw sequence data were obtained from the full-length nifH gene sequence after quality control."

The sequence files are like that @SRR13756225.1 1/1 GGTCGTCTGCGCGAGGCCATGGCAGGAGATGATGCAGGGCAGGCCCGCCTTGATAAGATCACTCAGCTGATTGCCGACAGCATGGGCACCGAAGTGTGCTCCATCTATCTGTTTCGCGACGAAGAAACACTGGAACTCTGCGCCACTGAAGGTCTGAACCGCGAATCCGTTCACCAGACCCGTATGCGTGTGGGCGAGGGGCTGGTGGGGCGCGTGGCGCGCACCGGCAAGGTCATCAACACCCCCGACGCCCCCAGCGCGCGTGGCTTTCGCTATATGCCAGAGACCGGAGAGGAGCGGTTTTCCCCCTTCCTTGGTATCCCGGTC + GGFGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

The last part (the quality line) is similar in all the following reads of this sample.

So, I am a bit confused.

So now, if I want to analyze those sequences, where do I have to start in DADA2?

Thank you in advance.

benjjneb commented 2 years ago

Start at the same place. Presumably you won't lose many reads at the filtering step.
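
For readers wondering why few reads would be lost, here is a minimal sketch of the idea, not DADA2's own code. It assumes the quality line uses the standard Phred+33 encoding and that filtering is done on expected errors (the sum of per-base error probabilities, as with the `maxEE` argument of `filterAndTrim`). The function names and the threshold of 2 are illustrative assumptions.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality]


def expected_errors(quality, offset=33):
    """Sum of per-base error probabilities, where p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores(quality, offset))


def passes_max_ee(quality, max_ee=2.0):
    """True if the read would survive an expected-error filter (threshold assumed)."""
    return expected_errors(quality) <= max_ee


# The characters seen in the example read decode to high scores.
print(phred_scores("GFE"))  # [38, 37, 36]

# A hypothetical 400-base read made entirely of 'G' qualities, for illustration only.
high_quality = "G" * 400
print(passes_max_ee(high_quality))  # True

# A hypothetical low-quality read ('#' is Q2) would be rejected.
low_quality = "#" * 10
print(passes_max_ee(low_quality))  # False
```

A read whose quality line is almost all `G` has an expected error count far below typical thresholds, so pre-filtered data like this should pass the filtering step nearly untouched.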