benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution
http://benjjneb.github.io/dada2/
GNU Lesser General Public License v3.0

ASV question for sequences of eukaryotes and plants #507

Closed paulroldanolarte closed 6 years ago

paulroldanolarte commented 6 years ago

My name is Paul Roldan Olarte and I am an undergrad student at Universidad El Bosque in Bogota, Colombia. I'm currently analyzing metabarcoding sequences with DADA2. The sequences come from plants (marker trnL) and eukaryotes (marker 18S), and their lengths vary from 50 to 150 bp.

My questions are:

Thank you for your attention.

Paul Roldán Olarte Student of Bioengineering - Universidad El Bosque Bogota, Colombia

benjjneb commented 6 years ago

Is it possible to analyze these kinds of sequences with DADA2 to generate ASV tables?

Yes, outside of taxonomic assignment, nothing in DADA2 is specific to a particular gene.

If it is possible, how does the length variability of the sequences affect error learning? Or should I cut all sequences to the same length?

Assuming that is biological length variation, you're fine. Just make sure that you've removed any primers and adapters, which may require an outside program as the trimming in DADA2 doesn't support removing variable-length primers or primers at the ends of sequences.
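To illustrate why fixed-length trimming is not enough for variable-length amplicons, here is a minimal Python sketch (not part of DADA2, and not a substitute for a dedicated tool such as cutadapt). When the amplicon is shorter than the read, the read runs through the insert into the reverse complement of the other primer, so the primer lands at a different position in every read. The primer sequences, the reads, and the `strip_primers` helper are hypothetical placeholders, and the sketch assumes exact matches, which real data will not always give you.

```python
# Illustrative only: real reads need mismatch-tolerant matching,
# which is what dedicated primer-trimming tools provide.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def strip_primers(read: str, fwd_primer: str, rev_primer: str):
    """Return the insert between the primers.

    Returns None if the read does not start with the forward primer.
    If the reverse complement of the reverse primer is found, everything
    from that point on (primer plus any adapter read-through) is dropped.
    """
    if not read.startswith(fwd_primer):
        return None
    insert = read[len(fwd_primer):]
    pos = insert.find(reverse_complement(rev_primer))
    if pos != -1:
        insert = insert[:pos]
    return insert


# Hypothetical primers, far shorter than real ones.
FWD = "ACGTAC"
REV = "TTGGCA"  # its reverse complement is TGCCAA

# Short amplicon: the read runs through the insert into the
# reverse primer and some adapter sequence.
short_read = FWD + "GGATTC" + "TGCCAA" + "AGATCG"
# Long amplicon: the read ends before reaching the reverse primer.
long_read = FWD + "GGATTCCCGGAA"

short_insert = strip_primers(short_read, FWD, REV)
long_insert = strip_primers(long_read, FWD, REV)
print(short_insert, long_insert)
```

The two inserts come out at different lengths, which is the biological length variation you want to keep. Truncating every read to one length would either cut into the long insert or leave primer sequence on the short one.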