I have a pool of noisy sequences containing two variants that differ only by indels at their ends (see figure).
With default parameters, dada retrieves only one variant and misses the other. I think this is related to the ends-free concept in the alignments. After experimenting with the alignment parameters, I found that setting "BAND_SIZE" = 0 seems to trigger nwalign_gapless in the C++ code, and both variants are then retrieved, along with other low-frequency variants that are false positives despite very low p-values (birth_pval).
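To make the ends-free point concrete, here is a minimal Python sketch of what I think is happening. It is not dada2's C++ code: the function name `nw_score`, the toy sequences, and the match/mismatch/gap scores are illustrative assumptions. It computes a Needleman-Wunsch score either globally (end gaps penalized) or ends-free (end gaps cost nothing).

```python
def nw_score(a, b, match=5, mismatch=-4, gap=-8, ends_free=False):
    """Needleman-Wunsch alignment score (illustrative scores, linear gap cost).

    ends_free=False: classic global alignment, end gaps are penalized.
    ends_free=True : leading and trailing gaps are free.
    """
    n, m = len(a), len(b)
    # dp[i][j] = best score aligning a[:i] with b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    if not ends_free:
        # Global mode: leading gaps are charged.
        for i in range(1, n + 1):
            dp[i][0] = i * gap
        for j in range(1, m + 1):
            dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub,  # match or mismatch
                dp[i - 1][j] + gap,      # gap in b
                dp[i][j - 1] + gap,      # gap in a
            )
    if not ends_free:
        return dp[n][m]
    # Ends-free mode: trailing gaps are free, so take the best cell
    # on the last row or the last column.
    last_row = max(dp[n])
    last_col = max(dp[i][m] for i in range(n + 1))
    return max(last_row, last_col)


# Toy variants: `short` lacks the last two bases of `full` (a terminal indel).
full = "ACGTACGTAC"
short = "ACGTACGT"

print("global    :", nw_score(full, short))
print("ends-free :", nw_score(full, short, ends_free=True))
print("self-score:", nw_score(short, short, ends_free=True))
```

In this sketch the global score drops because of the two terminal gaps, while the ends-free score equals the self-alignment score of the shorter sequence, so the terminal indel is invisible to the ends-free alignment.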
For comparison, I ran a Needleman-Wunsch global alignment of these two sequences in R, which detects the end gaps. I also ran AmpliCI, and it finds both variants as well.
What exactly does setting "BAND_SIZE" = 0 do? How can I tune the dada parameters to account for this variation at the ends of the sequences without increasing false negatives and false positives for other variants?
Thanks,
Miguel