@slambrechts you probably refer to NovaSeq's simplified quality encoding.
The short answer is: no known adverse effect yet.
Only marginal effects are known. For instance, vsearch may report fastq quality average or median values that do not belong to the reduced set of quality values. vsearch commands such as --fastq_mergepairs recompute quality values, and thus may be more impacted. Nothing has shown up in our tests so far.
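To illustrate the marginal effect mentioned above, here is a minimal Python sketch. It is not vsearch's implementation: the quality string is hypothetical, a phred+33 offset is assumed, and `phred_scores` is a helper written only for this example. It shows that a summary statistic computed over binned quality values can fall outside the set of values actually present in the data.

```python
from statistics import mean, median

def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (phred+33 assumed)."""
    return [ord(symbol) - offset for symbol in quality]

# Hypothetical read that uses only four distinct quality symbols.
quality = "FFFF:F,#"
scores = phred_scores(quality)
observed = set(scores)  # the reduced set of values seen in this read

average = mean(scores)
middle = median(scores)

print("observed values:", sorted(observed))
print("mean:", average, "in observed set:", average in observed)
print("median:", middle, "in observed set:", middle in observed)
```

The mean lands between two bins, which is harmless for reporting but explains why such values can look unfamiliar next to the raw data.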
@frederic-mahe ok great, thank you for the info. If I understand correctly, there is also no need to adjust maxee for fastq filtering?
Earlier this year, I listed the reduced sets of quality values (see issue #474).
These are subsets of the usual quality sets, so I do not expect any particular difficulties for vsearch.
Also no need to adjust maxee for fastq filtering?
When using --fastq_filter, --fastq_mergepairs or --fastx_filter, option --fastq_maxee discards sequences with an expected error greater than the specified value. There is no default value for --fastq_maxee, so there is no adjustment to be made at the code level. Also, the way the expected error is computed (sum of 10^-(Q/10) over all positions of the read) should not be impacted if a reduced set of quality values is used.
I could be wrong though, please feel free to suggest tests or configurations.
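The expected-error computation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not vsearch's code: a phred+33 offset is assumed, and the names `expected_error` and `passes_maxee` are hypothetical. The formula only needs each position's Q value, so it applies unchanged whether the values come from a full or a reduced set.

```python
def expected_error(quality, offset=33):
    """Sum of per-base error probabilities, 10^-(Q/10), over a quality string."""
    return sum(10 ** (-(ord(symbol) - offset) / 10) for symbol in quality)

def passes_maxee(quality, maxee, offset=33):
    """Keep a read unless its expected error is greater than maxee."""
    return expected_error(quality, offset) <= maxee

# Hypothetical quality strings using binned symbols only.
high_quality = "FFFF"  # four positions at Q37
low_quality = "####"   # four positions at Q2

for quality in (high_quality, low_quality):
    print(quality, round(expected_error(quality), 6), passes_maxee(quality, 1.0))
```

With a threshold of 1.0, the first read is kept and the second is discarded, exactly as it would be for any other combination of Q values giving the same sums.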
Basic tests have been added to our test suite (https://github.com/frederic-mahe/vsearch-tests/commit/bd064e7bf942b7f9a83e3801041e58050342e4eb).
Hi,
I know there are consequences of using dada2 on NovaSeq data (e.g. https://github.com/benjjneb/dada2/issues/791), but do you know if there are similar problems when using vsearch on NovaSeq data?
Best, Sam