Too many ASVs for downstream analysis, ways to filter out data

benjjneb / dada2

Accurate sample inference from amplicon data with single nucleotide resolution

GNU Lesser General Public License v3.0


Yes, it is valid to filter further, as long as the filtering criteria are independent of any subsequent inferential analysis, i.e. they do not use the condition or outcome you plan to test. See this paper for a more rigorous justification: Independent filtering increases detection power for high-throughput experiments

The ideal way to further filter down your feature set (i.e. the number of taxa) depends a bit on the questions you want to ask next. If, for example, differential abundance testing is important, I would start by filtering out taxa that are present in only a few samples or at very low abundance: even if those taxa were associated with the condition of interest, they wouldn't meet any relevant statistical threshold anyway.
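As a rough illustration of that kind of prevalence and abundance filter, here is a minimal Python sketch. It is not dada2 code: the function name `filter_asvs`, the thresholds, and the toy count table are all made up for the example, and the table is assumed to have one row per sample and one column per ASV. Note that the filter only looks at the counts, never at sample metadata, which is what keeps it independent of later testing.

```python
def filter_asvs(counts, asv_ids, min_prevalence=0.1, min_rel_abundance=1e-4):
    """Keep ASVs seen in enough samples and making up enough of all reads.

    counts: list of rows, one per sample; each row holds one count per ASV.
    asv_ids: ASV labels, in the same order as the columns of `counts`.
    Both thresholds are illustrative defaults, not recommendations.
    """
    n_samples = len(counts)
    total_reads = sum(sum(row) for row in counts)
    keep = []
    for j in range(len(asv_ids)):
        column = [row[j] for row in counts]
        # Prevalence: fraction of samples in which the ASV is present at all.
        prevalence = sum(1 for c in column if c > 0) / n_samples
        # Relative abundance: the ASV's share of all reads in the table.
        rel_abundance = sum(column) / total_reads if total_reads else 0.0
        if prevalence >= min_prevalence and rel_abundance >= min_rel_abundance:
            keep.append(j)
    filtered = [[row[j] for j in keep] for row in counts]
    return filtered, [asv_ids[j] for j in keep]


# Toy table (hypothetical data): 4 samples x 4 ASVs.
counts = [
    [120, 0, 1, 30],
    [95, 0, 0, 42],
    [110, 3, 0, 0],
    [130, 0, 0, 25],
]
asv_ids = ["ASV1", "ASV2", "ASV3", "ASV4"]

filtered, kept_ids = filter_asvs(
    counts, asv_ids, min_prevalence=0.5, min_rel_abundance=0.01
)
print(kept_ids)  # ['ASV1', 'ASV4']
```

With these thresholds, ASV2 and ASV3 are dropped because each appears in only one of the four samples. Sensible cutoffs depend on the dataset and on the question being asked, so the numbers above should be treated as placeholders.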


Too many ASVs for downstream analysis, ways to filter out data #1285