Closed marwa38 closed 2 years ago
Controlling for library size effects is critical for a number of types of microbiome analysis. There's a pretty large literature on this. Rarefaction (subsampling to a fixed library size across samples) is common. There are other approaches specific to e.g. differential abundance analysis that can be more powerful than rarefying. For some beta-diversity methods, just making sure you convert to proportions can be sufficient.
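To make the rarefaction idea concrete, here is a minimal base-R sketch of subsampling one sample's reads to a fixed depth without replacement. The helper `rarefy_sample` and the toy count vectors are hypothetical names made up for illustration; with a phyloseq object you would more likely reach for `rarefy_even_depth()` rather than hand-rolling this.

```r
# Sketch only: rarefy one sample's counts to a fixed depth by
# subsampling individual reads without replacement.
# `rarefy_sample` is a hypothetical helper, not part of dada2 or phyloseq.
rarefy_sample <- function(x, depth) {
  stopifnot(sum(x) >= depth)
  reads <- rep(seq_along(x), times = x)               # one entry per read, labelled by taxon index
  picked <- reads[sample.int(length(reads), depth)]   # draw `depth` reads without replacement
  tabulate(picked, nbins = length(x))                 # back to per-taxon counts
}

set.seed(42)
sample_a <- c(500, 300, 150, 50)  # toy deep library (1000 reads)
sample_b <- c(60, 25, 10, 5)      # toy shallow library (100 reads)

# Rarefy both samples to the depth of the shallowest one
depth <- min(sum(sample_a), sum(sample_b))
rarefied_a <- rarefy_sample(sample_a, depth)
rarefied_b <- rarefy_sample(sample_b, depth)
```

After this step both toy samples have the same total, so differences in the number of observed features are no longer driven by library size alone.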
At what threshold do you think those samples should be removed?
Plot a histogram of your sampling depths. Commonly, most of the samples will have a comparable number of reads, while a few will have far fewer reads (libraries that didn't form well). Choose your threshold accordingly. Absolute numbers are less important, as long as you aren't getting very low numbers (e.g. < 1000).
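As a rough illustration of that workflow, the sketch below builds a toy count table, plots the per-sample depths, and drops samples below a cutoff. The simulated counts and the `min_depth` value of 1000 are assumptions for the example only; pick the cutoff from your own histogram. With a phyloseq object `ps`, `sample_sums(ps)` gives the per-sample depths and `prune_samples()` does the filtering.

```r
# Hypothetical toy count table: rows = ASVs, columns = samples.
set.seed(1)
counts <- cbind(
  matrix(rpois(5 * 8, lambda = 4000), nrow = 5),  # 8 typical libraries
  matrix(rpois(5 * 2, lambda = 50), nrow = 5)     # 2 libraries that formed poorly
)
colnames(counts) <- paste0("S", seq_len(ncol(counts)))

# Per-sample sequencing depth (phyloseq: depths <- sample_sums(ps))
depths <- colSums(counts)
hist(depths, breaks = 20, main = "Sampling depth per sample", xlab = "Reads")

# Assumed cutoff for this toy data; choose yours from the histogram
min_depth <- 1000
keep <- depths >= min_depth
counts_filtered <- counts[, keep, drop = FALSE]

# phyloseq equivalent:
# ps <- prune_samples(sample_sums(ps) >= min_depth, ps)
```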
Many thanks @benjjneb
For some beta-diversity methods, just making sure you convert to proportions can be sufficient.
By proportions, do you mean relative abundance data?
ps.ra <- transform_sample_counts(ps, function(ASV) ASV/sum(ASV))
The function you describe is creating proportions, which are one specific scaling of relative abundance data.
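A small base-R illustration of that distinction, using made-up counts for a single sample: proportions sum to 1, whereas percentages are another scaling of the same relative abundances and sum to 100.

```r
# Hypothetical counts for one sample across four ASVs
asv_counts <- c(ASV1 = 50, ASV2 = 30, ASV3 = 15, ASV4 = 5)

# Proportions: the same operation as the function passed to transform_sample_counts
proportions <- asv_counts / sum(asv_counts)

# Percentages: a different scaling of the same relative abundance information
percentages <- 100 * proportions

sum(proportions)  # 1
sum(percentages)  # 100
```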
Hello
Could you please advise whether it is good practice, after running dada2 and before downstream analysis, to subset samples that have a higher number of reads than the other samples? It seems that more reads mean comparatively more features.
At what threshold, in terms of a sample's number of reads relative to the other samples, do you think those samples should be removed?
Cheers Marwa