vmikk / metagMisc

Miscellaneous functions for metagenomic analysis.
MIT License

information about phyloseq_mult_raref #7

Closed shashankx closed 5 years ago

shashankx commented 5 years ago

I was planning to run multiple rarefactions, but I can't find an explanation of why the rarefaction depth is set to 0.9 * the minimal observed sample size.

What is the significance behind this?

vmikk commented 5 years ago

Hello! This is just an arbitrary threshold.

For the sample with the minimal observed number of reads, rarefaction to that same depth does not make much sense: you will get the same data back on every iteration (unless you are resampling with replacement), so the variability for that sample will be zero, which is counterintuitive. That's why I decided to perform rarefaction to a slightly smaller number of reads.
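To make the zero-variability point concrete, here is a minimal base R sketch. It does not use phyloseq or metagMisc; the OTU counts and the helper `rarefy_once` are hypothetical and exist only for illustration.

```r
# Hypothetical OTU counts for the shallowest sample (not from this thread)
counts <- c(otu1 = 50, otu2 = 30, otu3 = 20)
reads  <- rep(names(counts), times = counts)   # one element per read
otus   <- names(counts)

# Hypothetical helper: draw `depth` reads and count them per OTU
rarefy_once <- function(reads, depth, replace = FALSE) {
  picked <- sample(reads, size = depth, replace = replace)
  tabulate(match(picked, otus), nbins = length(otus))
}

set.seed(1)
n_reads <- length(reads)

# Depth equal to the sample size: every draw is a permutation of all reads
full <- replicate(20, rarefy_once(reads, depth = n_reads))

# Depth of 0.9 * sample size: each draw drops a different 10% of reads
part <- replicate(20, rarefy_once(reads, depth = floor(0.9 * n_reads)))

apply(full, 1, var)   # per-OTU variance across iterations is zero
apply(part, 1, var)   # per-OTU variance across iterations
```

Without replacement, drawing all reads only reorders them, so the per-OTU counts cannot change between iterations. At the smaller depth the counts can differ from draw to draw.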

However, I advise you to specify the desired sample size manually. Before doing so, check the distribution of the sequencing depths (e.g., with plot(sort(sample_sums(physeq)))). If your plot looks like this:

[image: Rplot]

it may be better to discard the smallest sample and use a round value (e.g., 500 in that case).
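A rough sketch of that workflow, using a plain named vector of hypothetical read counts in place of sample_sums(physeq) so that it runs without phyloseq. The phyloseq and metagMisc calls appear only in comments, and the argument names `SampSize` and `iter` are assumptions that should be checked against ?phyloseq_mult_raref.

```r
# Hypothetical per-sample read counts; with phyloseq this would be
#   depths <- sample_sums(physeq)
depths <- c(S1 = 120, S2 = 530, S3 = 610, S4 = 700, S5 = 850)

# Inspect the sorted depths (or plot them with plot(sort(depths)))
sort(depths)

# Choose a round target depth by eye; 500 is only an example value
target <- 500

# Samples shallower than the target would be discarded
keep    <- names(depths)[depths >= target]
dropped <- setdiff(names(depths), keep)
keep
dropped

# With real data (assumed usage, not run here):
#   physeq <- prune_samples(sample_sums(physeq) >= target, physeq)
#   res    <- phyloseq_mult_raref(physeq, SampSize = target, iter = 100)
```

Here only S1 falls below the target, so it is the one sample that would be dropped before rarefying the rest to 500 reads.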

HTH, Vladimir

vmikk commented 5 years ago

I hope the issue has been resolved and I can close it now.