microbiome / mia

Microbiome analysis
https://microbiome.github.io/mia/
Artistic License 2.0

Use vegan in rarefy #561

Open TuomasBorman opened 1 month ago

TuomasBorman commented 1 month ago

Should we switch to vegan::rrarefy? It is fast because it is implemented in C. One downside is that it does not support sampling with replacement (replace). One more thing to discuss, as I am not an expert on the terminology: would it make more sense to rarefy over several iterations? As I understand it, the criticism of rarefying mostly concerns the bias and data loss caused by rarefying only once. If enough iterations are done, that would solve the issue.

@antagomir
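
To make the `replace` question concrete, here is a minimal sketch of subsampling one sample's counts with and without replacement. It is written in Python purely as an illustration and is not the mia or vegan implementation; the function name `rarefy`, its arguments, and the toy counts are all hypothetical.

```python
import random
from collections import Counter

def rarefy(counts, depth, replace=False, rng=None):
    """Subsample `depth` reads from a dict of taxon -> read count (hypothetical helper)."""
    rng = rng or random.Random()
    # Expand the counts into a pool with one entry per observed read.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if replace:
        # Each draw is independent, so a taxon may exceed its observed count.
        drawn = rng.choices(pool, k=depth)
    else:
        if depth > len(pool):
            raise ValueError("depth exceeds the number of reads in the sample")
        # Each read is used at most once, so counts never exceed the original.
        drawn = rng.sample(pool, depth)
    picked = Counter(drawn)
    return {taxon: picked.get(taxon, 0) for taxon in counts}

sample = {"A": 50, "B": 30, "C": 20}  # toy data, not from the issue
print(rarefy(sample, 40, replace=False, rng=random.Random(1)))
print(rarefy(sample, 40, replace=True, rng=random.Random(1)))
```

Without replacement the requested depth cannot exceed the library size, whereas with replacement any depth is possible, which is one practical difference between the two modes.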

antagomir commented 1 month ago
  1. vegan::rrarefy: missing support for replace is not good (rarefaction is often done with replacement) but perhaps not so critical. Our current implementation does not seem prohibitively slow, and subsampling is a simple operation, so I am not sure the switch is needed. Fast is good, but it would help to know how much faster it is and whether there is a real advantage (in this or other regards).
  2. iterations: Pat Schloss validated this in the context of alpha and beta diversity calculated over several iterations, and those PRs are already open. I am not familiar with using iterations just to obtain rarefied data. Iterating would not solve the issue that samples with more reads carry higher resolution for that information, even after many iterations. I would refrain from implementing this until there is a reference or experiments demonstrating benefits in the general case.
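
The iteration idea in point 2, applied to a diversity estimate rather than to the rarefied table itself, can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the code in the open PRs: the helper names, the choice of observed richness as the metric, and the toy counts are all hypothetical.

```python
import random
from statistics import mean

def subsample(counts, depth, rng):
    """Draw `depth` reads without replacement from a dict of taxon -> count."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    return rng.sample(pool, depth)

def mean_rarefied_richness(counts, depth, iterations=100, seed=0):
    """Average the observed richness over repeated rarefactions (hypothetical helper)."""
    rng = random.Random(seed)
    # Richness of one iteration = number of distinct taxa in the subsample.
    return mean(
        len(set(subsample(counts, depth, rng))) for _ in range(iterations)
    )

sample = {"A": 90, "B": 8, "C": 2}  # toy data, not from the issue
print(mean_rarefied_richness(sample, depth=20, iterations=200))
```

Averaging smooths out the randomness of a single draw for the summary statistic, but every iteration still discards reads down to the same depth, so it does not recover the extra resolution of deeply sequenced samples.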