waldronlab / lefser

R implementation of the LEfSe method
https://waldronlab.io/lefser/

outdated references to random subsets #42

lwaldron closed this issue 3 months ago

lwaldron commented 3 months ago

There are still functions and references to random subsets, but we no longer use the bootstrap. Are these out of date? See e.g.: https://github.com/waldronlab/lefser/blob/642ba43c633a1ceb28130e52a1ae7f5595e8f416/R/lefser.R#L77

On a related note, I still see variation in LDA scores unless I set the random number seed. Is there still some sampling occurring somewhere?

shbrief commented 3 months ago

Oh... contastWithinClassesOrFewPerClass needs to be removed. Thanks for catching it!

shbrief commented 3 months ago

It turns out there is still some random number generation: https://github.com/waldronlab/lefser/blob/642ba43c633a1ceb28130e52a1ae7f5595e8f416/R/lefser.R#L68-L69

The createUniqueValues function ensures that more than half of the values for each feature are unique. If that is not the case, a count value is perturbed by adding a small random value drawn from a normal distribution with mean=0 and sd=5% of the count value. Because that draw depends on the random number generator state, LDA scores can vary between runs unless the seed is set.
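To illustrate why the seed matters, here is a minimal sketch of the behavior described above. It is written in Python for illustration only and is not the lefser R code: the function name jitter_if_few_unique is hypothetical, and it assumes that every value of a feature is perturbed once the uniqueness check fails, which may differ from what createUniqueValues actually does.

```python
import random


def jitter_if_few_unique(values, rng):
    """Hypothetical helper, not the lefser implementation.

    Returns the values unchanged when more than half are unique.
    Otherwise adds Gaussian noise (mean 0, sd = 5% of each value)
    to every value (an assumption made for this sketch).
    """
    if len(set(values)) > len(values) / 2:
        return list(values)
    return [v + rng.gauss(0, 0.05 * abs(v)) for v in values]


# Only 3 unique values out of 6, so the perturbation is applied.
counts = [10, 10, 10, 10, 25, 40]

# Same seed: identical perturbed values, so downstream scores match.
run_a = jitter_if_few_unique(counts, random.Random(42))
run_b = jitter_if_few_unique(counts, random.Random(42))

# Different seed: a different perturbation, hence different scores.
run_c = jitter_if_few_unique(counts, random.Random(7))
```

With the same seed the two runs agree exactly; with a different seed the perturbed counts differ slightly, and so can anything computed from them. In R, the equivalent step would be calling set.seed() before running the analysis.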

shbrief commented 3 months ago

5d0fbdd