lwaldron closed this 3 months ago
Oh... contastWithinClassesOrFewPerClass needs to be removed. Thanks for catching it!
It turns out there is still a random number generation step: https://github.com/waldronlab/lefser/blob/642ba43c633a1ceb28130e52a1ae7f5595e8f416/R/lefser.R#L68-L69
The createUniqueValues function ensures that more than half of the values for each feature are unique. If that is not the case, a count value is altered by adding a small amount of noise to it, drawn from a normal distribution with mean = 0 and sd = 5% of the count value.
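To make the effect on reproducibility concrete, here is a minimal sketch of that kind of jitter. It is not the actual createUniqueValues code: the helper name jitter_if_few_unique is hypothetical, and perturbing every value of the feature and applying abs() to keep counts non-negative are assumptions made for illustration.

```r
# Hypothetical helper, NOT lefser's createUniqueValues(): add noise to a
# feature's counts when no more than half of its values are unique.
jitter_if_few_unique <- function(x) {
  if (length(unique(x)) > length(x) / 2) {
    return(x)  # more than half of the values are unique; leave untouched
  }
  # Noise is drawn from Normal(mean = 0, sd = 5% of each count).
  # abs() is an assumption here, used only to keep counts non-negative.
  abs(x + rnorm(length(x), mean = 0, sd = 0.05 * x))
}

counts <- c(10, 10, 10, 10, 20, 20)  # only 2 unique values out of 6

set.seed(1)
a <- jitter_if_few_unique(counts)
set.seed(1)
b <- jitter_if_few_unique(counts)
c2 <- jitter_if_few_unique(counts)   # RNG state not reset before this call

identical(a, b)   # same seed before each call
identical(a, c2)  # different RNG state
```

Because the noise comes from rnorm(), any downstream statistic computed from the jittered counts can differ between runs unless set.seed() is called first, which would be consistent with LDA scores changing from run to run.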
5d0fbdd
There are still functions/references to random subsets, but we are not using bootstrap any more. Are these out-of-date? See e.g.: https://github.com/waldronlab/lefser/blob/642ba43c633a1ceb28130e52a1ae7f5595e8f416/R/lefser.R#L77
On a related note, I am still seeing variation in LDA scores unless I set the random number seed. Is there still some sampling occurring somewhere?