fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

[Feature Request] Missing medianOfRatios normalization documentation #202

Closed bvenn closed 1 year ago

bvenn commented 2 years ago

Proper documentation of the medianOfRatios normalization should be added. From a mathematical perspective, calculating the geometric mean as the nth root of the product of all values (1) seems odd, but when written as the mean of the log-transformed data (2) it becomes clear that the geometric mean is simply a mean that is less sensitive to large outliers than the arithmetic mean, which makes it an intuitive choice when dealing with biological data. No log transform has to be applied before normalizing the data with this method. If required, a log transform can be applied to the normalized values afterwards to reduce heteroscedasticity.
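The formulas labelled (1) and (2) are not reproduced in this thread. Presumably they are the two standard, equivalent forms of the geometric mean of n positive values, which could be written as follows (the symbol names are my own choice):

```latex
\begin{align}
\bar{x}_{\mathrm{geo}} &= \sqrt[n]{\prod_{i=1}^{n} x_i} \tag{1} \\
\bar{x}_{\mathrm{geo}} &= \exp\!\left(\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right) \tag{2}
\end{align}
```

Taking the logarithm of (1) turns the product into a sum and the root into a division by n, which gives (2) directly.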

In short, you determine the (antilog of the (mean of the (values in log space)))!
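To make that concrete, here is a minimal numeric sketch in plain Python (not FSharp.Stats code; the function names and the data are made up for illustration). It computes the geometric mean both ways and compares it with the arithmetic mean on values containing one large outlier:

```python
import math

def geometric_mean_product(values):
    """Form (1): n-th root of the product of all values."""
    return math.prod(values) ** (1.0 / len(values))

def geometric_mean_log(values):
    """Form (2): antilog of the arithmetic mean of the log values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical positive intensities with one large outlier.
values = [2.0, 8.0, 4.0, 1000.0]

gm_product = geometric_mean_product(values)
gm_log = geometric_mean_log(values)
arithmetic = sum(values) / len(values)

print(gm_product, gm_log, arithmetic)
```

Both forms agree up to floating-point error, and the result stays far below the arithmetic mean, which is dominated by the outlier. The log form is also the safer one to compute, since the running product in form (1) can overflow or underflow for long inputs.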

bvenn commented 2 years ago

It may be beneficial to implement a second version of the median of ratios (mor) normalization that works on log-transformed data. When the data is already log transformed to reduce heteroscedasticity, the normalization should not use a geometric mean as its reference but a standard arithmetic mean, because the arithmetic mean in log space already corresponds to the geometric mean of the raw values.

The additional log transform inside the default mor would compress the already log-transformed values disproportionately, and the resulting normalization would therefore be insufficient.
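A rough sketch of what such a log-space variant could look like, in plain Python rather than F#. This is not the FSharp.Stats implementation; `mor_factors_linear`, `mor_offsets_log`, and the data are hypothetical. In log space the ratios become differences, so the per-sample correction is an offset that is subtracted rather than a factor that is divided out:

```python
import math
from statistics import mean, median

def mor_factors_linear(samples):
    """Per-sample scaling factors from raw (non-log) values."""
    n_features = len(samples[0])
    # Reference per feature: geometric mean across samples (mean of logs, then antilog).
    ref = [math.exp(mean(math.log(s[i]) for s in samples)) for i in range(n_features)]
    return [median(s[i] / ref[i] for i in range(n_features)) for s in samples]

def mor_offsets_log(log_samples):
    """Per-sample offsets from values that are ALREADY log transformed."""
    n_features = len(log_samples[0])
    # Reference per feature: plain arithmetic mean, since the data is in log space.
    ref = [mean(s[i] for s in log_samples) for i in range(n_features)]
    # Ratios become differences in log space.
    return [median(s[i] - ref[i] for i in range(n_features)) for s in log_samples]

# Hypothetical data: 3 samples with 5 strictly positive features each.
raw = [
    [10.0, 20.0, 40.0, 80.0, 160.0],
    [22.0, 38.0, 85.0, 150.0, 330.0],
    [5.0, 11.0, 19.0, 42.0, 75.0],
]
log_data = [[math.log2(v) for v in s] for s in raw]

factors = mor_factors_linear(raw)
offsets = mor_offsets_log(log_data)

# Normalizing in log space means subtracting the offset from every value.
normalized_log = [[v - o for v in s] for s, o in zip(log_data, offsets)]
```

With an odd number of features, as here, each offset equals the log2 of the corresponding linear scaling factor up to floating-point error, so both routes give the same normalization. With an even number of features the median averages the two middle values, and the two variants can differ slightly.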

bvenn commented 2 years ago

Additionally, it would be great to have access to the applied scaling factors to check the validity of the normalization!
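As an illustration of the idea, a function could return the scaling factors alongside the normalized values. The sketch below is plain Python with a made-up `median_of_ratios` function and made-up data, not the current or proposed FSharp.Stats API. It assumes that features containing a zero in any sample are skipped when estimating the factors, because their log is undefined:

```python
import math
from statistics import median

def median_of_ratios(samples):
    """Return (scaling_factors, normalized_samples) for raw, non-log data."""
    n_features = len(samples[0])
    # Assumption: only features that are positive in every sample are used
    # to estimate the factors.
    usable = [i for i in range(n_features) if all(s[i] > 0 for s in samples)]
    # Reference per usable feature: geometric mean across samples.
    ref = {i: math.exp(sum(math.log(s[i]) for s in samples) / len(samples))
           for i in usable}
    # Scaling factor per sample: median of its ratios to the reference.
    factors = [median(s[i] / ref[i] for i in usable) for s in samples]
    # All features, including the skipped ones, are divided by the factor.
    normalized = [[v / f for v in s] for s, f in zip(samples, factors)]
    return factors, normalized

# Hypothetical data: the same profile measured at 1x, 2x and 0.5x depth,
# plus one noisy feature that is zero in the first sample.
samples = [
    [10.0, 20.0, 40.0, 80.0, 0.0],
    [20.0, 40.0, 80.0, 160.0, 6.0],
    [5.0, 10.0, 20.0, 40.0, 1.0],
]
factors, normalized = median_of_ratios(samples)
print(factors)  # approximately [1.0, 2.0, 0.5]
```

Being able to inspect `factors` makes it easy to spot a sample with an implausibly large or small correction before trusting the normalized values.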