JuliaStats / StatsBase.jl

Basic statistics for Julia

[feature request] allow `transform` to avoid Z-score transforming when sigma=0 #910

Open SimonEnsemble opened 9 months ago

SimonEnsemble commented 9 months ago

My design matrix Φ has a column of ones to handle the intercept, so any time I Z-score transform it, that column comes back as NaN: its standard deviation is zero, and the transform computes (1 - 1) / 0. It would be nice if `transform` had an option to handle this, i.e. to leave a column untransformed when sigma = 0.0.

using StatsBase

feature_transform = fit(ZScoreTransform, Φ, dims=1)
Φ̂ = StatsBase.transform(feature_transform, Φ)  # the all-ones column comes back as NaN

SimonEnsemble commented 9 months ago

E.g., the docs for scikit-learn's StandardScaler say: "If a variance is zero, we can’t achieve unit variance, and the data is left as-is, giving a scaling factor of 1."

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler
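
Until an option like this exists, a workaround can be sketched with the `Statistics` standard library alone. The helper name `zscore_skip_constant` is hypothetical and is not part of StatsBase. It assumes features are stored in columns (as with `dims=1` above) and that a constant column yields a standard deviation of exactly zero. Unlike a pure "scaling factor of 1" rule, it also skips centering for those columns, so an intercept column of ones stays ones instead of becoming zeros.

```julia
using Statistics

# Hypothetical helper, not a StatsBase function: column-wise Z-score that
# leaves zero-variance columns completely untouched.
function zscore_skip_constant(X::AbstractMatrix{<:Real})
    μ = mean(X, dims=1)                 # 1×p row of column means
    σ = std(X, dims=1)                  # 1×p row of column standard deviations
    constant = σ .== 0                  # columns with no spread (exact comparison)
    # For constant columns use mean 0 and scale 1, i.e. the identity transform.
    μ_safe = ifelse.(constant, zero(eltype(μ)), μ)
    σ_safe = ifelse.(constant, one(eltype(σ)), σ)
    return (X .- μ_safe) ./ σ_safe
end

Φ = [1.0 2.0; 1.0 4.0; 1.0 6.0]         # first column is the intercept
Φ_std = zscore_skip_constant(Φ)
```

Here the intercept column is returned unchanged and the second column is standardized as usual, so no NaNs appear. With data that is only approximately constant, the exact `σ .== 0` check may need to be replaced by a tolerance.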