Data standardization could be internalized

felix-clark / ndarray-glm

Rust library for linear, logistic, and generalized linear model regression

MIT License

22 stars, 0 forks

Data standardization could be internalized #34

Open felix-clark opened 1 year ago

felix-clark commented 1 year ago

The standardize utility function should probably be moved internally (applied as part of the fit rather than by the caller), for at least two reasons:

- interface uniformity
- the ability to persist the transformation so it can be reused downstream in Fit::predict() on external/test/validation data
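To make the second point concrete, here is a minimal sketch of what persisting the transformation could look like. It uses only the standard library; `Standardizer`, `fit`, and `transform` are hypothetical names for illustration and are not part of ndarray-glm's API. It also assumes the population standard deviation (dividing by n), which may differ from what the existing utility does.

```rust
/// Hypothetical helper (not part of ndarray-glm): per-column mean and
/// standard deviation learned from the training data.
#[derive(Debug, Clone, PartialEq)]
pub struct Standardizer {
    means: Vec<f64>,
    std_devs: Vec<f64>,
}

impl Standardizer {
    /// Learn column statistics from row-major training data.
    /// Assumes at least one row and that all rows have the same length.
    pub fn fit(rows: &[Vec<f64>]) -> Self {
        let n = rows.len() as f64;
        let n_cols = rows[0].len();

        let mut means = vec![0.0; n_cols];
        for row in rows {
            for (m, x) in means.iter_mut().zip(row) {
                *m += x / n;
            }
        }

        // Population variance (divide by n); an assumption of this sketch.
        let mut vars = vec![0.0; n_cols];
        for row in rows {
            for ((v, x), m) in vars.iter_mut().zip(row).zip(&means) {
                *v += (x - m).powi(2) / n;
            }
        }

        // Constant columns get a scale of 1 to avoid division by zero.
        let std_devs = vars
            .into_iter()
            .map(|v| if v > 0.0 { v.sqrt() } else { 1.0 })
            .collect();

        Standardizer { means, std_devs }
    }

    /// Apply the stored transformation to any data set, training or held-out.
    pub fn transform(&self, rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
        rows.iter()
            .map(|row| {
                row.iter()
                    .zip(self.means.iter().zip(&self.std_devs))
                    .map(|(x, (m, s))| (x - m) / s)
                    .collect()
            })
            .collect()
    }
}
```

The key property is that the statistics are computed once from the training data and then reused unchanged on test or validation data. If the fit object stored something like this, a prediction step could apply the same shift and scale automatically.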

felix-clark commented 1 year ago

I'm pretty sure standardization impacts the fit indirectly via regularization, since the penalty depends on the scale of the coefficients. It seems that R packages tend to standardize automatically and internally, which could explain some differing results in comparison tests. So it may be a sensible default, but the effect should be thought through and documented.
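A toy illustration of the scale dependence, assuming a ridge (L2) penalty on a single predictor with no intercept. The function names are hypothetical and this is not the library's fitting code. In this setting the estimate has the closed form beta = sum(x*y) / (sum(x*x) + lambda).

```rust
/// Ridge estimate for one predictor without an intercept:
/// beta = sum(x * y) / (sum(x * x) + lambda).
fn ridge_slope(x: &[f64], y: &[f64], lambda: f64) -> f64 {
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    let sxx: f64 = x.iter().map(|a| a * a).sum();
    sxy / (sxx + lambda)
}

/// Fitted value at the first observation after multiplying the
/// predictor by `scale` (for example, a change of units).
fn first_fitted_value(x: &[f64], y: &[f64], lambda: f64, scale: f64) -> f64 {
    let scaled: Vec<f64> = x.iter().map(|a| a * scale).collect();
    ridge_slope(&scaled, y, lambda) * scaled[0]
}
```

With `lambda = 0.0` the fitted values are the same whatever `scale` is, because the coefficient absorbs the change of units. With `lambda > 0.0` they are not, so whether and how the predictors are standardized changes the regularized fit.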