felix-clark / ndarray-glm

Rust library for linear, logistic, and generalized linear model regression
MIT License

Data standardization could be internalized #34

Open felix-clark opened 1 year ago

felix-clark commented 1 year ago

The standardize utility function should probably be moved inside the library, for at least two reasons:

  1. Interface uniformity.
  2. The transformation could be persisted and reused downstream in Fit::predict() on external/test/validation data.

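A minimal sketch of what persisting the transformation could look like, in plain Rust with no dependencies. `Standardizer`, `fit`, and `transform` are hypothetical names for illustration and are not part of ndarray-glm's API; using the population standard deviation and leaving constant columns unscaled are also assumptions.

```rust
/// Hypothetical helper (not ndarray-glm API): per-column statistics
/// learned from the training data and kept for later reuse.
#[derive(Debug, Clone)]
pub struct Standardizer {
    means: Vec<f64>,
    std_devs: Vec<f64>,
}

impl Standardizer {
    /// Learn column means and population standard deviations from
    /// row-major training data.
    pub fn fit(rows: &[Vec<f64>]) -> Self {
        let n = rows.len() as f64;
        let n_cols = rows.first().map_or(0, |r| r.len());

        let mut means = vec![0.0; n_cols];
        for row in rows {
            for (m, x) in means.iter_mut().zip(row) {
                *m += x / n;
            }
        }

        let mut std_devs = vec![0.0; n_cols];
        for row in rows {
            for ((s, x), m) in std_devs.iter_mut().zip(row).zip(&means) {
                *s += (x - m).powi(2) / n;
            }
        }
        for s in std_devs.iter_mut() {
            // Assumption: constant columns are left unscaled to avoid
            // dividing by zero.
            *s = if *s > 0.0 { s.sqrt() } else { 1.0 };
        }

        Self { means, std_devs }
    }

    /// Apply the stored training statistics to any data set, including
    /// test/validation rows that were not seen during `fit`.
    pub fn transform(&self, rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
        rows.iter()
            .map(|row| {
                row.iter()
                    .zip(self.means.iter().zip(&self.std_devs))
                    .map(|(x, (m, s))| (x - m) / s)
                    .collect()
            })
            .collect()
    }
}
```

The point of the design is that the fitted model would own something like this object, so prediction on new data reuses the training means and scales instead of recomputing them from the new data.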
felix-clark commented 1 year ago

I'm pretty sure standardization affects the fit indirectly via regularization: the penalty acts on the coefficient magnitudes, and those depend on the scale of each feature. It seems that R packages tend to standardize automatically and internally, which could explain some differing results in comparison tests. So it may be a sensible default, but the effect should be thought through and documented.
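A toy illustration of the scale dependence, as a sketch only: `ridge_slope` is a hypothetical helper (not ndarray-glm code) for a single-feature ridge fit with no intercept, where the slope has the closed form sum(x*y) / (sum(x*x) + lambda).

```rust
/// Hypothetical helper: closed-form ridge slope for one feature and
/// no intercept, minimizing sum((y - b*x)^2) + lambda * b^2.
pub fn ridge_slope(x: &[f64], y: &[f64], lambda: f64) -> f64 {
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    let sxx: f64 = x.iter().map(|a| a * a).sum();
    sxy / (sxx + lambda)
}
```

With x = [1, 2, 3] and y = [2, 4, 6], multiplying x by 10 leaves the unpenalized fit (lambda = 0) unchanged once the slope is mapped back to the original units. With lambda = 14, the slope on the original x is 1.0, while the fit on the rescaled x corresponds to roughly 1.98 in the original units. The same penalty shrinks the coefficient far less when the feature is on a larger scale, which is why the choice to standardize changes a regularized fit.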