JuliaStats / MultivariateStats.jl

A Julia package for multivariate statistics and data analysis (e.g. dimension reduction)

Isotonic Regression question #206

Open FatemehTahavori opened 1 year ago

FatemehTahavori commented 1 year ago

Hi, I generated a synthetic dataset in Python and compared the Python implementation of isotonic regression (scikit-learn) with the Julia one (MultivariateStats.jl). The Python script is: https://gist.github.com/FatemehTahavori/4885a1bb1a9fa2162d0044989d233e0a.js The Julia script is: https://gist.github.com/FatemehTahavori/158b0501545875861064192290516ed0.js The outputs do not seem to match on the same dataset. I know there are different implementations of isotonic regression. Which implementation does MultivariateStats use, and could that explain the difference? Thanks

wildart commented 1 year ago

I implemented the PAV (pool adjacent violators) algorithm from Best, M.J., Chakravarti, N. Active set algorithms for isotonic regression; A unifying framework. Mathematical Programming 47, 425–439 (1990).
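For readers unfamiliar with PAV, here is a minimal Python sketch of the textbook pool-adjacent-violators idea: scan the responses in order and, whenever two neighbouring blocks break the non-decreasing order, merge them into one block holding their weighted mean. This is a generic illustration only. It is not the active-set formulation from the cited paper, nor the MultivariateStats.jl source, and the function name `pav` is made up for this example.

```python
def pav(y, w=None):
    """Least-squares non-decreasing fit to y (assumed already ordered by x)."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the blocks back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


fit = pav([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
print(fit)  # [1.0, 2.5, 2.5, 3.75, 3.75, 5.0]
```

The pairs (3.0, 2.0) and (4.0, 3.5) violate the ordering, so each is pooled into its mean. Different implementations should agree on these pooled values; where they tend to differ is in how the result is returned.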

As I recall, the scikit-learn implementation outputs a fitted response value for every regressor value, whereas the Julia implementation outputs a model: the bounds of a piecewise linear function together with their values.
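To make the distinction concrete, the sketch below takes a per-point fit and compresses it into breakpoints, then evaluates the resulting piecewise linear function at arbitrary inputs. The helper names `to_knots` and `evaluate` and the sample data are hypothetical. Neither library is claimed to store its result this way; the point is only that the two output styles can describe the same fit while looking different when printed side by side.

```python
from bisect import bisect_right


def to_knots(x, fitted):
    """Keep only the first and last point of each constant run."""
    knots = []
    n = len(x)
    for i in range(n):
        starts_run = i == 0 or fitted[i] != fitted[i - 1]
        ends_run = i == n - 1 or fitted[i] != fitted[i + 1]
        if starts_run or ends_run:
            knots.append((x[i], fitted[i]))
    return knots


def evaluate(knots, x_new):
    """Linear interpolation between knots, clamped outside their range."""
    xs = [k[0] for k in knots]
    ys = [k[1] for k in knots]
    if x_new <= xs[0]:
        return ys[0]
    if x_new >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x_new)
    x0, x1 = xs[j - 1], xs[j]
    y0, y1 = ys[j - 1], ys[j]
    return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fitted = [1.0, 2.0, 2.0, 2.0, 2.0, 5.0]  # one value per regressor value
knots = to_knots(x, fitted)              # compact model: 4 breakpoints
print(knots)                             # [(1.0, 1.0), (2.0, 2.0), (5.0, 2.0), (6.0, 5.0)]
print([evaluate(knots, xi) for xi in x] == fitted)  # True
```

When comparing two implementations, it helps to evaluate both at the same inputs first, rather than comparing a vector of fitted values against a list of breakpoints.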

The implementation may need a review, as I implemented it differently and more efficiently the second time.