rikhuijzer / SIRUS.jl

Interpretable Machine Learning via Rule Extraction
https://sirus.jl.huijzer.xyz/
MIT License

Inconsistent numerical results #48

Closed gdalle closed 1 year ago

gdalle commented 1 year ago

The numerical results reported in the JOSS paper are inconsistent with the latest CI benchmark: https://github.com/rikhuijzer/SIRUS.jl/actions/runs/5972815140

https://github.com/openjournals/joss-reviews/issues/5786

rikhuijzer commented 1 year ago

Good catch. One problem was that I hadn't set the RNG for all DecisionTree runs. That was fixed in https://github.com/rikhuijzer/SIRUS.jl/commit/81533786b01c7f6b0cb562c3dda32b9e2156f767. Furthermore, I tried to reduce the inconsistencies between the CI runs in #53, but this wasn't successful. The most likely culprit at this point seems to be hardware differences between the CI runs. Since results between runs on the same system are deterministic, and since one of the issues lies with DecisionTree.jl, I hope we can agree that consistency is good enough for now. Otherwise, feel free to reopen.
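For readers unfamiliar with the kind of fix mentioned above, here is a minimal sketch of passing an explicitly seeded RNG to every stochastic run. It is not the code from the linked commit: `bootstrap_indices` is a hypothetical stand-in for a model fit, and the seed `1` is arbitrary. It only uses Julia's standard `Random` library.

```julia
using Random

# Hypothetical stand-in for one stochastic model fit (an assumption for
# illustration, not a SIRUS.jl or DecisionTree.jl function): it draws a
# bootstrap sample of row indices from the RNG it is given, never from
# the global RNG.
function bootstrap_indices(rng::AbstractRNG, n::Int)
    return rand(rng, 1:n, n)
end

# Each run receives its own explicitly seeded RNG.
run_a = bootstrap_indices(MersenneTwister(1), 10)
run_b = bootstrap_indices(MersenneTwister(1), 10)

# Same seed on the same system and Julia version: identical draws.
@assert run_a == run_b
```

If even one run falls back to the global RNG, its draws depend on whatever consumed random numbers before it, which is enough to make benchmark numbers drift between runs.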

I've updated the source for the paper in https://github.com/rikhuijzer/SIRUS.jl/pull/36/commits/632d6b17acbbf056966f5a95e3c3b0fe4b313d98. Apart from the $0.69 \pm 0.09$ to $0.77 \pm 0.07$ jump for the Iris dataset, not much has changed (source: https://github.com/rikhuijzer/SIRUS.jl/actions/runs/6171051743).

gdalle commented 1 year ago

lgtm