rikhuijzer / SIRUS.jl

Interpretable Machine Learning via Rule Extraction
https://sirus.jl.huijzer.xyz/
MIT License

Improve reproducibility #53

Closed by rikhuijzer 12 months ago

rikhuijzer commented 12 months ago

In an attempt to solve https://github.com/rikhuijzer/SIRUS.jl/issues/48, I've set more RNGs in https://github.com/rikhuijzer/SIRUS.jl/commit/81533786b01c7f6b0cb562c3dda32b9e2156f767. However, after two CI runs, the outcomes still differ as follows:

5c5
<    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.56    0.07         10
---
>    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.53    0.07         10
38,39c38,39
<   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.69    0.10         10
<   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.67    0.08         10
---
>   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.74    0.14         10
>   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.69    0.10         10

Even worse, the results differ yet again when running locally. Locally, repeated runs always produce the same result, so it looks like different systems can give different results.
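For readers unfamiliar with what "setting more RNGs" involves: a common pattern is to pass an explicitly seeded RNG into every function that draws random numbers, instead of relying on the global RNG. The following is a minimal sketch of that pattern, not the code from the linked commit; `bootstrap_indices` is a hypothetical helper and the seed is arbitrary.

```julia
using Random

# Hypothetical helper: draw a bootstrap sample of row indices.
# All randomness comes from the `rng` argument, never from the global RNG.
function bootstrap_indices(rng::AbstractRNG, n::Int)
    return rand(rng, 1:n, n)
end

# Two RNGs built from the same seed give the same sample
# when run on the same Julia version.
a = bootstrap_indices(MersenneTwister(1), 10)
b = bootstrap_indices(MersenneTwister(1), 10)

println(a == b)  # true
```

Seeding like this only rules out one source of variation. Julia's built-in RNG streams are not guaranteed to stay the same across Julia versions, which is what the StableRNGs.jl package is for, and seeding does not address differences that come from the order of floating-point operations.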

rikhuijzer commented 12 months ago

First run (effaaf6); all on Julia 1 (v1.9.3):

49×7 DataFrame
 Row │ Dataset          Model                   Hyperparameters                  measure   score   1.96*SE  nfolds
     │ String           String                  String                           String    String  String   Int64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.54    0.05         10
   2 │ haberman         LogisticClassifier      (;)                              auc       0.69    0.06         10
   3 │ haberman         XGBoostClassifier       (;)                              auc       0.65    0.04         10
   4 │ haberman         XGBoostClassifier       (max_depth = 2,)                 auc       0.63    0.04         10
   5 │ haberman         StableForestClassifier  (max_depth = 2,)                 auc       0.71    0.05         10
   6 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.70    0.08         10
   7 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.67    0.07         10
   8 │ titanic          DecisionTreeClassifier  (;)                              auc       0.76    0.05         10
   9 │ titanic          LogisticClassifier      (;)                              auc       0.84    0.02         10
  10 │ titanic          XGBoostClassifier       (;)                              auc       0.86    0.03         10
  11 │ titanic          XGBoostClassifier       (max_depth = 2,)                 auc       0.87    0.02         10
  12 │ titanic          StableForestClassifier  (max_depth = 2,)                 auc       0.85    0.02         10
  13 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.83    0.02         10
  14 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.82    0.02         10
  15 │ cancer           DecisionTreeClassifier  (;)                              auc       0.92    0.03         10
  16 │ cancer           MultinomialClassifier   (;)                              auc       0.98    0.01         10
  17 │ cancer           XGBoostClassifier       (;)                              auc       0.99    0.01         10
  18 │ cancer           XGBoostClassifier       (max_depth = 2,)                 auc       0.99    0.01         10
  19 │ cancer           StableForestClassifier  (max_depth = 2,)                 auc       0.98    0.01         10
  20 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.98    0.01         10
  21 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.98    0.01         10
  22 │ diabetes         DecisionTreeClassifier  (;)                              auc       0.67    0.05         10
  23 │ diabetes         LogisticClassifier      (;)                              auc       0.70    0.06         10
  24 │ diabetes         XGBoostClassifier       (;)                              auc       0.80    0.03         10
  25 │ diabetes         XGBoostClassifier       (max_depth = 2,)                 auc       0.83    0.03         10
  26 │ diabetes         StableForestClassifier  (max_depth = 2,)                 auc       0.82    0.03         10
  27 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.77    0.03         10
  28 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.75    0.05         10
  29 │ iris             DecisionTreeClassifier  (;)                              accuracy  0.95    0.03         10
  30 │ iris             MultinomialClassifier   (;)                              accuracy  0.97    0.03         10
  31 │ iris             XGBoostClassifier       (;)                              accuracy  0.95    0.04         10
  32 │ iris             XGBoostClassifier       (max_depth = 2,)                 accuracy  0.94    0.04         10
  33 │ iris             StableForestClassifier  (max_depth = 2,)                 accuracy  0.95    0.04         10
  34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.74    0.14         10
  35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.71    0.08         10
  36 │ boston           DecisionTreeRegressor   (;)                              R²        0.74    0.11         10
  37 │ boston           LinearRegressor         (;)                              R²        0.70    0.05         10
  38 │ boston           XGBoostRegressor        (;)                              R²        0.88    0.06         10
  39 │ boston           XGBoostRegressor        (max_depth = 2,)                 R²        0.87    0.04         10
  40 │ boston           StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.08         10
  41 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.52    0.07         10
  42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.63    0.10         10
  43 │ make_regression  DecisionTreeRegressor   (;)                              R²        0.90    0.02         10
  44 │ make_regression  LinearRegressor         (;)                              R²        1.00    0.00         10
  45 │ make_regression  XGBoostRegressor        (;)                              R²        0.98    0.01         10
  46 │ make_regression  XGBoostRegressor        (max_depth = 2,)                 R²        0.98    0.00         10
  47 │ make_regression  StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.05         10
  48 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.48    0.05         10
  49 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.53    0.06         10

Second run (87bc933):

49×7 DataFrame
 Row │ Dataset          Model                   Hyperparameters                  measure   score   1.96*SE  nfolds
     │ String           String                  String                           String    String  String   Int64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.55    0.06         10
   2 │ haberman         LogisticClassifier      (;)                              auc       0.69    0.06         10
   3 │ haberman         XGBoostClassifier       (;)                              auc       0.65    0.04         10
   4 │ haberman         XGBoostClassifier       (max_depth = 2,)                 auc       0.63    0.04         10
   5 │ haberman         StableForestClassifier  (max_depth = 2,)                 auc       0.71    0.05         10
   6 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.70    0.08         10
   7 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.67    0.07         10
   8 │ titanic          DecisionTreeClassifier  (;)                              auc       0.76    0.05         10
   9 │ titanic          LogisticClassifier      (;)                              auc       0.84    0.02         10
  10 │ titanic          XGBoostClassifier       (;)                              auc       0.86    0.03         10
  11 │ titanic          XGBoostClassifier       (max_depth = 2,)                 auc       0.87    0.02         10
  12 │ titanic          StableForestClassifier  (max_depth = 2,)                 auc       0.85    0.02         10
  13 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.83    0.02         10
  14 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.82    0.02         10
  15 │ cancer           DecisionTreeClassifier  (;)                              auc       0.92    0.03         10
  16 │ cancer           MultinomialClassifier   (;)                              auc       0.98    0.01         10
  17 │ cancer           XGBoostClassifier       (;)                              auc       0.99    0.01         10
  18 │ cancer           XGBoostClassifier       (max_depth = 2,)                 auc       0.99    0.01         10
  19 │ cancer           StableForestClassifier  (max_depth = 2,)                 auc       0.98    0.01         10
  20 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.98    0.01         10
  21 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.98    0.01         10
  22 │ diabetes         DecisionTreeClassifier  (;)                              auc       0.67    0.05         10
  23 │ diabetes         LogisticClassifier      (;)                              auc       0.70    0.06         10
  24 │ diabetes         XGBoostClassifier       (;)                              auc       0.80    0.03         10
  25 │ diabetes         XGBoostClassifier       (max_depth = 2,)                 auc       0.83    0.03         10
  26 │ diabetes         StableForestClassifier  (max_depth = 2,)                 auc       0.82    0.03         10
  27 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.77    0.03         10
  28 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.75    0.05         10
  29 │ iris             DecisionTreeClassifier  (;)                              accuracy  0.95    0.03         10
  30 │ iris             MultinomialClassifier   (;)                              accuracy  0.97    0.03         10
  31 │ iris             XGBoostClassifier       (;)                              accuracy  0.95    0.04         10
  32 │ iris             XGBoostClassifier       (max_depth = 2,)                 accuracy  0.94    0.04         10
  33 │ iris             StableForestClassifier  (max_depth = 2,)                 accuracy  0.95    0.04         10
  34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.76    0.14         10
  35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.67    0.08         10
  36 │ boston           DecisionTreeRegressor   (;)                              R²        0.74    0.11         10
  37 │ boston           LinearRegressor         (;)                              R²        0.70    0.05         10
  38 │ boston           XGBoostRegressor        (;)                              R²        0.88    0.06         10
  39 │ boston           XGBoostRegressor        (max_depth = 2,)                 R²        0.87    0.04         10
  40 │ boston           StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.08         10
  41 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.52    0.07         10
  42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.63    0.10         10
  43 │ make_regression  DecisionTreeRegressor   (;)                              R²        0.90    0.02         10
  44 │ make_regression  LinearRegressor         (;)                              R²        1.00    0.00         10
  45 │ make_regression  XGBoostRegressor        (;)                              R²        0.98    0.01         10
  46 │ make_regression  XGBoostRegressor        (max_depth = 2,)                 R²        0.98    0.00         10
  47 │ make_regression  StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.05         10
  48 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.48    0.05         10
  49 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.53    0.06         10

Diff:

5c5
<    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.54    0.05         10
---
>    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.55    0.06         10
38,39c38,39
<   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.74    0.14         10
<   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.71    0.08         10
---
>   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.76    0.14         10
>   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.67    0.08         10

So the only rows that differ are the DecisionTreeClassifier for the Haberman dataset and the StableRulesClassifier for the Iris dataset. These two were also the only differences in two CI runs on main against https://github.com/rikhuijzer/SIRUS.jl/commit/81533786b01c7f6b0cb562c3dda32b9e2156f767.
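The same comparison can be done programmatically rather than with `diff`. This is a rough sketch under the assumption that each run has been reduced to a mapping from (dataset, model, hyperparameters) to the printed score; `changed_rows` is a hypothetical helper, and the values are copied from rows 1, 2, 34, and 35 of the two tables above.

```julia
# Scores from the first run (effaaf6), keyed by (dataset, model, hyperparameters).
first_run = Dict(
    ("haberman", "DecisionTreeClassifier", "(;)") => "0.54",
    ("haberman", "LogisticClassifier", "(;)") => "0.69",
    ("iris", "StableRulesClassifier", "(max_depth = 2, max_rules = 30)") => "0.74",
    ("iris", "StableRulesClassifier", "(max_depth = 2, max_rules = 10)") => "0.71",
)

# Scores from the second run (87bc933) for the same rows.
second_run = Dict(
    ("haberman", "DecisionTreeClassifier", "(;)") => "0.55",
    ("haberman", "LogisticClassifier", "(;)") => "0.69",
    ("iris", "StableRulesClassifier", "(max_depth = 2, max_rules = 30)") => "0.76",
    ("iris", "StableRulesClassifier", "(max_depth = 2, max_rules = 10)") => "0.67",
)

# Hypothetical helper: keys present in both runs whose scores differ.
function changed_rows(a::AbstractDict, b::AbstractDict)
    return sort([k for k in keys(a) if haskey(b, k) && a[k] != b[k]])
end

for key in changed_rows(first_run, second_run)
    println(key, ": ", first_run[key], " -> ", second_run[key])
end
```

Keying on the model configuration instead of the row number keeps the comparison valid if rows are ever added or reordered.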

rikhuijzer commented 12 months ago

First (a148080):

49×7 DataFrame
 Row │ Dataset          Model                   Hyperparameters                  measure   score   1.96*SE  nfolds
     │ String           String                  String                           String    String  String   Int64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.56    0.06         10
   2 │ haberman         LogisticClassifier      (;)                              auc       0.69    0.06         10
   3 │ haberman         XGBoostClassifier       (;)                              auc       0.65    0.04         10
   4 │ haberman         XGBoostClassifier       (max_depth = 2,)                 auc       0.63    0.04         10
   5 │ haberman         StableForestClassifier  (max_depth = 2,)                 auc       0.70    0.05         10
   6 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.70    0.07         10
   7 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.67    0.06         10
   8 │ titanic          DecisionTreeClassifier  (;)                              auc       0.76    0.05         10
   9 │ titanic          LogisticClassifier      (;)                              auc       0.84    0.02         10
  10 │ titanic          XGBoostClassifier       (;)                              auc       0.86    0.03         10
  11 │ titanic          XGBoostClassifier       (max_depth = 2,)                 auc       0.87    0.02         10
  12 │ titanic          StableForestClassifier  (max_depth = 2,)                 auc       0.85    0.02         10
  13 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.83    0.02         10
  14 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.83    0.02         10
  15 │ cancer           DecisionTreeClassifier  (;)                              auc       0.92    0.03         10
  16 │ cancer           MultinomialClassifier   (;)                              auc       0.98    0.01         10
  17 │ cancer           XGBoostClassifier       (;)                              auc       0.99    0.01         10
  18 │ cancer           XGBoostClassifier       (max_depth = 2,)                 auc       0.99    0.01         10
  19 │ cancer           StableForestClassifier  (max_depth = 2,)                 auc       0.99    0.01         10
  20 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.98    0.01         10
  21 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.98    0.01         10
  22 │ diabetes         DecisionTreeClassifier  (;)                              auc       0.67    0.05         10
  23 │ diabetes         LogisticClassifier      (;)                              auc       0.70    0.06         10
  24 │ diabetes         XGBoostClassifier       (;)                              auc       0.80    0.03         10
  25 │ diabetes         XGBoostClassifier       (max_depth = 2,)                 auc       0.83    0.03         10
  26 │ diabetes         StableForestClassifier  (max_depth = 2,)                 auc       0.82    0.03         10
  27 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.78    0.04         10
  28 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.75    0.05         10
  29 │ iris             DecisionTreeClassifier  (;)                              accuracy  0.95    0.03         10
  30 │ iris             MultinomialClassifier   (;)                              accuracy  0.97    0.03         10
  31 │ iris             XGBoostClassifier       (;)                              accuracy  0.95    0.04         10
  32 │ iris             XGBoostClassifier       (max_depth = 2,)                 accuracy  0.94    0.04         10
  33 │ iris             StableForestClassifier  (max_depth = 2,)                 accuracy  0.95    0.04         10
  34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.87    0.12         10
  35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.75    0.07         10
  36 │ boston           DecisionTreeRegressor   (;)                              R²        0.74    0.11         10
  37 │ boston           LinearRegressor         (;)                              R²        0.70    0.05         10
  38 │ boston           XGBoostRegressor        (;)                              R²        0.88    0.06         10
  39 │ boston           XGBoostRegressor        (max_depth = 2,)                 R²        0.87    0.04         10
  40 │ boston           StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.09         10
  41 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.57    0.08         10
  42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.61    0.09         10
  43 │ make_regression  DecisionTreeRegressor   (;)                              R²        0.90    0.02         10
  44 │ make_regression  LinearRegressor         (;)                              R²        1.00    0.00         10
  45 │ make_regression  XGBoostRegressor        (;)                              R²        0.98    0.01         10
  46 │ make_regression  XGBoostRegressor        (max_depth = 2,)                 R²        0.98    0.00         10
  47 │ make_regression  StableForestRegressor   (max_depth = 2,)                 R²        0.68    0.05         10
  48 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.46    0.05         10
  49 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.53    0.05         10

Second:

49×7 DataFrame
 Row │ Dataset          Model                   Hyperparameters                  measure   score   1.96*SE  nfolds
     │ String           String                  String                           String    String  String   Int64
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.55    0.05         10
   2 │ haberman         LogisticClassifier      (;)                              auc       0.69    0.06         10
   3 │ haberman         XGBoostClassifier       (;)                              auc       0.65    0.04         10
   4 │ haberman         XGBoostClassifier       (max_depth = 2,)                 auc       0.63    0.04         10
   5 │ haberman         StableForestClassifier  (max_depth = 2,)                 auc       0.70    0.05         10
   6 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.70    0.07         10
   7 │ haberman         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.67    0.06         10
   8 │ titanic          DecisionTreeClassifier  (;)                              auc       0.76    0.05         10
   9 │ titanic          LogisticClassifier      (;)                              auc       0.84    0.02         10
  10 │ titanic          XGBoostClassifier       (;)                              auc       0.86    0.03         10
  11 │ titanic          XGBoostClassifier       (max_depth = 2,)                 auc       0.87    0.02         10
  12 │ titanic          StableForestClassifier  (max_depth = 2,)                 auc       0.85    0.02         10
  13 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.83    0.02         10
  14 │ titanic          StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.83    0.02         10
  15 │ cancer           DecisionTreeClassifier  (;)                              auc       0.92    0.03         10
  16 │ cancer           MultinomialClassifier   (;)                              auc       0.98    0.01         10
  17 │ cancer           XGBoostClassifier       (;)                              auc       0.99    0.01         10
  18 │ cancer           XGBoostClassifier       (max_depth = 2,)                 auc       0.99    0.01         10
  19 │ cancer           StableForestClassifier  (max_depth = 2,)                 auc       0.99    0.01         10
  20 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.98    0.01         10
  21 │ cancer           StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.98    0.01         10
  22 │ diabetes         DecisionTreeClassifier  (;)                              auc       0.67    0.05         10
  23 │ diabetes         LogisticClassifier      (;)                              auc       0.70    0.06         10
  24 │ diabetes         XGBoostClassifier       (;)                              auc       0.80    0.03         10
  25 │ diabetes         XGBoostClassifier       (max_depth = 2,)                 auc       0.83    0.03         10
  26 │ diabetes         StableForestClassifier  (max_depth = 2,)                 auc       0.82    0.03         10
  27 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 30)  auc       0.78    0.04         10
  28 │ diabetes         StableRulesClassifier   (max_depth = 2, max_rules = 10)  auc       0.75    0.05         10
  29 │ iris             DecisionTreeClassifier  (;)                              accuracy  0.95    0.03         10
  30 │ iris             MultinomialClassifier   (;)                              accuracy  0.97    0.03         10
  31 │ iris             XGBoostClassifier       (;)                              accuracy  0.95    0.04         10
  32 │ iris             XGBoostClassifier       (max_depth = 2,)                 accuracy  0.94    0.04         10
  33 │ iris             StableForestClassifier  (max_depth = 2,)                 accuracy  0.95    0.04         10
  34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.75    0.15         10
  35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.70    0.08         10
  36 │ boston           DecisionTreeRegressor   (;)                              R²        0.74    0.11         10
  37 │ boston           LinearRegressor         (;)                              R²        0.70    0.05         10
  38 │ boston           XGBoostRegressor        (;)                              R²        0.88    0.06         10
  39 │ boston           XGBoostRegressor        (max_depth = 2,)                 R²        0.87    0.04         10
  40 │ boston           StableForestRegressor   (max_depth = 2,)                 R²        0.67    0.09         10
  41 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.57    0.08         10
  42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.61    0.09         10
  43 │ make_regression  DecisionTreeRegressor   (;)                              R²        0.90    0.02         10
  44 │ make_regression  LinearRegressor         (;)                              R²        1.00    0.00         10
  45 │ make_regression  XGBoostRegressor        (;)                              R²        0.98    0.01         10
  46 │ make_regression  XGBoostRegressor        (max_depth = 2,)                 R²        0.98    0.00         10
  47 │ make_regression  StableForestRegressor   (max_depth = 2,)                 R²        0.68    0.05         10
  48 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 30)  R²        0.46    0.05         10
  49 │ make_regression  StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.53    0.05         10

Diff:

5c5
<    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.56    0.06         10
---
>    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.55    0.05         10
38,39c38,39
<   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.87    0.12         10
<   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.75    0.07         10
---
>   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.75    0.15         10
>   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.70    0.08         10

The same problem occurs on Julia 1.6. It looks like the results are not stable on any of the Julia versions.

rikhuijzer commented 12 months ago

I removed SIMD since it might cause different results on different systems. That did not fix it, though:

5c5
<    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.53    0.06         10
---
>    1 │ haberman         DecisionTreeClassifier  (;)                              auc       0.54    0.06         10
38,39c38,39
<   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.80    0.10         10
<   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.77    0.07         10
---
>   34 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 30)  accuracy  0.76    0.13         10
>   35 │ iris             StableRulesClassifier   (max_depth = 2, max_rules = 10)  accuracy  0.67    0.07         10
46c46
<   42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.62    0.09         10
---
>   42 │ boston           StableRulesRegressor    (max_depth = 2, max_rules = 10)  R²        0.61    0.09         10

It could even be that SIMD makes things more deterministic because SIMD behavior is at least guaranteed across systems.
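As background on why SIMD was a suspect at all: vectorized reductions are allowed to reassociate floating-point additions (Julia's `@simd` macro documents this), and reassociation can change the rounded result. The sketch below is a generic illustration, not code from SIRUS.jl; it uses `foldl` and `foldr` to force two different summation orders over the same numbers.

```julia
# Floating-point addition is not associative, so the order of a reduction matters.
xs = [0.1, 0.2, 0.3]

left_to_right = foldl(+, xs)   # (0.1 + 0.2) + 0.3
right_to_left = foldr(+, xs)   # 0.1 + (0.2 + 0.3)

println(left_to_right == right_to_left)  # false

# A strict comparison between two candidates whose scores are mathematically
# equal can therefore go either way, depending on how each score was accumulated.
println(left_to_right > right_to_left)   # true
```

In a tree-based model, a difference in the last bit can be enough to flip which of two near-tied candidate splits wins, and that would change downstream scores by far more than rounding error. Whether that is what happens in these runs is not established by the results in this thread.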

I'm just going to put the obvious performance optimizations back and merge this afterwards.