Antonello Lobianco noticed that the outcome classes are typically floats (for example, 1.0 and 0.0), whereas integers would be much more suitable. For example, this was the output for the Haberman dataset:
StableRules model with 8 rules:
if X[i, :nodes] < 8.0 then 0.156 else 0.031 +
if X[i, :nodes] < 14.0 then 0.164 else 0.026 +
if X[i, :nodes] < 4.0 then 0.128 else 0.037 +
if X[i, :nodes] ≥ 8.0 & X[i, :age] < 38.0 then 0.0 else 0.008 +
if X[i, :year] ≥ 1966.0 & X[i, :age] < 42.0 then 0.0 else 0.005 +
if X[i, :nodes] < 2.0 then 0.107 else 0.034 +
if X[i, :year] ≥ 1966.0 & X[i, :age] < 38.0 then 0.0 else 0.001 +
if X[i, :year] < 1959.0 & X[i, :nodes] ≥ 2.0 then 0.0 else 0.003
and 2 classes: [0.0, 1.0].
Note: showing only the probability for class 1.0 since class 0.0 has
probability 1 - p.
This PR changes that to:
StableRules model with 8 rules:
if X[i, :nodes] < 8.0 then 0.156 else 0.031 +
if X[i, :nodes] < 14.0 then 0.164 else 0.026 +
if X[i, :nodes] < 4.0 then 0.128 else 0.037 +
if X[i, :nodes] ≥ 8.0 & X[i, :age] < 38.0 then 0.0 else 0.008 +
if X[i, :year] ≥ 1966.0 & X[i, :age] < 42.0 then 0.0 else 0.005 +
if X[i, :nodes] < 2.0 then 0.107 else 0.034 +
if X[i, :year] ≥ 1966.0 & X[i, :age] < 38.0 then 0.0 else 0.001 +
if X[i, :year] < 1959.0 & X[i, :nodes] ≥ 2.0 then 0.0 else 0.003
and 2 classes: [0, 1].
Note: showing only the probability for class 1 since class 0 has probability 1 - p.
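For readers who want the idea in code: the diff is not reproduced in this description, so the following is only a minimal sketch of one way to get this kind of output, not the actual change. The helper name `simplify_classes` is hypothetical and is not part of the package. It assumes that class labels are converted only when every label is a whole-valued float, and are left untouched otherwise.

```julia
# Hypothetical helper (an assumption, not taken from this PR): turn
# whole-valued float class labels into integers and leave any other
# labels exactly as they are.
function simplify_classes(classes::AbstractVector)
    # Convert only if every label is a float with no fractional part.
    if all(c -> c isa AbstractFloat && isinteger(c), classes)
        return Int.(classes)
    end
    return classes
end

simplify_classes([0.0, 1.0])  # == [0, 1]
simplify_classes([0.5, 1.0])  # returned unchanged
simplify_classes(["a", "b"])  # returned unchanged
```

Checking all labels before converting avoids a mixed result such as `[0, 0.5]`, where only some labels would lose their decimal point.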
Much better like this. Thank you, @sylvaticus!