Open davidefiocco opened 1 year ago
A possible improvement would be reworking the __str__ representation as:
    def __str__(self):
        '''Print out the list in a nice way
        '''
        header = '> ------------------------------\n> Greedy Rule List\n> ------------------------------\n'
        footer = '> ------------------------------\n'
        rule_template = '> {condition} => {risk}% risk ({num_pts} pts)\n'
        s = header
        for i, rule in enumerate(self.rules_):
            # Default for the final rule, which has no split
            condition = 'else'
            # Built-in round() works for both Python floats and NumPy scalars
            risk = round(100 * rule['val'], 2)
            num_pts = rule['num_pts']
            if 'col' in rule:
                # Split rule: report the branch that satisfies the condition
                predicate = '>=' if not rule['flip'] else '<'
                if i == 0:
                    condition = f"if {rule['col']} {predicate} {rule['cutoff']}"
                else:
                    condition = f"else if {rule['col']} {predicate} {rule['cutoff']}"
                risk = round(100 * rule['val_right'], 2)
                num_pts = rule['num_pts_right']
            s += rule_template.format(
                condition=condition,
                risk=risk,
                num_pts=num_pts
            )
        s += footer
        return s
Which would render rules such as
[{'col': 'x2',
  'index_col': 1,
  'cutoff': 0.1395193189382553,
  'val': 0.04092071611253197,
  'flip': True,
  'val_right': 0.9315403422982885,
  'num_pts': 800,
  'num_pts_right': 409},
 {'col': 'x1',
  'index_col': 0,
  'cutoff': 0.0753365887212567,
  'val': 0.010554089709762533,
  'flip': False,
  'val_right': 1.0,
  'num_pts': 391,
  'num_pts_right': 12},
 {'col': 'x2',
  'index_col': 1,
  'cutoff': 0.19506534934043884,
  'val': 0.0,
  'flip': True,
  'val_right': 0.16666666666666666,
  'num_pts': 379,
  'num_pts_right': 24},
 {'val': 0.0, 'num_pts': 355}]
as
> ------------------------------
> Greedy Rule List
> ------------------------------
> if x2 < 0.1395193189382553 => 93.15% risk (409 pts)
> else if x1 >= 0.0753365887212567 => 100.0% risk (12 pts)
> else if x2 < 0.19506534934043884 => 16.67% risk (24 pts)
> else => 0.0% risk (355 pts)
> ------------------------------
Thanks, this is a nice fix!
I'll work on making it so that it displays like this if the feature is continuous-valued and keeps the original behavior for non-continuous features. Probably also worth rounding the cutoff value to ~3 decimal places.
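For the rounding part, here is a minimal sketch of what that could look like, assuming rule dicts shaped like the ones above. The names render_rule_list and ndigits are hypothetical and not part of imodels, and the sketch does not cover how non-continuous features should be rendered.

```python
def render_rule_list(rules, ndigits=3):
    """Render rule dicts as text, rounding each cutoff to `ndigits` places.

    Hypothetical standalone helper for illustration; not part of imodels.
    """
    line = '> ------------------------------\n'
    s = line + '> Greedy Rule List\n' + line
    for i, rule in enumerate(rules):
        if 'col' in rule:
            # Split rule: report the branch that satisfies the condition.
            predicate = '<' if rule['flip'] else '>='
            prefix = 'if' if i == 0 else 'else if'
            cutoff = round(float(rule['cutoff']), ndigits)
            condition = f"{prefix} {rule['col']} {predicate} {cutoff}"
            risk = round(100 * float(rule['val_right']), 2)
            num_pts = rule['num_pts_right']
        else:
            # Final rule has no split, so it acts as the default branch.
            condition = 'else'
            risk = round(100 * float(rule['val']), 2)
            num_pts = rule['num_pts']
        s += f"> {condition} => {risk}% risk ({num_pts} pts)\n"
    return s + line


# First and last rules from the example above.
rules = [
    {'col': 'x2', 'cutoff': 0.1395193189382553, 'flip': True,
     'val': 0.04092071611253197, 'val_right': 0.9315403422982885,
     'num_pts': 800, 'num_pts_right': 409},
    {'val': 0.0, 'num_pts': 355},
]
print(render_rule_list(rules))
```

With these two rules, the first condition prints as "if x2 < 0.14" rather than "if x2 < 0.1395193189382553". Note that round() drops trailing zeros, so a fixed-width format such as f"{cutoff:.3f}" may be preferable if aligned output matters.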
When training GreedyRuleListClassifier on float features and printing the fitted classifier clf, cutoff values are not shown, which makes the model confusing to interpret. Rendering the model with print(clf) yields output that I find confusing because x1 and x2 are floats, not booleans. The entries of clf.rules_, by contrast, contain a cutoff that is useful for model interpretation. I don't know exactly what the intended behavior should be, since at the moment the code starting at https://github.com/csinva/imodels/blob/1243240fec3aae33852ba680ba6aea66a4f86ca7/imodels/rule_list/greedy_rule_list.py#L143-L184 contains commented-out chunks (some with colors, but unused).