scikit-learn-contrib / skope-rules

machine learning with logical rules in Python
http://skope-rules.readthedocs.io

Skope Rules should accept any kind of feature name #2

Closed floriangardin closed 6 years ago

floriangardin commented 6 years ago

SkopeRules evaluates semantic rules through the pandas eval machinery (DataFrame.query, which calls DataFrame.eval). This raises an error when feature names contain characters that are meaningful in an expression, such as spaces, (, ), = or -. For example:

from sklearn.datasets import load_iris
from skrules import SkopeRules
dataset = load_iris()

X, y, features_names = dataset.data, dataset.target, dataset.feature_names
y = (y == 0)  # Predict the first species vs. all others
clf = SkopeRules(max_depth_duplication=2,
                 n_estimators=30,
                 precision_min=0.3,
                 recall_min=0.1,
                 feature_names=features_names)
clf.fit(X, y)

leads to the following error:

Traceback (most recent call last):
  File "main.py", line 20, in <module>
    clf.fit(X, y)
  File "/usr/local/lib/python3.6/site-packages/skrules/skope_rules.py", line 350, in fit
    for r in set(rules_from_tree)]
  File "/usr/local/lib/python3.6/site-packages/skrules/skope_rules.py", line 350, in <listcomp>
    for r in set(rules_from_tree)]
  File "/usr/local/lib/python3.6/site-packages/skrules/skope_rules.py", line 600, in _eval_rule_perf
    detected_index = list(X.query(rule).index)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 2297, in query
    res = self.eval(expr, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 2366, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/computation/eval.py", line 290, in eval
    truediv=truediv)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 732, in __init__
    self.terms = self.parse()
  File "/usr/local/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 749, in parse
    return self._visitor.visit(self.expr)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/computation/expr.py", line 310, in visit
    node = ast.fix_missing_locations(ast.parse(clean))
  File "/usr/local/Cellar/python3/3.6.4/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    petal length (cm )<=2.5999999046325684
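The last frames of the traceback show the rule string being passed to ast.parse, so the failure can be reproduced without skope-rules or pandas. The sketch below is only a diagnostic aid; is_parsable is a hypothetical helper, not part of either library, and feature_2 is an illustrative placeholder name.

```python
import ast


def is_parsable(expr):
    """Return True if `expr` is a syntactically valid Python expression."""
    try:
        ast.parse(expr, mode="eval")
        return True
    except SyntaxError:
        return False


# The raw iris feature name contains spaces and parentheses, so the rule
# built from it is not a valid expression.
raw_rule = "petal length (cm) <= 2.6"

# The same rule with an identifier-like placeholder (hypothetical name) parses.
safe_rule = "feature_2 <= 2.6"

print(is_parsable(raw_rule), is_parsable(safe_rule))
```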

Skope Rules should accept any kind of feature name. This means we have to replace the feature names with safe placeholders for the computation and map them back to the original names at the end.
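A minimal sketch of that round trip, assuming placeholders of the form feature_0, feature_1, and so on. The prefix and the helper names make_safe_names and restore_names are hypothetical and chosen for illustration; this is not necessarily how the eventual pull request implements it.

```python
import re

# Hypothetical placeholder prefix; any valid identifier prefix would do.
PREFIX = "feature_"


def make_safe_names(feature_names):
    """Map each original feature name to an identifier-like placeholder."""
    return {name: f"{PREFIX}{i}" for i, name in enumerate(feature_names)}


def restore_names(rule, safe_to_original):
    """Replace placeholders in a rule string with the original names."""
    # Match whole placeholders so feature_1 never rewrites part of feature_10.
    pattern = re.compile(r"\b" + re.escape(PREFIX) + r"\d+\b")
    return pattern.sub(lambda m: safe_to_original[m.group(0)], rule)


names = ["sepal length (cm)", "sepal width (cm)",
         "petal length (cm)", "petal width (cm)"]
to_safe = make_safe_names(names)
to_original = {safe: name for name, safe in to_safe.items()}

# A rule as it would be evaluated internally, using placeholders only.
internal_rule = "feature_2 <= 2.6 and feature_0 > 4.5"

# The same rule as it would be reported back to the user.
print(restore_names(internal_rule, to_original))
```

The data would be given the placeholder column names before rules are evaluated, and the rule strings would be translated back only when they are shown to the user.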

datajms commented 6 years ago

Sure. Can't wait for your PR fixing this issue.

floriangardin commented 6 years ago

I opened a pull request for this issue: #4

floriangardin commented 6 years ago

Pull request merged, now closing.