scikit-learn-contrib / skope-rules

machine learning with logical rules in Python
http://skope-rules.readthedocs.io

Newfeatures #42

Closed arplas closed 3 weeks ago

arplas commented 3 years ago

Hi!

I've added some new features to skope-rules:

1) New filtering and deduplication criteria

deduplication_criterion: str, optional (default='f1')
    The criterion to be used for deduplicating the rules.
    Either 'f1', 'mcc' or 'myfunc'.

"myfunc" is another new parameter of the SkopeRules class:

myfunc: FunctionType, optional (default=None)
    A custom function that can be used as a filtering criterion,
    a deduplication criterion, or both.
    It must take 4 parameters, the confusion matrix elements
    tn, fp, fn, tp (in that order).
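
As an illustration only (this is not code from the PR), a function with the four-argument signature described above might look like the sketch below. The function names are hypothetical, and how the function is actually passed to the estimator is not shown here; the sketch only demonstrates computing a score from tn, fp, fn, tp in that order.

```python
import math


def f1_from_confusion(tn, fp, fn, tp):
    """F1 score from confusion matrix counts (hypothetical helper)."""
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return 2 * tp / denom if denom else 0.0


def mcc_from_confusion(tn, fp, fn, tp):
    """Matthews correlation coefficient from confusion matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def balanced_accuracy(tn, fp, fn, tp):
    """Example of a custom criterion: mean of recall and specificity."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall + specificity) / 2
```

Any callable with this signature that returns a single number, where higher means better, would fit the description; `balanced_accuracy` is just one possible choice.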

2) Rounding of rule feature values

Approximate the values of the features of a rule
according to a dict containing the features to round.

Arguments:
    rule: Rule
        Tuple containing the rule string and confusion matrix

    features_to_round: dict
        A dict containing the names of the quantitative features
        whose values are to be rounded.
        Should be in the form {var_name: power_of_ten_exponent}
        with types {str: int/float}.
        The power_of_ten_exponent should be an int, positive or
        negative (a float is rounded to the nearest int).
        It is the exponent of the power of ten to which the value
        of the feature is rounded.

        example:
            - power_of_ten_exponent = 1 => 1357.914 becomes 1.36E+3
            - power_of_ten_exponent = 0 => 1357.914 becomes 1358
            - power_of_ten_exponent = -2 => 1357.914 becomes 1357.91

Returns:
    rounded rule: str
        A string representation of the rule with rounded feature values.
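
The rounding described above can be sketched roughly as follows. This is not the PR's implementation: the function names are hypothetical, and the sketch assumes rule strings of the form `feature operator value` joined by ` and `. It also prints whole numbers plainly (1360) rather than in the scientific notation used in the example (1.36E+3), which denotes the same value.

```python
def round_to_power_of_ten(value, exponent):
    """Round value to the nearest multiple of 10**exponent, as a string."""
    exponent = int(round(exponent))  # a float exponent is rounded to an int
    rounded = round(value, -exponent)
    if exponent >= 0:
        return str(int(rounded))
    # Negative exponent: keep exactly -exponent decimal places.
    return f"{rounded:.{-exponent}f}"


def round_rule_sketch(rule, features_to_round):
    """Return the rule string with the listed features' thresholds rounded.

    rule: tuple whose first element is the rule string (assumed format:
        "feature operator value" terms joined by " and ").
    features_to_round: dict {var_name: power_of_ten_exponent}.
    """
    rule_string = rule[0]
    terms = []
    for term in rule_string.split(" and "):
        # Split from the right so feature names containing spaces survive.
        feature, operator, value = term.rsplit(" ", 2)
        if feature in features_to_round:
            value = round_to_power_of_ten(float(value), features_to_round[feature])
        terms.append(f"{feature} {operator} {value}")
    return " and ".join(terms)
```

For instance, calling `round_rule_sketch` on a rule whose string is `"income > 1357.914 and age <= 42.5"` with `{"income": 1}` rounds only the `income` threshold and leaves `age` untouched.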

Some issues that have been raised (but not merged) are also addressed, such as:

ngoix commented 3 years ago

thanks for the contribution, all of this looks very exciting and very useful

could you please split this PR into different ones, so that each of them addresses a specific enhancement or fix?