imoscovitz / wittgenstein

Ruleset covering algorithms for transparent machine learning

MIT License

Predictions bug #22

Open. Schichael opened this issue 2 years ago

Schichael commented 2 years ago

Hello @imoscovitz, I am getting wrong predictions. See the following simple example. According to the rule, all predictions should be True, but only the first prediction is correct. This looks like a bug, or am I doing something wrong? I am using version 0.3.2.

Schichael commented 2 years ago

I looked into the code, and the problem seems to be caused by dropping duplicate rows in base.py, line 128: covered = covered.drop_duplicates(). When several rows have identical values, only the first one remains. The removed samples are then predicted as False. After removing that line I got the expected results.
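The effect described above can be reproduced with pandas alone. This is a minimal sketch, not the library's code: the DataFrame, the filter standing in for a rule, and the step that derives predictions from index membership are all assumptions made for illustration.

```python
import pandas as pd

# Three identical rows that a hypothetical rule "a == 1" should all cover.
df = pd.DataFrame({"a": [1, 1, 1], "b": ["x", "x", "x"]})

# Stand-in for the rows a ruleset covers (assumption: coverage is a row filter).
covered = df[df["a"] == 1]

# Value-based deduplication keeps only the first of the identical rows.
deduped = covered.drop_duplicates()

# Assumed prediction step: a row is True if its index is among the covered rows.
predictions = df.index.isin(deduped.index).tolist()
print(predictions)  # [True, False, False]
```

Because `drop_duplicates()` compares row values rather than index labels, distinct samples that share the same feature values collapse into one.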

Schichael commented 2 years ago

Replacing the original code with the following should fix the problem. I didn't test for execution time.

covered = self.rules[0].covers(df).copy()
for rule in self.rules[1:]:
    rule_df = rule.covers(df)
    # Drop rows with indices from rule_df that are already in covered
    rule_df = rule_df.drop(set.intersection(set(covered.index), set(rule_df.index)))
    covered = pd.concat([covered, rule_df])
return covered
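Since execution time was not measured, here is a possible alternative that avoids repeated `pd.concat` calls by combining boolean masks. It is only a sketch under assumptions: `covered_rows` and `rule_masks` are hypothetical names, and it presumes each rule can be expressed as a boolean Series aligned with `df`, which may not match how the library represents rules.

```python
import pandas as pd

def covered_rows(df, rule_masks):
    """Return rows of df matched by at least one mask, each row at most once."""
    # Start with nothing covered, then OR in each rule's mask.
    combined = pd.Series(False, index=df.index)
    for mask in rule_masks:
        combined = combined | mask
    # Boolean indexing keeps rows by position, so identical rows are not merged.
    return df[combined]

# Rows 0 and 1 are identical and are matched by two overlapping masks.
df = pd.DataFrame({"a": [1, 1, 2, 3], "b": ["x", "x", "y", "z"]})
rule_masks = [df["a"] == 1, df["b"] == "x", df["a"] == 3]

result = covered_rows(df, rule_masks)
print(result.index.tolist())  # [0, 1, 3]
```

Like the loop above, this deduplicates by row identity rather than by row values, so duplicated samples stay covered and a row matched by several rules appears only once.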

gujingit commented 2 years ago

Same issue here. The replacement code posted above by Schichael solves the problem.