Open Schichael opened 2 years ago
I looked into the code, and the problem seems to be caused by dropping duplicate rows in base.py, line 128: covered = covered.drop_duplicates(). Because drop_duplicates() compares row values rather than index labels, only the first of several identical rows remains. The removed samples are then predicted as False. After removing that line I got the expected results.
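To illustrate the effect described above, here is a minimal sketch using plain pandas. It is not the library's actual code, and the DataFrame contents are made up for the example.

```python
import pandas as pd

# Hypothetical data: three distinct samples that happen to share the same feature values.
df = pd.DataFrame({"feature": ["a", "a", "a"]})

# drop_duplicates() compares row values, not index labels,
# so the three identical rows collapse into the first one.
covered = df.drop_duplicates()

print(len(df))               # 3
print(len(covered))          # 1
print(list(covered.index))   # [0]
```

If predictions are derived from which index labels appear in covered, rows 1 and 2 would no longer count as covered, which matches the symptom reported in this issue.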
Replacing the original code with the following should fix the problem. I didn't test for execution time.
    covered = self.rules[0].covers(df).copy()
    for rule in self.rules[1:]:
        rule_df = rule.covers(df)
        # Drop rows with indices from rule_df that are already in covered
        rule_df = rule_df.drop(set.intersection(set(covered.index), set(rule_df.index)))
        covered = pd.concat([covered, rule_df])
    return covered
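The same index-based idea can be tried outside the library. The sketch below is standalone and makes some assumptions: union_by_index is a hypothetical helper, the two boolean selections stand in for rule.covers(df), and the DataFrame index is assumed to be unique.

```python
import pandas as pd

def union_by_index(frames):
    """Union of row subsets of one DataFrame, keeping each index label once."""
    covered = frames[0].copy()
    for frame in frames[1:]:
        # Keep only rows whose index label is not already in covered.
        new_rows = frame.loc[~frame.index.isin(covered.index)]
        covered = pd.concat([covered, new_rows])
    return covered

# Hypothetical data: rows 0-2 have identical values but distinct index labels.
df = pd.DataFrame({"feature": ["a", "a", "a", "b"]})
rule_1 = df[df["feature"] == "a"]   # stand-in for a rule covering rows 0, 1, 2
rule_2 = df[df.index >= 2]          # stand-in for a rule covering rows 2, 3

covered = union_by_index([rule_1, rule_2])
print(list(covered.index))  # [0, 1, 2, 3]
```

Filtering with Index.isin avoids building Python sets for every rule, which may help with execution time on large inputs, though I have not measured it.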
Same issue. This method can solve the problem.
Hello @imoscovitz, I am getting wrong predictions. See the following simple example. According to the rule, all predictions should be True, but only the first prediction is correct. This seems like a bug. Or am I doing something wrong? I am using version 0.3.2.