Hello. 👋 I have no issues with the BRL classifier when I'm using datasets with all numeric or all categorical features. But when I use a dataset with both, I get the following error. The categorical feature in this dataset has already been one-hot encoded, and I'm passing those columns to the "undiscretized_features" parameter, but it looks like they're being discretized again anyway?
```
KeyError                                  Traceback (most recent call last)
<ipython-input-13-a701282e6a2c> in <module>
      1 cls = BayesianRuleListClassifier()
----> 2 cls.fit(X.values, Y, feature_names = X.columns, undiscretized_features = ["X1_N", "X1_Y"])

C:\ProgramData\Anaconda3\lib\site-packages\imodels\rule_list\bayesian_rule_list\bayesian_rule_list.py in fit(self, X, y, feature_names, undiscretized_features, verbose)
    204         rule_strs = itemsets_to_rules(self.final_itemsets)
    205         self.rules_without_feature_names_ = [Rule(r) for r in rule_strs]
--> 206         self.rules_ = [
    207             replace_feature_name(rule, self.feature_dict_) for rule in self.rules_without_feature_names_
    208         ]

C:\ProgramData\Anaconda3\lib\site-packages\imodels\rule_list\bayesian_rule_list\bayesian_rule_list.py in <listcomp>(.0)
    205         self.rules_without_feature_names_ = [Rule(r) for r in rule_strs]
    206         self.rules_ = [
--> 207             replace_feature_name(rule, self.feature_dict_) for rule in self.rules_without_feature_names_
    208         ]
    209

C:\ProgramData\Anaconda3\lib\site-packages\imodels\util\rule.py in replace_feature_name(rule, replace_dict)
     74     replaced_agg_dict = {}
     75     for feature, symbol in rule_replaced.agg_dict:
---> 76         replaced_agg_dict[(replace_dict[feature], symbol)] = rule_replaced.agg_dict[(feature, symbol)]
     77         rule_replaced.agg_dict = replaced_agg_dict
     78     return rule_replaced

KeyError: 'X_0_0.0'
```
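For reference, here's a minimal sketch of the setup that produces the call above. The data values are made up; only the column names (one numeric column plus the one-hot columns "X1_N"/"X1_Y") and the `fit` call match my actual usage:

```python
import pandas as pd

# Toy data: one numeric feature plus a categorical feature "X1"
df = pd.DataFrame({
    "X0": [0.1, 0.5, 0.9, 0.3],
    "X1": ["N", "Y", "N", "Y"],
})
Y = [0, 1, 0, 1]

# One-hot encode the categorical column; the numeric column stays as-is
X = pd.get_dummies(df, columns=["X1"])
print(list(X.columns))  # ['X0', 'X1_N', 'X1_Y']

# The call that raises the KeyError (requires imodels):
# from imodels import BayesianRuleListClassifier
# cls = BayesianRuleListClassifier()
# cls.fit(X.values, Y, feature_names=X.columns,
#         undiscretized_features=["X1_N", "X1_Y"])
```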