Open finnschwall opened 2 years ago
@finnschwall From the error description, it looks like the error might be related to your dataset.
Have you tried the dataset with a standard sklearn model (e.g., LinearRegression) to check whether the code works? That will help determine whether the problem is due to the custom model or to the data.
Also, in your code, how is m defined? I don't see it defined after line 2.
From a quick glance, the structure of CustomModel looks okay, but there might be a typo in get_num_output_nodes: why does it return .data? It just needs to return the number of nodes.
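To illustrate the point about get_num_output_nodes, here is a minimal sketch of a helper that returns only a count. It is written in plain Python with a stand-in model; count_output_nodes and toy_forward are hypothetical names, not part of DiCE or PyTorch, and a real PyTorch version would probe the network with a tensor instead of a list.

```python
from typing import Callable, Sequence


def count_output_nodes(
    forward: Callable[[Sequence[float]], Sequence[float]],
    num_inputs: int,
) -> int:
    """Return how many values the model emits for a single input row."""
    probe = [0.0] * num_inputs  # dummy row; only its length matters
    output = forward(probe)
    # Return a plain integer count, not the output values themselves.
    return len(output)


def toy_forward(row: Sequence[float]) -> Sequence[float]:
    # Stand-in for a two-class model (assumption for this demo only).
    score = sum(row)
    return [1.0 - score, score]


num_nodes = count_output_nodes(toy_forward, num_inputs=5)
```

The key design choice is that the caller receives an int it can use for sizing decisions, rather than the raw model output.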
@finnschwall, the model that you are wrapping is still a PyTorch model, so the genetic method may not apply to it. As Amit mentioned, could you paste the line of code showing how you instantiate the model?
Regards,
@amit-sharma Thanks for the quick reply. Testing with a random forest classifier did indeed reveal an error in my data, but I still haven't had any success. The prediction now works with the RFC but still not with my model: it says it is missing a "prediction function". Is there a way to do this using only the get_output function, without a prediction function in the underlying model? Here is the relevant code:
import ModelClass
m = ModelClass.CustomModel(model, backend={"model": "ModelClass.CustomModel", "explainer": "dice_genetic.DiceGenetic" })
d = dice_ml.Data(dataframe=dataset, continuous_features=["city_development_index",
"experience","company_size","last_new_job","training_hours"], outcome_name='target')
exp_genetic = dice_ml.Dice(d, m, method='genetic')
dice_exp_genetic = exp_genetic.generate_counterfactuals(query_instances, total_CFs=1, desired_class="opposite")
and the error
AttributeError Traceback (most recent call last)
/tmp/ipykernel_4480/419216923.py in <module>
2 "experience","company_size","last_new_job","training_hours"], outcome_name='target')
3 exp_genetic = dice_ml.Dice(d, m, method='genetic')
----> 4 dice_exp_genetic = exp_genetic.generate_counterfactuals(query_instances, total_CFs=1, desired_class="opposite")
5 dice_exp_genetic.visualize_as_dataframe(show_only_changes=True)
....
~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
1128 if name in modules:
1129 return modules[name]
-> 1130 raise AttributeError("'{}' object has no attribute '{}'".format(
1131 type(self).__name__, name))
1132
AttributeError: 'NeuralNetwork' object has no attribute 'predict'
Additionally, I noticed some behavior that I find weird. If I use the provided PyTorch interface with
m = dice_ml.Model(model=model, backend="PYT")
it does not throw any errors but also doesn't find any CFs. For the same dataset and the same configuration, however, the RF classifier does yield CFs.
@finnschwall you are almost there! If I understand correctly, you just need to replace the line below,
m = dice_ml.Model(model=model, backend="sklearn")
with a custom Model that extends BaseModel:
m = dice_ml.MyModel(model=pytorchmodel, backend="sklearn")
and then replace the get_output method. I see that you had already done that in your first post, so that should have worked.
Can you share the full stack trace of the error?
On your second question: using backend="PYT" does two things. First, it uses the built-in PyTorch model interface, and second, it defaults to a gradient-based algorithm for CFs, since that is better suited to models with gradients. So if you specify backend as PYT, DiCE defaults to the gradient-based method and ignores the genetic method. In general, genetic tries to approximate the gradient, so it is best to use the gradients if they are available. But then perhaps it makes sense to allow the genetic algorithm too, for full flexibility. In any case, since CFs depend on both the trained model and the data, it is possible for the algorithm to find CFs for the RFC but not for the PyTorch model. That usually means that you may need to change hyperparameters for the algorithm, as in this notebook.
Thanks for the clarification on the backend parameter. Here is the full error stack trace:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_3114/1246876773.py in <module>
8 exp_genetic = dice_ml.Dice(d, m_own, method='genetic')
9
---> 10 dice_exp_genetic = exp_genetic.generate_counterfactuals(query_instances, total_CFs=5, desired_class="opposite")
11 dice_exp_genetic.visualize_as_dataframe(show_only_changes=True)
~/.local/lib/python3.9/site-packages/dice_ml/explainer_interfaces/explainer_base.py in generate_counterfactuals(self, query_instances, total_CFs, desired_class, desired_range, permitted_range, features_to_vary, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm, verbose, **kwargs)
90 for query_instance in tqdm(query_instances_list):
91 self.data_interface.set_continuous_feature_indexes(query_instance)
---> 92 res = self._generate_counterfactuals(
93 query_instance, total_CFs,
94 desired_class=desired_class,
~/.local/lib/python3.9/site-packages/dice_ml/explainer_interfaces/dice_genetic.py in _generate_counterfactuals(self, query_instance, total_CFs, initialization, desired_range, desired_class, proximity_weight, sparsity_weight, diversity_weight, categorical_penalty, algorithm, features_to_vary, permitted_range, yloss_type, diversity_loss_type, feature_weights, stopping_threshold, posthoc_sparsity_param, posthoc_sparsity_algorithm, maxiterations, thresh, verbose)
284 query_instance_df_dummies[col] = 0
285
--> 286 self.do_param_initializations(total_CFs, initialization, desired_range, desired_class, query_instance,
287 query_instance_df_dummies, algorithm, features_to_vary, permitted_range,
288 yloss_type, diversity_loss_type, feature_weights, proximity_weight,
~/.local/lib/python3.9/site-packages/dice_ml/explainer_interfaces/dice_genetic.py in do_param_initializations(self, total_CFs, initialization, desired_range, desired_class, query_instance, query_instance_df_dummies, algorithm, features_to_vary, permitted_range, yloss_type, diversity_loss_type, feature_weights, proximity_weight, sparsity_weight, diversity_weight, categorical_penalty, verbose)
204 self.feature_range = self.get_valid_feature_range(normalized=False)
205 if len(self.cfs) != total_CFs:
--> 206 self.do_cf_initializations(
207 total_CFs, initialization, algorithm, features_to_vary, desired_range, desired_class,
208 query_instance, query_instance_df_dummies, verbose)
~/.local/lib/python3.9/site-packages/dice_ml/explainer_interfaces/dice_genetic.py in do_cf_initializations(self, total_CFs, initialization, algorithm, features_to_vary, desired_range, desired_class, query_instance, query_instance_df_dummies, verbose)
180 # Partitioned dataset and KD Tree for each class (binary) of the dataset
181 self.dataset_with_predictions, self.KD_tree, self.predictions = \
--> 182 self.build_KD_tree(self.data_interface.data_df.copy(),
183 desired_range, desired_class, self.predicted_outcome_name)
184 if self.KD_tree is None:
~/.local/lib/python3.9/site-packages/dice_ml/explainer_interfaces/explainer_base.py in build_KD_tree(self, data_df_copy, desired_range, desired_class, predicted_outcome_name)
657 query_instance=data_df_copy[self.data_interface.feature_names])
658
--> 659 predictions = self.model.model.predict(dataset_instance)
660 # TODO: Is it okay to insert a column in the original dataframe with the predicted outcome? This is memory-efficient
661 data_df_copy[predicted_outcome_name] = predictions
~/.local/lib/python3.9/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
1128 if name in modules:
1129 return modules[name]
-> 1130 raise AttributeError("'{}' object has no attribute '{}'".format(
1131 type(self).__name__, name))
1132
AttributeError: 'NeuralNetwork' object has no attribute 'predict'
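Reading the trace above, the last DiCE frame (explainer_base.py, line 659) calls self.model.model.predict(...), that is, predict on the raw wrapped object rather than get_output on the wrapper. One possible workaround, sketched here under assumptions, is to hand DiCE an object that exposes a predict method and delegates to the network. PredictAdapter and toy_network below are hypothetical names used only for illustration; they are not part of DiCE or PyTorch, and the toy function stands in for a real forward pass.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


class PredictAdapter:
    """Give a model that is only callable a scikit-learn-like predict API.

    Hypothetical helper for illustration; not part of DiCE or PyTorch.
    """

    def __init__(self, forward: Callable[[Row], float], threshold: float = 0.5):
        self.forward = forward      # maps one feature row to P(class 1)
        self.threshold = threshold  # cut-off used for the hard class label

    def predict_proba(self, rows: Sequence[Row]) -> List[List[float]]:
        # One [P(class 0), P(class 1)] pair per input row.
        probs = [float(self.forward(row)) for row in rows]
        return [[1.0 - p, p] for p in probs]

    def predict(self, rows: Sequence[Row]) -> List[int]:
        # Hard labels, which is what a call like model.predict(X) expects.
        return [int(p1 >= self.threshold) for _, p1 in self.predict_proba(rows)]


def toy_network(row: Row) -> float:
    # Stand-in for a trained network's forward pass (assumption for the demo).
    return 1.0 if sum(row) > 1.0 else 0.0


adapter = PredictAdapter(toy_network)
labels = adapter.predict([[0.2, 0.3], [0.9, 0.8]])
```

With a real PyTorch network, forward would need to convert the input to a tensor and run the model without gradient tracking, and the row iteration would have to match whatever structure DiCE actually passes to predict (this sketch assumes plain lists of floats).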
Hello, I am trying to use DiCE for truly model-independent counterfactual generation, but I am unable to create my own model interface. Could you add a tutorial or give a short description of how to do this? Or help me with my implementation?
I tried writing a model interface which acts as a wrapper for PyTorch, since I already know this model works in theory. The code is mostly copied from the PyTorch interface. If desired, I can upload my entire code.
But executing my code with this model gives
.....