Open Berlyli866 opened 1 month ago
Hi @Berlyli866,
From what I can see in the code snippet above, you are not really following the conventions presented in the documentation. Please also have a look at the docs here, besides the Adult Census example we provide.
Some issues that I can see in the code above are:
ranges - are incorrectly specified. An example of a correctly specified ranges dictionary in your case would be:
ranges = {
    'num_image': [0.0, 1.0],
    'num_alternative_image': [-1.0, 0.0],
    'num_market_bullets': [-1.0, 1.0]
}
This means that when generating the counterfactual for a given instance, num_image can only increase, num_alternative_image can only decrease, and num_market_bullets can increase or decrease (you can omit any feature that is allowed to both increase and decrease).
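The convention above can be checked before the explainer is built. The following is a minimal sketch, not library code: check_ranges is a hypothetical helper, and it assumes (based on the examples in this thread) that each lower bound lies in [-1, 0], each upper bound lies in [0, 1], and keys are column names.

```python
def check_ranges(ranges, feature_names):
    """Return the allowed direction of change for each feature in `ranges`.

    Hypothetical helper: raises ValueError on the first malformed entry.
    """
    directions = {}
    for name, bounds in ranges.items():
        # Keys are assumed to be column names, not column indices.
        if name not in feature_names:
            raise ValueError(f"unknown feature name: {name!r}")
        low, high = bounds
        # Assumed convention: low in [-1, 0] and high in [0, 1].
        if not (-1.0 <= low <= 0.0 <= high <= 1.0):
            raise ValueError(f"{name}: expected low in [-1, 0] and high in [0, 1], got {bounds}")
        if low == 0.0 and high == 0.0:
            directions[name] = "fixed"
        elif low == 0.0:
            directions[name] = "increase only"
        elif high == 0.0:
            directions[name] = "decrease only"
        else:
            directions[name] = "increase or decrease"
    return directions


feature_names = ["num_image", "num_alternative_image", "num_market_bullets"]
ranges = {
    "num_image": [0.0, 1.0],
    "num_alternative_image": [-1.0, 0.0],
    "num_market_bullets": [-1.0, 1.0],
}
directions = check_ranges(ranges, feature_names)
```

With this check, passing a column index such as 26 as a key fails early with a ValueError instead of being silently ignored.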
feature_names - should be a list of strings containing the column names of your data frame. The order of the features within feature_names is important: it should match the order of your columns exactly.
immutable_features - should be a list of strings containing column names that are present in the feature_names list. I can't tell exactly what you passed, but I think it is worth mentioning.
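Both requirements (matching column order, and immutable features given as names that exist in feature_names) are easy to verify up front. This is a small sketch under the assumption that the column list comes from something like list(df.columns); check_feature_config is a hypothetical helper and the column order shown is illustrative.

```python
def check_feature_config(columns, feature_names, immutable_features):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    # feature_names must list the columns in exactly the data frame's order.
    if list(columns) != list(feature_names):
        problems.append("feature_names does not match the column order")
    for name in immutable_features:
        if not isinstance(name, str):
            problems.append(f"immutable feature {name!r} is not a string")
        elif name not in feature_names:
            problems.append(f"immutable feature {name!r} is not in feature_names")
    return problems


# Illustrative column order, e.g. obtained from list(df.columns).
columns = ["salient_bullet", "avg_delivery_days", "num_image"]

ok = check_feature_config(columns, columns, ["avg_delivery_days"])
bad = check_feature_config(columns, columns[::-1], [1])
```

Here ok is empty, while bad reports both the reordered feature_names and the integer passed as an immutable feature.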
C - should be a list of dictionaries with strings as keys, where each key is the name of a column. From what I see, you are providing column indices as keys. Note that for numerical features, the conditioning represents a delta change from the original value, and 0 must be included in the interval you specify. For example, C=[{"num_image": [0, 5]}] means that the value of num_image can increase by at most 5 units.
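If you think in terms of absolute bounds rather than deltas, the conversion is a subtraction per feature. The sketch below assumes you know the current value of each feature for the instance being explained; to_delta_condition is a hypothetical helper and the numbers are illustrative.

```python
def to_delta_condition(current, absolute_bounds):
    """Convert absolute [min, max] targets into deltas around the current values."""
    condition = {}
    for name, (lo, hi) in absolute_bounds.items():
        x = current[name]
        # The delta interval must contain 0, so the current value has to lie in [lo, hi].
        if not lo <= x <= hi:
            raise ValueError(
                f"{name}: current value {x} is outside [{lo}, {hi}], "
                "so 0 would not be in the delta interval"
            )
        condition[name] = [lo - x, hi - x]
    return condition


current = {"num_image": 6, "num_market_bullets": 5}            # illustrative instance values
wanted = {"num_image": [4, 17], "num_market_bullets": [5, 22]}  # absolute bounds

C = [to_delta_condition(current, wanted)]
# C == [{"num_image": [-2, 11], "num_market_bullets": [0, 17]}]
```

Note that the resulting dictionary is keyed by column name, and that the deltas depend on the instance, so different instances generally need different intervals.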
You don't necessarily need to train an autoencoder; you can see an example of how to do that here. Please read the final paragraph in this comment.
You should provide a complete description of category_map. So, yes, binary features should be specified as well.
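As a rough illustration of a complete category_map, the sketch below builds one entry per categorical column, binary ones included. It assumes the map is keyed by column index and holds the list of category values, as in the snippets in this thread; build_category_map is a hypothetical helper and the column order is illustrative.

```python
def build_category_map(feature_names, categories):
    """Map column index -> list of category values for every categorical column.

    `categories` maps a column name to its category values; binary columns
    are listed here too, so that no categorical column is left out.
    """
    return {
        feature_names.index(name): list(values)  # raises ValueError for unknown names
        for name, values in categories.items()
    }


# Illustrative column order; in practice this is the full feature_names list.
feature_names = ["salient_bullet", "num_image", "image_life_style", "return_policy_90-Day"]
categories = {
    "salient_bullet": [0, 1],
    "image_life_style": [0, 1],
    "return_policy_90-Day": [0, 1],
}

category_map = build_category_map(feature_names, categories)
# category_map == {0: [0, 1], 2: [0, 1], 3: [0, 1]}
```

Deriving the indices from feature_names keeps the map consistent with the column order and avoids maintaining the indices by hand.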
Although we didn't test explicitly for regression models, I think the implementation still supports it for a single target. It is really up to you how to design the reward function. You can give a sparse or continuous reward based on the distance from your target. The reward function can be specified through the parameter reward_func (see docs here).
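To make the two options concrete, here is a sketch of a sparse and a continuous reward for a single regression target. It only illustrates the shape of the rewards: the (y_pred, y_target) signature, the tol and scale parameters, and the use of plain Python lists are all assumptions, so please check the reward_func documentation for the exact signature and array types expected.

```python
import math


def sparse_reward(y_pred, y_target, tol=0.05):
    """1.0 when the prediction is within `tol` of the target, else 0.0."""
    return [1.0 if abs(p - t) <= tol else 0.0 for p, t in zip(y_pred, y_target)]


def continuous_reward(y_pred, y_target, scale=1.0):
    """Reward in (0, 1] that decays smoothly with the distance from the target."""
    return [math.exp(-abs(p - t) / scale) for p, t in zip(y_pred, y_target)]


y_pred = [0.42, 0.10]    # illustrative predictions for two counterfactuals
y_target = [0.40, 0.40]  # desired regression outputs

sparse = sparse_reward(y_pred, y_target)          # [1.0, 0.0]
continuous = continuous_reward(y_pred, y_target)  # first entry is larger than the second
```

A sparse reward is simpler to reason about but gives no signal until the target is nearly reached, whereas a continuous reward also tells the actor when it is getting closer.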
Hi Robert, thanks for your response and help here!
For C, if I use strings as keys, I get an error, but if I use indices as keys, it runs through. In my code, the index map and C are:
26 num_image
27 num_alternative_image
28 num_market_bullets
2 salient_bullet
12 image_life_style
39 return_policy_90-Day
C = [{26: [4, 17], 27: [0, 6], 28: [5, 22]}]
This means I want num_image to change from 4 to 17, num_alternative_image to change from 0 to 6, and num_market_bullets to change from 5 to 22, while salient_bullet, image_life_style, and return_policy_90-Day can only be 1.
I actually have other features in C, but in the output after I ran explanation = explainer.explain(X, Y_t, C) I did not see any change for the features that I specified in C and that are not in the immutable_features list. Do you know what the reason may be?
Below are my changeable features and their ranges:
change_features = ['num_image', 'num_alternative_image', 'num_market_bullets',
                   'image_life_style', 'return_policy_90-Day', 'salient_bullet']
ranges = {
    'image_life_style': [0.0, 1.0],
    'return_policy_90-Day': [0.0, 1.0],
    'salient_bullet': [0.0, 1.0]
}
image_life_style, return_policy_90-Day, and salient_bullet are all binary features. I want them to change only from 0 to 1; if the value is already 1, it should not change. Did I specify the ranges correctly? The other changeable features can increase or decrease, so I omitted them here.
The category_map=cate_map in my code contains the indices of all binary features and their values [0, 1], so something like:
{0: [0, 1],
2: [1, 0],
6: [1, 0],
7: [0, 1]}
0, 2, 6, and 7 are the feature indices. Now I know that binary features also need to be specified. Thanks.
Hi team, first of all, thanks for building such a good package for us to use.
I followed the example Counterfactual with Reinforcement Learning (CFRL) on Adult Census to build my own CFRL explainer.
I have a data set that is a mix of numerical, binary, and categorical features. I trained a random forest classification model as the predictor and ran CounterfactualRLTabular to generate counterfactuals for the features that I am interested in. Below is part of the code showing how I specify the candidate features and immutable features. After I got the counterfactual df, I compared it with the original df and got the difference columns below. avg_delivery_days is immutable but still changes, though only by a tiny amount; for 'num_image', 'num_alternative_image', and 'num_market_bullets' the change is also minimal. Can I say that the changed features play an important role in predicting the label (>0.4 or <=0.4), since a small change flips the label? Did I use the right counterfactual function for my use case?
For tabular data, do I always need an encoder and decoder? If a feature is already binary, should I put it in category_map in the function below?
heae_preprocessor, heae_inv_preprocessor = get_he_preprocessor(X=X_train, feature_names=model_attr, category_map=cate_map, feature_types=feature_types)
Another question I have is what I can use when the environment model is a boosted regression or another regression-type black-box model. If I use the same explainer but replace the predictor with the boosted regression model, what other changes do I need to make? Since the regression model's prediction is continuous, how can I customize the reward function?
Sorry for all these questions; I am a beginner in RL and still learning everything, so forgive me if my questions sound dumb.
Thanks for your time and help.