SeldonIO / alibi

Algorithms for explaining machine learning models
https://docs.seldon.io/projects/alibi/en/stable/

immutable feature changes when using #1022

Open Berlyli866 opened 1 month ago

Berlyli866 commented 1 month ago

Hi team, first of all, thanks for building such a good package for us to use.

I followed the example Counterfactual with Reinforcement Learning (CFRL) on Adult Census to build my own CFRL explainer.

I have a data set that is a mix of numerical, binary, and categorical features. I trained a random forest classification model as the predictor and ran CounterfactualRLTabular to generate counterfactuals for the features I am interested in. Below is part of the code showing how I specify the candidate features and the immutable features:


ranges = {'num_image': [1, 16],
          'num_alternative_image': [0, 6],
          'num_market_bullets': [5, 19]
         }

from alibi.explainers import CounterfactualRLTabular
explainer = CounterfactualRLTabular(predictor=predictor,
                                    encoder=heae.encoder,
                                    decoder=heae.decoder,
                                    latent_dim=LATENT_DIM,
                                    encoder_preprocessor=heae_preprocessor,
                                    decoder_inv_preprocessor=heae_inv_preprocessor,
                                    coeff_sparsity=COEFF_SPARSITY,
                                    coeff_consistency=COEFF_CONSISTENCY,
                                    category_map=cate_map,
                                    feature_names=model_attr,
                                    #ranges=ranges,
                                    immutable_features=immutable_features,
                                    train_steps=TRAIN_STEPS,
                                    batch_size=BATCH_SIZE,
                                    backend="tensorflow")

explainer = explainer.fit(X=X_train.to_numpy())

X_positive = X_test[np.argmax(predictor(X_test), axis=1) == 1]
X = X_positive[:1000]
Y_t = np.array([0])
# index 20: num_image, 21: num_alternative_image, 22: num_market_bullets. If I use feature names as keys I get an error somehow.
C = [{20: [1, 10], 21: [0, 6], 22: [5, 10]}]
explanation = explainer.explain(X, Y_t, C)

After I got the counterfactual df I compared it with the original df, and the columns below differ. avg_delivery_days is immutable but still changes (though the change is very tiny), and for 'num_image', 'num_alternative_image', and 'num_market_bullets' the change is also minimal. Can I conclude that the changed features play an important role in predicting the label (>0.4 or <=0.4), since a small change flips the label? Did I use the right counterfactual function for my use case?

[Screenshot 2024-10-19 18:19:57: columns that differ between the original and counterfactual dataframes]
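For context, this is roughly how I build the comparison above (a sketch, assuming the Explanation object stores the original and counterfactual instances under data['orig']['X'] and data['cf']['X'] as in the CFRL examples):

import pandas as pd

# Rebuild dataframes from the explanation output (assumed layout:
# explanation.data['orig']['X'] and explanation.data['cf']['X']).
orig_df = pd.DataFrame(explanation.data['orig']['X'], columns=model_attr)
cf_df = pd.DataFrame(explanation.data['cf']['X'], columns=model_attr)

# Columns where at least one row differs between original and counterfactual.
diff_cols = [c for c in model_attr if not (orig_df[c] == cf_df[c]).all()]
print(diff_cols)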

For tabular data, do I always need an encoder and decoder? If a feature is already binary, should I put it in category_map in the function below?

heae_preprocessor, heae_inv_preprocessor = get_he_preprocessor(X=X_train, feature_names=model_attr, category_map=cate_map, feature_types=feature_types)
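For context, the other inputs look roughly like this (a simplified sketch of my own dicts; the names and indices are placeholders):

# feature_types: numerical feature name -> type (as in the Adult example),
# e.g. so count features come back as ints after inverse preprocessing.
feature_types = {'num_image': int,
                 'num_alternative_image': int,
                 'num_market_bullets': int}

# cate_map: categorical column index -> list of values that column can take
# (here only binary flags, which is exactly what my question above is about).
cate_map = {0: [0, 1],
            7: [0, 1]}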

Another question: what can I use for the environment model, such as a boosted regression or another regression-type black-box model? Suppose I use

explainer = CounterfactualRLTabular(predictor=predictor,
                                    encoder=heae.encoder,
                                    decoder=heae.decoder,
                                    latent_dim=LATENT_DIM,
                                    encoder_preprocessor=heae_preprocessor,
                                    decoder_inv_preprocessor=heae_inv_preprocessor,
                                    coeff_sparsity=COEFF_SPARSITY,
                                    coeff_consistency=COEFF_CONSISTENCY,
                                    category_map=cate_map,
                                    feature_names=model_attr,
                                    #ranges=ranges,
                                    immutable_features=immutable_features,
                                    train_steps=TRAIN_STEPS,
                                    batch_size=BATCH_SIZE,
                                    backend="tensorflow")

but replace predictor with the boosted regression model. What other changes do I need to make? Since a regression model's prediction is continuous, how can I customize the reward function?

Sorry for all these questions; I am a beginner in RL and am still learning everything, so forgive me if my questions sound dumb.

Thanks for your time and help!

RobertSamoilescu commented 1 month ago

Hi @Berlyli866,

From what I can see in the code snippet above, you are not really following the conventions presented in the documentation. Please also have a look at the docs here, besides the Adult Census example we provide.

Some issues that I can see in the code above are:

The values in ranges are interpreted as allowed directions of change, not as raw feature bounds. This means that, with a specification along the lines of the sketch below, when generating the counterfactual for a given instance num_image can only increase, num_alternative_image can only decrease, and num_market_bullets can increase or decrease (you can actually omit specifying a feature which can increase or decrease).
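A sketch of such a ranges specification (per the docs, the first element should be non-positive and the second non-negative; the values only encode the allowed direction of change):

ranges = {'num_image': [0.0, 1.0],               # can only increase
          'num_alternative_image': [-1.0, 0.0],  # can only decrease
          # 'num_market_bullets' omitted: it can increase or decrease
         }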

You don't necessarily need to train an autoencoder. You can see an example here of how to do that. Please read the final paragraph in this comment.

You should provide a complete description of category_map. So, yes, binary features should be specified as well.
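For example, a rough sketch (hypothetical column indices; each entry maps a categorical column index to the list of values it can take, binary columns included):

category_map = {0: [0, 1],             # binary flag
                7: [0, 1],             # another binary flag
                12: ['A', 'B', 'C']}   # multi-level categorical feature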

Although we didn't test explicitly for regression models, I think the implementation still supports it for a single target. It is really up to you how to design the reward function; you can give a sparse or a continuous reward based on the distance from your target. The reward function can be specified through the parameter reward_func (see docs here).
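For example, a minimal sketch of a continuous reward (assuming the callable receives the predictor output on the candidate counterfactuals and the target, and returns one reward per instance; please check the reward_func documentation for the exact signature):

import numpy as np

# Reward is 1 when the prediction hits the target exactly and decays smoothly
# with the absolute distance from it.
def regression_reward(Y_pred_cf: np.ndarray, Y_t: np.ndarray) -> np.ndarray:
    return np.exp(-np.abs(Y_pred_cf.reshape(-1) - Y_t.reshape(-1)))

# e.g. CounterfactualRLTabular(..., reward_func=regression_reward)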

Berlyli866 commented 1 month ago

Hi Robert, Thanks for your response and help here !

For C, if I use strings as keys I will get an error, but if I use the index keys it runs through. In my code the C is:

index map : 
26 num_image
27 num_alternative_image
28 num_market_bullets
2 salient_bullet
12 image_life_style
39 return_policy_90-Day

C = [{26: [4, 17], 27: [0, 6], 28: [5, 22]}]

which means I want num_image to change within [4, 17], num_alternative_image within [0, 6], and num_market_bullets within [5, 22]; salient_bullet can only be 1, image_life_style can only be 1, and return_policy_90-Day can only be 1.

I actually have other features in C, but in the output after I ran explanation = explainer.explain(X, Y_t, C) I did not see any change in the features that I specified in C and that are not in the immutable_features list. Do you know what the reason may be? Below are my changeable features and their ranges:

change_features = ['num_image',
                   'num_alternative_image',
                   'num_market_bullets',
                   'image_life_style',
                   'return_policy_90-Day',
                   'salient_bullet']

ranges = {'image_life_style': [0.0, 1.0],
          'return_policy_90-Day': [0.0, 1.0],
          'salient_bullet': [0.0, 1.0]
         }

image_life_style, return_policy_90-Day, and salient_bullet are all binary features. I want them to change only from 0 to 1; if the value is already 1, don't change it. Did I specify the ranges correctly? The other changeable features can increase or decrease, so I omit them here.

The category_map=cate_map in my code contains the indices of all the binary features and their values [0, 1], so it looks something like:

{0: [0, 1],
 2: [1, 0],
 6: [1, 0],
 7: [0, 1]}

0, 2, 6, 7 are the feature indices. Now I know that binary features need to be specified there as well. Thanks.