Closed: slebastard closed this 3 years ago
May 11th, 2021 updates:

- moved `explain()` to the new `predictsignauxfaibles.explainability` module
- added a column `macro_radar` containing scores between 0 (very bad situation, high risk) and 1 (very good situation, low risk), used to feed a radar plot in the front-end
- added a column `expl_selection` containing two lists, `select_concerning` and `select_reassuring`, listing the variables with a substantially positive/bad (respectively negative/good) contribution to the risk score. This selection is based on a statistical criterion on the maximal hypothetical contribution of a group of features to the global score
- fixed a bug in `models/default/model_conf.py` and `models/small/model_conf.py` in the definition of `FEATURES` from `FEATURE_GROUPS`

@vviers could you please take one last look at the modifications above? Here is a theoretical resource to understand how radar scores were computed:
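For intuition, here is a minimal sketch of how a signed risk contribution could be squashed into a 0-to-1 radar score. The actual criterion lives in `predictsignauxfaibles.explainability`; the function name and the `scale` parameter below are purely illustrative:

```python
import numpy as np

def radar_score(contribution: float, scale: float = 1.0) -> float:
    """Map a signed log-odds contribution to a score in (0, 1).

    Positive contributions raise the failure risk, so they should map
    near 0 (bad situation); negative contributions map near 1 (good
    situation). A logistic squashing of the negated contribution does
    this. `scale` is a hypothetical tuning knob, not taken from the repo.
    """
    return 1.0 / (1.0 + np.exp(contribution / scale))

# Neutral contribution sits exactly in the middle of the radar axis:
print(radar_score(0.0))  # 0.5
```

By construction, a strongly risk-increasing contribution (e.g. `+3.0` log-odds) yields a score close to 0, and a strongly risk-decreasing one (`-3.0`) a score close to 1, which matches the 0 = high risk / 1 = low risk convention described above.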
Thanks for your review! I'll take care of the conflict on `utils.py`.
Added a function `explain(sf_dataset: SFDataset, conf: ModuleType)` that explains the relative contribution of each input feature to the logistic regression declared in `conf.MODEL_PIPELINE`. The function takes as input a `SFDataset` containing the dataset predictions were run on, and outputs a `SFDataset` with two additional columns describing the contributions of the feature groups defined in `conf.FEATURE_GROUPS` to the failure risk score.

Note: in the future, adding a test of `explain()` to our e2e testing may be interesting:
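To illustrate the idea behind `explain()` on a plain logistic regression: each feature's contribution to the log-odds of failure is its coefficient times its value, and contributions can be summed per feature group and thresholded into concerning/reassuring selections. The real implementation operates on `conf.MODEL_PIPELINE` and a `SFDataset`; every coefficient, feature name, group, and threshold below is made up for the sketch:

```python
import numpy as np

# Hypothetical fitted logistic-regression weights and feature grouping.
FEATURES = ["debt_ratio", "headcount_drop", "sector_health"]
COEFS = np.array([1.2, 0.8, -1.5])  # log-odds weights, invented for the example
FEATURE_GROUPS = {
    "finance": ["debt_ratio"],
    "workforce": ["headcount_drop"],
    "environment": ["sector_health"],
}

def feature_contributions(x: np.ndarray) -> dict:
    """Per-feature contribution to the log-odds of failure: coef * value."""
    return dict(zip(FEATURES, COEFS * x))

def group_contributions(x: np.ndarray) -> dict:
    """Sum feature contributions within each group of FEATURE_GROUPS."""
    per_feat = feature_contributions(x)
    return {g: sum(per_feat[f] for f in feats)
            for g, feats in FEATURE_GROUPS.items()}

def select(per_feat: dict, threshold: float = 0.5):
    """Split features into concerning (raise risk) vs reassuring (lower risk)."""
    concerning = [f for f, c in per_feat.items() if c > threshold]
    reassuring = [f for f, c in per_feat.items() if c < -threshold]
    return concerning, reassuring

# One hypothetical company, features already scaled.
x = np.array([1.0, 0.2, 0.9])
contribs = feature_contributions(x)
print(group_contributions(x))
print(select(contribs))
```

For this sample, `debt_ratio` contributes `+1.2` (concerning) and `sector_health` contributes `-1.35` (reassuring), while `headcount_drop` falls below the threshold and is selected in neither list.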