On the refactoring side, this PR:
applies some minor refactoring to the model configuration for model `small`
moves method `evaluate` from `__main__.py` to module `evaluate.py`
renames the model run folders so that the model name is used, for easier retrieval
introduces new CLI option `save_model` to save each component of the model pipeline as a pickle file named `model_comp{comp_id}.pickle`
introduces CLI option `predict_siret_list` to specify a filepath containing a list of SIRETs. Dataset `predict` is then restricted to the SIRETs in this list
introduces CLI option `explain` to specify whether the fields containing the explanations of model predictions should be computed. This option has a `store_true` behaviour, so by default the explanation fields are not computed.
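As an illustration of how the three options above could fit together, here is a minimal sketch. Only the option names and the `model_comp{comp_id}.pickle` file name come from this PR; the `--` flag spelling, the `store_true` behaviour of `save_model`, the one-SIRET-per-line file format, the `siret` key, and the helper names `build_parser`, `read_siret_list`, `restrict_to_sirets` and `save_components` are assumptions made for the example, not the actual implementation.

```python
import argparse
import pickle
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Flag spellings are assumed; only the option names appear in the PR.
    parser = argparse.ArgumentParser(description="Hypothetical model run CLI")
    parser.add_argument(
        "--save_model",
        action="store_true",  # assumption: treated as a boolean flag
        help="Pickle each component of the model pipeline.",
    )
    parser.add_argument(
        "--predict_siret_list",
        type=Path,
        default=None,
        help="File listing the SIRETs to keep in the predict dataset.",
    )
    parser.add_argument(
        "--explain",
        action="store_true",  # explanations are off unless the flag is given
        help="Compute the explanation fields of the predictions.",
    )
    return parser


def read_siret_list(path):
    # Assumption: one SIRET per line, blank lines ignored.
    lines = Path(path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def restrict_to_sirets(rows, sirets):
    # Keep only the rows whose (assumed) "siret" key is in the list.
    return [row for row in rows if row["siret"] in sirets]


def save_components(components, out_dir):
    # Write one pickle per component, named model_comp{comp_id}.pickle.
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for comp_id, component in enumerate(components):
        path = out_dir / f"model_comp{comp_id}.pickle"
        with path.open("wb") as handle:
            pickle.dump(component, handle)
        paths.append(path)
    return paths
```

With this sketch, running without any flag leaves `explain` and `save_model` set to `False` and `predict_siret_list` set to `None`, which matches the default described above for `explain`.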
Moreover, the PR introduces tools to select model thresholds from the predicted failure risk, making it possible to build lists of selected companies with a few intuitive functions. It:
introduces methods to select thresholds from various criteria in `evaluate.py`
introduces notebook `03-select-thresholds.ipynb` to leverage these new functions
introduces notebook `04-merge_models_into_list.ipynb` to use the selected thresholds to produce a list
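To give an idea of what threshold selection can look like, the sketch below picks the lowest risk threshold that reaches a target precision on labelled data, then builds the list of companies at or above it. The precision criterion, the function names and the data layout are all assumptions chosen for illustration; the criteria actually implemented in `evaluate.py` may differ.

```python
def precision_at(y_true, scores, threshold):
    # Precision among the companies whose predicted risk is >= threshold.
    selected = [label for label, score in zip(y_true, scores) if score >= threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)


def lowest_threshold_with_precision(y_true, scores, min_precision):
    # Candidate thresholds are scanned in increasing order, so the first
    # match is the lowest one (i.e. the one yielding the longest list).
    for threshold in sorted(set(scores)):
        precision = precision_at(y_true, scores, threshold)
        if precision is not None and precision >= min_precision:
            return threshold
    return None  # no threshold reaches the requested precision


def select_companies(risk_by_siret, threshold):
    # Companies at or above the threshold, highest predicted risk first.
    selected = [s for s, risk in risk_by_siret.items() if risk >= threshold]
    return sorted(selected, key=lambda s: risk_by_siret[s], reverse=True)


# Toy labelled data: 1 means the company failed, 0 means it did not.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
threshold = lowest_threshold_with_precision(y_true, scores, min_precision=0.75)
```

Because precision is not monotone in the threshold, scanning candidates explicitly is simpler and safer here than a binary search.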