Closed du-phan closed 3 years ago
I have not looked at the code in much detail yet, but here are some preliminary notes:
For the scaling issue with numerical features, I guess we could try to reuse doctor's code, as I did for error analysis: https://github.com/dataiku/dku-contrib-private/blob/hackaiku2020/mldebugging/error-analysis/error-analysis/python-lib/dku_error_tree_parsing/depreprocessor.py
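To make the scaling point concrete, here is a minimal sketch of mapping a split threshold back to the original feature scale. It assumes the numerical feature was rescaled affinely, as scaled = (x - shift) / scale (for example shift = mean and scale = std for standard rescaling). The function name and parameters are hypothetical and are not taken from the doctor's code or from depreprocessor.py.

```python
def unscale_threshold(threshold, shift, scale):
    """Map a split threshold from the rescaled space back to the original space.

    Assumption: the feature was rescaled as scaled = (x - shift) / scale with
    scale > 0, so "scaled <= threshold" is equivalent to
    "x <= threshold * scale + shift".
    """
    if scale <= 0:
        raise ValueError("scale must be strictly positive")
    return threshold * scale + shift


# Hypothetical feature with mean 10.0 and standard deviation 4.0
original_threshold = unscale_threshold(0.5, shift=10.0, scale=4.0)
```

With min-max rescaling the same formula would apply with shift = min and scale = max - min.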
Also, we may want the model conversion to be done directly from the webapp, as a third mode on the create/load first page. In that case, we may want to convert directly to an IDTB tree (without first creating a JSON file)?
Also I already told you this but for the record, you do not need to convert node ids as they do not have much meaning (apart from linking parent to children) 😄
Should we reopen a PR for this basing ourselves on what is done in MEA? 🙂
This PR adds a recipe that parses a deployed doctor decision tree into a JSON file that can be read by the plugin's webapp (and thus be modified afterward).
Notes:
clf.tree_.value returns the weighted number of samples of each class in each node, so the values are floats rather than integers. We round them to the nearest integer.
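As an illustration of the rounding step, here is a small sketch that works on a plain nested list of per-node, per-class weighted counts. The helper name round_class_counts is hypothetical and is not the recipe's actual code.

```python
def round_class_counts(node_values):
    """Round weighted per-class sample counts to the nearest integer.

    node_values: one list per tree node, holding one float per class.
    Note: Python's round() sends exact .5 ties to the nearest even integer.
    """
    return [[int(round(count)) for count in node] for node in node_values]


# Made-up weighted counts for two nodes and two classes
weighted_counts = [[12.0, 7.9999998], [4.2, 0.0]]
rounded_counts = round_class_counts(weighted_counts)
```

For a fitted single-output classifier, the input could be obtained with clf.tree_.value[:, 0, :].tolist(). One caveat to check against your installed version: newer scikit-learn releases (1.4 and later, as far as I know) store class fractions in tree_.value instead of weighted counts, in which case the values would first need to be multiplied by tree_.weighted_n_node_samples.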