pegasystems / pega-datascientist-tools

Pega Data Scientist Tools
https://github.com/pegasystems/pega-datascientist-tools/wiki
Apache License 2.0

Datamart reports and utilities to work with new AGB models #48

Open operdeck opened 2 years ago

operdeck commented 2 years ago

The new AGB variant of Adaptive differs from AB in what it stores in the predictor binning tables. One change is that there is no predictor binning like NB has; instead, the bins just list the possible values or ranges.

When running the standard Health check notebook on AGB data today, the predictor performance plot shows odd values on the x-axis, and the aggregate view of the predictors does not work.

The offline model reports seem to show the classifier correctly, but we still need to verify the details. The predictors list looks off, with strange values. The individual predictor binning is completely wrong or empty (as expected, but we need to provide an alternative view here for AGB).

StijnKas commented 2 years ago

#55 adds support for analyzing the JSON export of the models in Python. This issue is still relevant, though, as we still need to update the Python tools to support the AGB datamart (model and predictor data).

StijnKas commented 2 years ago

#57 introduced further improvements in this area: we can now extract AGB models directly from the modeldata column in the datamart and use this to analyse multiple trees at once. This is by no means finished, so any feedback is very welcome.
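
To make "analyse multiple trees at once" a bit more concrete, here is a minimal sketch of the kind of aggregate view that could replace per-predictor binning for AGB. It is not the code from the pull request and does not use the library's API. The JSON layout (`trees`, `split`, `left`, `right`, `score`) and the helper names are assumptions made up for illustration; the real export schema may look quite different.

```python
import json
from collections import Counter

# Hypothetical export: each node is either a leaf with a "score" or an
# internal node with "split", "left" and "right". These field names are
# assumptions for illustration, not the actual Pega export schema.
EXAMPLE_EXPORT = json.dumps({
    "trees": [
        {"split": "Age < 42",
         "left": {"score": 0.1},
         "right": {"split": "Channel in {Web, Mobile}",
                   "left": {"score": -0.2},
                   "right": {"score": 0.3}}},
        {"split": "Age < 30",
         "left": {"score": 0.05},
         "right": {"score": -0.05}},
    ]
})


def iter_splits(node):
    """Yield the split expression of every internal node in one tree."""
    if "split" in node:
        yield node["split"]
        yield from iter_splits(node["left"])
        yield from iter_splits(node["right"])


def depth(node):
    """Return the number of splits on the longest path from node to a leaf."""
    if "split" not in node:
        return 0
    return 1 + max(depth(node["left"]), depth(node["right"]))


def predictor_split_counts(export_json):
    """Count how often each predictor is used in a split, over all trees."""
    counts = Counter()
    for tree in json.loads(export_json)["trees"]:
        for split in iter_splits(tree):
            # Assumes the predictor name is the first token of the split.
            counts[split.split()[0]] += 1
    return counts


counts = predictor_split_counts(EXAMPLE_EXPORT)
depths = [depth(tree) for tree in json.loads(EXAMPLE_EXPORT)["trees"]]
```

Counting splits per predictor across all trees gives a rough usage ranking that could stand in for the predictor list until a proper AGB view exists.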

StijnKas commented 1 year ago

Let's re-evaluate this with the new Python healthcheck. @operdeck @yusufuyanik1