pegasystems / pega-datascientist-tools

Pega Data Scientist Tools
https://github.com/pegasystems/pega-datascientist-tools/wiki
Apache License 2.0

plotPredictorPerformance fails for AGB #215

Open yusufuyanik1 opened 3 months ago

yusufuyanik1 commented 3 months ago

Issue description

When I try to generate a HealthCheck from a datamart that contains only AGB configurations, the plotPredictorPerformance function fails.

Reproducible example

from pdstools import ADMDatamart
import polars as pl

dm = ADMDatamart(
    path=".",
    model_filename="",
    predictor_filename="",
    extract_keys=True,
    include_cols="pyFeatureImportance",
    query=pl.col("Configuration") == "AGB_Configuration",
).fillMissing()

dm.generateReport()

"IndexError: list index out of range"

Expected behavior

The HealthCheck should be generated successfully.

Installed versions

---Version info---
pdstools: 3.4.3
Platform: macOS-14.4.1-arm64-arm-64bit
Python: 3.12.3 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 11:44:52) [Clang 14.0.6 ]

---Dependencies---
plotly: 5.22.0
requests: 2.31.0
pydot: 2.0.0
polars: 0.20.25
pyarrow: 16.0.0
tqdm: 4.66.4
pyyaml:
aioboto3: 12.4.0

---Streamlit app dependencies---
streamlit: 1.34.0
quarto: 0.1.0
papermill: 2.6.0
itables: 2.0.1
pandas: 2.2.2
jinja2: 3.1.4
xlsxwriter: 3.2.0

StijnKas commented 2 months ago

Have you found a root cause for this, @yusufuyanik1? Which function causes it?

yusufuyanik1 commented 2 months ago

I couldn't reproduce the error with a datamart containing only AGB configurations. However, plotPredictorPerformance returns an empty chart, because "pyPerformance" is not used in AGB configurations. I will change the function to use pyFeatureImportance for AGB configurations before closing this issue.
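
For illustration, the fallback described above could look roughly like the sketch below. This is not the pdstools implementation: `choose_predictor_metric` is a hypothetical helper, the rows are made-up plain dictionaries rather than a real datamart, and it assumes that "not used" means the "pyPerformance" value is missing or null for AGB predictors.

```python
from typing import Any, Mapping, Sequence

# Column names as quoted in this thread; everything else here is hypothetical.
PERFORMANCE = "pyPerformance"
FEATURE_IMPORTANCE = "pyFeatureImportance"


def choose_predictor_metric(rows: Sequence[Mapping[str, Any]]) -> str:
    """Pick the column to plot for a set of predictor rows.

    Prefers the performance column when at least one row has a value for it,
    and falls back to feature importance otherwise (the AGB case, under the
    assumption stated above).
    """
    if any(row.get(PERFORMANCE) is not None for row in rows):
        return PERFORMANCE
    if any(row.get(FEATURE_IMPORTANCE) is not None for row in rows):
        return FEATURE_IMPORTANCE
    # Neither metric is populated: fail loudly instead of drawing an empty chart.
    raise ValueError("no predictor metric available to plot")


# Made-up example rows, not real datamart output.
agb_rows = [
    {"PredictorName": "Age", PERFORMANCE: None, FEATURE_IMPORTANCE: 0.42},
    {"PredictorName": "Income", PERFORMANCE: None, FEATURE_IMPORTANCE: 0.17},
]
other_rows = [
    {"PredictorName": "Age", PERFORMANCE: 0.61, FEATURE_IMPORTANCE: None},
]

agb_metric = choose_predictor_metric(agb_rows)
other_metric = choose_predictor_metric(other_rows)
```

Raising when neither column is populated is a design choice in this sketch: it makes the "empty chart" situation visible to the caller rather than silently producing a plot with no data.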