pegasystems / pega-datascientist-tools

Pega Data Scientist Tools
https://github.com/pegasystems/pega-datascientist-tools/wiki
Apache License 2.0

plotOverTime errors when too many Configuration facets #204

Open operdeck opened 5 months ago

operdeck commented 5 months ago

pdstools version checks

Issue description

In the health check we have a few calls to plotOverTime with a Configuration facet. If there are many configurations, the faceting breaks. For now these calls are wrapped in try/except, but this should be solved properly.

First there is a warning, then a ValueError:

Warning: plotting this much data (969 rows) will probably be slow while not providing many insights. Consider filtering the data by either limiting the number of models, filtering on SnapshotTime or facetting.

ValueError Vertical spacing cannot be greater than (1 / (rows - 1)) = 0.066667. The resulting plot would have 16 rows (rows=16). Use the facet_row_spacing argument to adjust this spacing.

Possibly too many facets: 46.
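
For context on the numbers in the error: 46 facets in 16 rows is consistent with facet_col_wrap=3, since ceil(46 / 3) = 16. Plotly caps the vertical spacing at 1 / (rows - 1) = 1/15 ≈ 0.0667, and Plotly Express documents a default facet_row_spacing of 0.07 when facet_col_wrap is used, which is just over that cap. Below is a minimal sketch of computing a spacing that stays under the limit. The helper name safe_facet_row_spacing and the choice to use half of the cap are illustrative assumptions, and whether plotOverTime forwards facet_row_spacing to Plotly is not confirmed here.

```python
import math


def safe_facet_row_spacing(n_facets: int, facet_col_wrap: int, default: float = 0.07) -> float:
    """Return a row spacing that stays below Plotly's limit of 1 / (rows - 1).

    Hypothetical helper, not part of pdstools or Plotly.
    """
    if facet_col_wrap <= 0:
        raise ValueError("facet_col_wrap must be a positive integer")
    # Number of facet rows once the facets are wrapped into columns
    rows = math.ceil(n_facets / facet_col_wrap)
    if rows <= 1:
        # A single row has no vertical gaps, so the default is harmless
        return default
    cap = 1 / (rows - 1)
    # Use at most half of the cap so the subplots keep some height (assumed heuristic)
    return min(default, 0.5 * cap)


spacing = safe_facet_row_spacing(n_facets=46, facet_col_wrap=3)
print(f"facet_row_spacing={spacing:.4f}")
```

With many rows the total figure height would also need to grow for the individual plots to remain readable.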

Reproducible example

# Excerpt from the health check: datamart, facet, facet_col_wrap, height and
# unique_count are assumed to be defined earlier in that report.
try:
    # TODO: the faceting errors out when there are many configurations
    fig = datamart.plotOverTime(
        "weighted_performance",
        by="Channel/Direction",
        facets=facet,
        facet_col_wrap=facet_col_wrap,
    )
    fig = (
        fig.update_layout(autosize=True, height=height, title="Trend of Model Performance")
        # Strip the "<facet>=" prefix from each facet title
        .for_each_annotation(lambda a: a.update(text=a.text.replace(f"{facet}=", "")))
        .update_yaxes(showticklabels=True, title="")
        .update_xaxes(title="")
    )

    fig.show()
except ValueError as e:
    # Temporary workaround: report the failure instead of aborting the report
    print(f"Error {e}\nPossibly too many facets: {unique_count}.")

Expected behavior

No errors

Installed versions

Detailed version info for pdstools:

---Version info---
pdstools: 3.3.0
Platform: macOS-10.16-x86_64-i386-64bit
Python: 3.11.5 (main, Sep 11 2023, 08:19:27) [Clang 14.0.6 ]

---Dependencies---
plotly: 5.17.0
requests: 2.31.0
pydot: 1.4.2
polars: 0.20.2
pyarrow: 13.0.0
tqdm: 4.66.1
pyyaml:
aioboto3: 11.3.0

---Streamlit app dependencies---
streamlit: 1.31.0
quarto:
papermill: 2.4.0
itables: 1.6.1
pandas: 2.2.1
jinja2: 3.1.3
xlsxwriter: 3.1.9

StijnKas commented 1 month ago

How do you think we should handle this, @operdeck? Should we move the same ValueError handling from your example into the core code itself?

StijnKas commented 1 week ago

Suggestion: have an on_error argument with options 'skip', 'warn', 'raise'.
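
A minimal sketch of what such an on_error option could look like, written as a standalone wrapper rather than as part of pdstools. The name call_with_on_error and the decision to catch only ValueError are assumptions for illustration; the real argument would presumably live on plotOverTime itself.

```python
import warnings
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

_ALLOWED = ("skip", "warn", "raise")


def call_with_on_error(plot_fn: Callable[[], T], on_error: str = "raise") -> Optional[T]:
    """Call plot_fn and handle a ValueError according to on_error.

    Hypothetical helper, not an existing pdstools function.
    'raise' re-raises the error, 'warn' emits a warning and returns None,
    'skip' silently returns None.
    """
    if on_error not in _ALLOWED:
        raise ValueError(f"on_error must be one of {_ALLOWED}, got {on_error!r}")
    try:
        return plot_fn()
    except ValueError as e:
        if on_error == "raise":
            raise
        if on_error == "warn":
            warnings.warn(f"Plot skipped: {e}")
        return None


def failing_plot():
    # Stand-in for a plotting call that fails because of too many facets
    raise ValueError("Vertical spacing cannot be greater than (1 / (rows - 1))")


result = call_with_on_error(failing_plot, on_error="skip")
print(result)
```

In the health check, the call would be wrapped in a lambda or functools.partial, and a None result would signal that the figure should be left out of the report.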