probabl-ai / skore

Skore lets you "Own Your Data Science." It provides a user-friendly interface to track and visualize your modeling results, and perform evaluation of your machine learning models with scikit-learn.
https://probabl-ai.github.io/skore/
MIT License

BUG: putting a pandas dataframe with nan #805

Open sylvaincom opened 3 days ago

sylvaincom commented 3 days ago

Note: I put this as P1, but we might consider P0

Describe the bug

Thanks @jeromedockes for the heads up

Currently, skore raises an error when put is called on a pandas DataFrame that contains NaN values.

Steps/Code to Reproduce

import skore
import pandas as pd
from skrub.datasets import fetch_employee_salaries

my_project = skore.create("my_project", overwrite=True)

dataset = fetch_employee_salaries()
employees_df, salaries = dataset.X, dataset.y
print(employees_df.isna().mean())  # fraction of missing values per column; some columns have missing values

my_project.put("my_df", employees_df)

The call to put raises the following error:

ValueError: Out of range float values are not JSON compliant: nan
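
The message has the same wording as the one Python's standard json module produces when asked to serialize NaN in strict mode. The sketch below reproduces it with the standard library only. It assumes (this is not confirmed in the report) that the DataFrame ends up in a json.dumps call with allow_nan=False somewhere in the put path. The record dictionary is a hypothetical stand-in for one row of the DataFrame.

```python
import json
import math

# Hypothetical stand-in for one DataFrame row with a missing value.
record = {"name": "example", "salary": math.nan}

# By default, json.dumps emits the non-standard token NaN.
print(json.dumps(record))

# In strict mode, NaN is rejected with a ValueError.
try:
    json.dumps(record, allow_nan=False)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

If this is indeed the cause, any object containing NaN would trigger the error, not only pandas DataFrames.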

Expected Behavior

No error: the DataFrame is stored, missing values included.

Actual Behavior

The ValueError shown above is raised and the DataFrame is not stored.

Environment

System:
    python: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 10:07:17) [Clang 14.0.6 ]

Python dependencies:
        skore: 0.4.0
          pip: 24.3.1
   setuptools: None
    diskcache: 5.6.3
      fastapi: 0.115.2
 plotly<6,>=5: None
         rich: 13.9.2
        skops: 0.10.0
      uvicorn: 0.32.0

augustebaum commented 1 day ago

Thanks for this; I do think this is high priority (and likely not too hard to fix)
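
One possible direction for a fix, sketched with the standard library only: replace NaN with None before serializing, so that strict JSON encoding produces null. This is an illustration under assumptions, not skore's actual code. The helper nan_to_none and the sample rows are hypothetical names introduced here.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None in nested dicts, lists and tuples."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


# Hypothetical records, shaped like the rows of a DataFrame with a missing value.
rows = [
    {"name": "a", "salary": 1000.0},
    {"name": "b", "salary": math.nan},
]

# Strict encoding now succeeds because NaN has become None (JSON null).
payload = json.dumps(nan_to_none(rows), allow_nan=False)
print(payload)
```

For a DataFrame, the records could be obtained with df.to_dict(orient="records") before applying the helper. Note that this sketch only handles NaN: infinite values would still be rejected in strict mode, and null no longer distinguishes NaN from None when the data is read back.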