skrub-data / skrub

Prepping tables for machine learning
https://skrub-data.org/
BSD 3-Clause "New" or "Revised" License

TimeDelta64DType treated as category by the TableReport #1132

Closed Vincent-Maladiere closed 1 day ago

Vincent-Maladiere commented 3 weeks ago

Describe the bug

This is somewhat niche, but TableReport treats TimeDelta64DType columns as discrete categories, even though a timedelta is a continuous quantity, just like DateTime64DType.

Steps/Code to Reproduce

import pandas as pd
from skrub import TableReport

TableReport(
    pd.to_timedelta([20, 40], unit="D").to_frame()
)

Expected Results

A histogram distribution

Actual Results

A value counts distribution

Versions

dev

jeromedockes commented 2 weeks ago

I guess we need to special-case timedeltas: inspect the range of values to decide on a time resolution, then cast to float to plot a histogram.
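
A minimal sketch of that idea, not the fix that was merged into skrub. The helper name `timedelta_to_float`, the `UNITS` table, and the rule "pick the coarsest unit that the largest absolute duration reaches" are all assumptions made for illustration:

```python
import numpy as np
import pandas as pd

# Candidate units, coarsest first (an arbitrary choice for this sketch).
UNITS = [
    ("days", pd.Timedelta(days=1)),
    ("hours", pd.Timedelta(hours=1)),
    ("minutes", pd.Timedelta(minutes=1)),
    ("seconds", pd.Timedelta(seconds=1)),
    ("milliseconds", pd.Timedelta(milliseconds=1)),
]


def timedelta_to_float(column):
    """Convert a timedelta64 Series to floats in a readable unit.

    Hypothetical helper, not part of skrub. Returns (values, unit_name).
    """
    values = column.dropna()
    if values.empty:
        return pd.Series(dtype="float64"), "seconds"
    # Inspect the magnitude of the data to decide on a resolution.
    largest = values.abs().max()
    for name, unit in UNITS:
        if largest >= unit:
            break
    # If nothing matched, `unit` is the finest candidate (milliseconds).
    # Dividing a timedelta Series by a Timedelta yields float64 values.
    return values / unit, name


durations = pd.Series(pd.to_timedelta([20, 40], unit="D"))
as_float, unit = timedelta_to_float(durations)
# The float values can now be binned like any other numeric column.
counts, edges = np.histogram(as_float, bins=10)
```

The unit name would be returned alongside the values so that the histogram axis can be labelled with it.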

Vincent-Maladiere commented 2 weeks ago

Right, I thought pyplot would do this out of the box, but it looks more involved indeed.