functime-org / functime

Time-series machine learning at scale. Built with Polars for embarrassingly parallel feature extraction and forecasts on panel data.
https://docs.functime.ai
Apache License 2.0

[FEAT] [PLOTTING] add `plotting.plot_entities` to display info about entities #83

Closed baggiponte closed 11 months ago

baggiponte commented 11 months ago

I was wondering if we want a `plot_entities` histogram/bar chart that displays summary information about the entities, e.g. the number of observations for each entity.

This could be used to decide which entities to drop for having too few observations or, with future features, to show the number of missing values/zeroes in each series.

Here is a draft implementation:

import polars as pl
import plotly.express as px

url = "https://github.com/neocortexdb/functime/raw/main/data/commodities.parquet"
y = pl.scan_parquet(url).with_columns(pl.col("time").cast(pl.Date))

entity_col, time_col, target_col = y.columns

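def plot_entities(y, **kwargs):
    # assumes the entity column comes first, as in the unpacking above
    entity_col = y.columns[0]

    # .lazy() exists on both DataFrame and LazyFrame, so either input works
    counts = (
        y.lazy()
        .group_by(entity_col)
        .agg(pl.len().alias("count"))  # pl.count() is deprecated in recent Polars
        .collect()
    )

    # sensible-ish default; a caller-supplied `height` takes precedence,
    # so `height` is never passed twice
    kwargs.setdefault("height", len(counts) * 15)

    return px.bar(
        data_frame=counts,
        x="count",
        y=entity_col,
        orientation="h",
    ).update_layout(**kwargs)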

plot_entities(y)

topher-lo commented 11 months ago

Closed by #98