I was wondering whether we want a plot_entities histogram/bar chart that displays summary information about the entities, e.g. the number of observations for each entity.
It could be used to decide which entities to drop for having too few observations or, with future features, to show the number of missing values/zeros in each series.
Here is a draft implementation:
import polars as pl
import plotly.express as px

url = "https://github.com/neocortexdb/functime/raw/main/data/commodities.parquet"
y = pl.scan_parquet(url).with_columns(pl.col("time").cast(pl.Date))
entity_col, time_col, target_col = y.columns


def plot_entities(y, **kwargs):
    # `.lazy()` is a no-op on a LazyFrame, so DataFrame and LazyFrame inputs both work
    counts = (
        y.lazy()
        .group_by(entity_col)
        # `pl.len()` replaces the deprecated `pl.count()`; the alias keeps the column name "count"
        .agg(pl.len().alias("count"))
        .collect()
    )
    # sensible-ish default height; a `height` passed by the caller takes precedence
    layout = {"height": len(counts) * 15, **kwargs}
    return px.bar(
        data_frame=counts,
        x="count",
        y=entity_col,
        orientation="h",
    ).update_layout(**layout)


plot_entities(y)