[FEAT] [evaluation] Add rank_by to evaluation

baggiponte commented 10 months ago

We mention and use the coefficient of variation more than once, such as here. It would be interesting to have an evaluation.rank_cv function to see which entities in a panel display the greatest variation.

The way I see it, we should have a public method (perhaps even in feature_extraction?) to compute the CV across all entities. This would be used by rank_cv and possibly in plot_entities (see #83) to display additional information about all entities in the panel.

topher-lo commented 10 months ago

Agreed. This is so commonly used in industry, especially supply chain. We should make it a standalone function.

baggiponte commented 8 months ago

Update: since we have an amazing set of feature extractors, we can add a rank_by(y, extractor, order) function that does this:

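from typing import Literal

import polars as pl

def rank_by(
    y: pl.LazyFrame | pl.DataFrame,
    extractor: str,
    order: Literal["worst", "best"],
    n_series: int,
):
    if isinstance(y, pl.DataFrame):
        y = y.lazy()

    # Assumption: the panel columns are ordered (entity, time, target)
    entity, _, target = y.columns[:3]

    # Look up the extractor by name in the `ts` expression namespace
    # (assumes that namespace has been registered on pl.Expr)
    function = getattr(pl.col(target).ts, extractor)

    results = y.group_by(entity).agg(function().alias(extractor))

    if order == "best":
        return results.top_k(k=n_series, by=extractor)
    return results.bottom_k(k=n_series, by=extractor)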

This can be used with plotting.plot_panel to generate great EDA plots.
