Closed by adampingel 2 months ago
I would suggest zero-shot inference, combined with plotting the results and evaluating the performance.
Picking this up. I'll start with a recipe version of this Getting Started notebook.
@fayvor Can we start with: https://github.com/ibm-granite/granite-tsfm/blob/cookbook-dev/notebooks/recipes/energy_demand_forecasting/demand_forecast_zeroshot_recipe.ipynb
This uses the preprocessor and forecasting pipeline, which I believe are a bit easier to consume than the outputs of trainer.predict().
@fayvor Barebones, minimal notebook is here: https://github.com/ibm-granite/granite-tsfm/blob/cookbook-dev/notebooks/recipes/energy_demand_forecasting/demand_forecast_zeroshot_recipe_minimal.ipynb
That looks good, @wgifford. Integrating into my version now.
Hi @wgifford, the forecasting pipeline section is failing with this error. Any ideas?
pipeline = TimeSeriesForecastingPipeline(
zeroshot_model, timestamp_column=timestamp_column, target_columns=target_columns, explode_forecasts=True, freq="h"
)
zeroshot_forecast = pipeline(data)
zeroshot_forecast.head()
---
{
"name": "TypeError",
"message": "You have to supply one of 'by' and 'level'",
"stack": "---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/var/folders/nc/jrql4k0n2j73h7xktzxdr4pr0000gn/T/ipykernel_6860/1549610639.py in ?()
1 pipeline = TimeSeriesForecastingPipeline(
2 zeroshot_model, timestamp_column=timestamp_column, target_columns=target_columns, explode_forecasts=True, freq=\"h\"
3 )
----> 4 zeroshot_forecast = pipeline(data)
5 zeroshot_forecast.head()
~/Dev/granite-tsfm/tsfm_public/toolkit/time_series_forecasting_pipeline.py in ?(self, time_series, **kwargs)
320 all the values over the prediction horizon.
321
322 \"\"\"
323
--> 324 return super().__call__(time_series, **kwargs)
~/Dev/granite-tsfm/.venv/lib/python3.12/site-packages/transformers/pipelines/base.py in ?(self, inputs, num_workers, batch_size, *args, **kwargs)
1253 )
1254 )
1255 )
1256 else:
-> 1257 return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
~/Dev/granite-tsfm/tsfm_public/toolkit/time_series_forecasting_pipeline.py in ?(self, inputs, preprocess_params, forward_params, postprocess_params)
50 Returns:
51 _type_: _description_
52 \"\"\"
53 # our preprocess returns a dataset
---> 54 dataset = self.preprocess(inputs, **preprocess_params)
55
56 batch_size = forward_params[\"batch_size\"]
57 num_workers = forward_params[\"num_workers\"]
~/Dev/granite-tsfm/tsfm_public/toolkit/time_series_forecasting_pipeline.py in ?(self, time_series, **kwargs)
372
373 time_series = pd.concat((time_series, future_time_series), axis=0)
374 else:
375 # no additional exogenous data provided, extend with empty periods
--> 376 time_series = extend_time_series(
377 time_series=time_series,
378 timestamp_column=timestamp_column,
379 grouping_columns=id_columns,
~/Dev/granite-tsfm/tsfm_public/toolkit/time_series_preprocessor.py in ?(time_series, timestamp_column, grouping_columns, freq, periods)
1004
1005 if grouping_columns == []:
1006 new_time_series = augment_one_series(time_series)
1007 else:
-> 1008 new_time_series = time_series.groupby(grouping_columns).apply(augment_one_series, include_groups=False)
1009 idx_names = list(new_time_series.index.names)
1010 idx_names[-1] = \"__delete\"
1011 new_time_series = new_time_series.reset_index(names=idx_names)
~/Dev/granite-tsfm/.venv/lib/python3.12/site-packages/pandas/core/frame.py in ?(self, by, axis, level, as_index, sort, group_keys, observed, dropna)
9177
9178 from pandas.core.groupby.generic import DataFrameGroupBy
9179
9180 if level is None and by is None:
-> 9181 raise TypeError(\"You have to supply one of 'by' and 'level'\")
9182
9183 return DataFrameGroupBy(
9184 obj=self,
TypeError: You have to supply one of 'by' and 'level'"
}
Can you try explicitly passing id_columns=[] in TimeSeriesForecastingPipeline()?
That seems to work, thx.
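For reference, the call with the suggested workaround applied might look like the sketch below. The import path is inferred from the traceback above, the helper name `build_zeroshot_pipeline` is hypothetical, and `zeroshot_model`, `timestamp_column`, `target_columns`, and `data` are assumed to be defined as in the notebook.

```python
def build_zeroshot_pipeline(zeroshot_model, timestamp_column, target_columns):
    """Hypothetical helper: build the forecasting pipeline with id_columns passed explicitly."""
    # Import path inferred from the traceback
    # (tsfm_public/toolkit/time_series_forecasting_pipeline.py).
    # Imported lazily so this sketch can be loaded without granite-tsfm installed.
    from tsfm_public.toolkit.time_series_forecasting_pipeline import (
        TimeSeriesForecastingPipeline,
    )

    return TimeSeriesForecastingPipeline(
        zeroshot_model,
        timestamp_column=timestamp_column,
        target_columns=target_columns,
        id_columns=[],  # workaround from this thread: one series, so no id columns
        explode_forecasts=True,
        freq="h",
    )


# Usage in the notebook (assumes zeroshot_model and data already exist):
# pipeline = build_zeroshot_pipeline(zeroshot_model, timestamp_column, target_columns)
# zeroshot_forecast = pipeline(data)
# zeroshot_forecast.head()
```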
In prior versions, the default values set in the pipeline were problematic. I will confirm that this is fixed in the latest main.
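A likely reading of the traceback, not confirmed in this thread: `extend_time_series` only skips the `groupby` when `grouping_columns == []`, so a default of `None` falls through to `groupby(None)`, which pandas rejects. The sketch below reproduces that behavior with plain pandas. The function `extend` is a simplified, hypothetical stand-in for the branch shown in the traceback, not the library's code.

```python
import pandas as pd

# Small single-series frame, similar in shape to the notebook's input.
df = pd.DataFrame(
    {
        "time": pd.date_range("2024-01-01", periods=4, freq="h"),
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)


def extend(time_series, grouping_columns):
    """Hypothetical stand-in mirroring the branch seen in the traceback."""
    if grouping_columns == []:
        # Single series: no grouping needed.
        return time_series.copy()
    # None != [], so None ends up here and pandas raises a TypeError.
    return time_series.groupby(grouping_columns).size()


def try_extend(grouping_columns):
    """Return 'ok' on success, or the TypeError message on failure."""
    try:
        extend(df, grouping_columns)
        return "ok"
    except TypeError as exc:
        return str(exc)


result_none = try_extend(None)  # mimics leaving id_columns unset
result_empty = try_extend([])   # mimics passing id_columns=[]
print(result_none)
print(result_empty)
```

Under this reading, passing `id_columns=[]` works because it takes the first branch and never reaches `groupby`.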
@wgifford I think there are two (three?) remaining steps. If the PR looks good otherwise, we could merge and then do these:
Can we look at if the plotting can be integrated with the existing plotting function?
I will confirm that for this notebook we can pin to the current version of main.
We don't currently have PyPI set up for granite-tsfm -- is there some general guidance? (cc @adampingel)
Can we look at if the plotting can be integrated with the existing plotting function?
Yes, I'll work on a PR against plot_predictions on the cookbook-dev branch: https://github.com/ibm-granite/granite-tsfm/blob/cookbook-dev/tsfm_public/toolkit/visualization.py#L207
Thanks! Please PR against main for this one. ~I believe there are 1 or 2 fixes that didn't make it to cookbook-dev yet.~
Actually, I am working on some minor fixes -- one pertains to plot_predictions (num_plots) here: https://github.com/ibm-granite/granite-tsfm/pull/124; the rest are around datetime handling with timezones.
Please PR against main for this one.
Ok. Does that mean I can now point to main from the recipe as well?
Yes, please try it.
@wgifford can you give me access to push a branch to granite-tsfm? If not, I'll do a fork and PR.
Invite sent
@adampingel Can we close?
Zero-shot inference based on @wgifford's TTM Energy Demand Forecasting notebook.
NOTE: this recipe should go into the Granite Timeseries Cookbook.