huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
132.16k stars 26.33k forks

Add predict_proba method for Autoformer/Informer #29556

Open benHeid opened 6 months ago

benHeid commented 6 months ago

Feature request

Currently, Autoformer and Informer only provide samples drawn from a probability distribution over future values. It would be nice to have the option of returning the forecasted distribution itself to the user.

Motivation

I am currently trying to develop an adapter in sktime that enables the integration of the time series models from the transformers library (https://github.com/sktime/sktime/issues/5790). Since sktime has a predict_proba method, I would like to translate the transformers probability distribution into sktime probability distributions.

Your contribution

I think there are at least three solutions:

  1. Add a new method, e.g., predict_distribution, that returns the distribution object or its parameters.
  2. Add a parameter to the generate function that controls whether the distribution or its parameters are returned.
  3. Always return the distribution or its parameters together with the sampled sequence in generate.

If you prefer any of these three solutions, I am happy to implement it :)

amyeroberts commented 6 months ago

WDYT @kashif ?

kashif commented 6 months ago

Right, so I think the most useful would be to return, in the generated output dataclass, the parameters of the distribution for each time step. This way one can reconstruct the distribution. generate will of course still need samples internally for the autoregressive generation, to feed the next time point. So I think 3 would be the easiest.

Would returning the sequence of parameters be fine for you @benHeid?
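
The point about generate needing samples internally can be illustrated with a toy autoregressive loop. This is only a sketch, not the transformers implementation: `toy_gaussian_head` and `generate_with_params` are made-up names, the head is a fixed formula rather than a trained model, and only the Python standard library is used. At each step the loop records the parameters (what the user would get back) and draws a sample (what feeds the next step).

```python
import random
from typing import List, Tuple


def toy_gaussian_head(last_value: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a distribution head: returns (loc, scale)."""
    return 0.9 * last_value, 0.5


def generate_with_params(
    start: float, horizon: int, seed: int = 0
) -> Tuple[List[float], List[Tuple[float, float]]]:
    """Sample one future path and record the parameters used at each step."""
    rng = random.Random(seed)
    samples: List[float] = []
    params: List[Tuple[float, float]] = []
    last = start
    for _ in range(horizon):
        loc, scale = toy_gaussian_head(last)
        params.append((loc, scale))   # returned to the user
        last = rng.gauss(loc, scale)  # the sample feeds the next time point
        samples.append(last)
    return samples, params


samples, params = generate_with_params(start=1.0, horizon=3)
```

In this sketch the parameters after the first step depend on the sampled path, so with several parallel samples there would be one parameter sequence per path.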

kashif commented 6 months ago

So we can extend SampleTSPredictionOutput to also contain optional parameters. The one caveat is that different parametric distribution heads have different numbers of parameters: Gaussian is 2, Student-t is 3, and negative binomial is also 3 per time step.
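
One way to picture such an extension is a container with an optional, variable-length tuple of parameter sequences. The sketch below is an assumption for illustration only: `SamplePredictionWithParams` is a hypothetical name, it is not the transformers class, and plain lists of floats stand in for tensors.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One list of per-time-step values for each distribution parameter.
ParamSeq = Tuple[List[float], ...]


@dataclass
class SamplePredictionWithParams:
    """Hypothetical output container; not the transformers class."""

    sequences: List[float]
    params: Optional[ParamSeq] = None  # None keeps the existing behaviour

    def num_params(self) -> int:
        """Number of parameters per time step (0 if none were returned)."""
        return 0 if self.params is None else len(self.params)


gaussian = SamplePredictionWithParams(
    sequences=[0.1, 0.2],
    params=([0.0, 0.1], [1.0, 1.0]),  # (loc, scale)
)
student_t = SamplePredictionWithParams(
    sequences=[0.1, 0.2],
    params=([4.0, 4.0], [0.0, 0.1], [1.0, 1.0]),  # (df, loc, scale)
)
legacy = SamplePredictionWithParams(sequences=[0.1, 0.2])
```

Keeping the field optional means existing callers that only read the sampled sequences would not need to change, and the tuple length can vary with the distribution head.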

benHeid commented 6 months ago

Returning a sequence of params would be fine for me. This would then enable me to just pass these parameters to the sktime distribution objects.
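
As a rough illustration of that hand-off, the sketch below rebuilds one distribution object per forecast step from returned parameters. It assumes a Gaussian head and made-up values, and it uses `statistics.NormalDist` from the Python standard library as a stand-in; the actual adapter would construct sktime distribution objects instead.

```python
from statistics import NormalDist

# Made-up per-step parameters, as a Gaussian head might return them.
locs = [10.0, 11.0, 12.0]
scales = [1.0, 1.5, 2.0]

# One distribution object per forecast step.
dists = [NormalDist(mu=m, sigma=s) for m, s in zip(locs, scales)]

# Central 90% prediction interval and median for each step.
intervals = [(d.inv_cdf(0.05), d.inv_cdf(0.95)) for d in dists]
medians = [d.inv_cdf(0.5) for d in dists]
```

With the distributions reconstructed this way, quantiles and intervals can be computed exactly from the parameters rather than estimated from samples.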

I will try to create a corresponding PR this weekend :)