amazon-science / chronos-forecasting

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
https://arxiv.org/abs/2403.07815
Apache License 2.0

What is the recommended `torch_dtype`? #103

Closed by CoCoNuTeK 1 week ago

CoCoNuTeK commented 2 weeks ago

Hello there, what would you recommend as the best `torch_dtype` parameter, given the tradeoffs? Or was the model trained only with bfloat16? Thanks for the answer.

abdulfatir commented 2 weeks ago

@CoCoNuTeK The models were trained with tf32 (a 19-bit CUDA floating point format that's a replacement for fp32). We recommend bf16 for inference, especially if your machine supports it. It should require less memory and be much faster than fp32. Please note that we are talking about the model's parameters (`torch_dtype` in the pipeline) here. DO NOT cast your time series into bf16 as that may result in loss of information.
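
To illustrate the last point, here is a minimal sketch of what a cast to bf16 does to price-like values. It uses only the Python standard library and emulates the float32 to bfloat16 conversion with round-to-nearest-even, which is an assumption about how a given framework rounds. The helper name `round_to_bf16` and the sample prices are made up for the example and are not part of Chronos.

```python
import struct


def round_to_bf16(x: float) -> float:
    """Emulate float32 -> bfloat16 -> float32 with round-to-nearest-even.

    Hypothetical helper for illustration; handles finite inputs only
    (NaN and infinity are not treated specially in this sketch).
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of a float32:
    # 1 sign bit, 8 exponent bits, 7 fraction bits.
    # Adding this bias before truncating implements round-to-nearest-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", rounded & 0xFFFFFFFF))[0]


# Made-up price-like values between 64 and 128, where bf16 can only
# represent multiples of 0.5.
prices = [101.25, 101.5, 101.75, 102.25]
as_bf16 = [round_to_bf16(p) for p in prices]

for original, cast in zip(prices, as_bf16):
    print(f"{original:>7} -> {cast}")
```

With only 7 fraction bits, distinct inputs such as 101.75 and 102.25 collapse to the same value, which is the kind of information loss the warning refers to. Model weights tolerate this far better than the raw series does, so the series itself is best left in its original precision.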

CoCoNuTeK commented 2 weeks ago

Ah, okay, so I just keep my data points in the format they are in; if it's stock data, I feed it in as is. Thanks for the info. And for the fine-tuning part, should I use bf16 as well?

abdulfatir commented 2 weeks ago

For fine-tuning, the recommended settings are in the training script which uses tf32 for training. Of course, you're free to experiment with other dtypes and hyperparameters.

P.S.: I don't want to constrain your creativity but please be mindful when applying a univariate pretrained model such as Chronos to stock data, which is often heavily influenced by external factors. :)

CoCoNuTeK commented 2 weeks ago

For long-term predictions, sure, but some day-trading setups could work. If I try, say, 1 tick = 5 minutes, it could hopefully find something interesting. I will let you know if you want.