Nixtla / neuralforecast

Scalable and user-friendly neural 🧠 forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

Errors I get when trying to use cubic interpolation in NHITS #1025

Closed stephanielees closed 2 weeks ago

stephanielees commented 1 month ago

What happened + What you expected to happen

I tried fitting NHITS with cubic interpolation in my Kaggle notebook, but I got this error message:

```
OutOfMemoryError: CUDA out of memory. Tried to allocate 18.18 GiB. GPU 1 has a total
capacty of 14.75 GiB of which 14.22 GiB is free. Process 15638 has 538.00 MiB memory
in use. Of the allocated memory 274.06 MiB is allocated by PyTorch, and 45.94 MiB is
reserved by PyTorch but unallocated. If reserved but unallocated memory is large try
setting max_split_size_mb to avoid fragmentation. See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF
```

At first, I tried setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF, but that didn't help. I then guessed that I needed distributed training, since the model was trying to allocate 18 GiB, but I still got the same error after specifying the strategy (ddp_notebook). I dropped the PYTORCH_CUDA_ALLOC_CONF setting once I switched to distributed training. With interpolation_mode set to 'linear' or 'nearest', everything works fine and I don't need to specify any strategy or device count.
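For reference, this is roughly how I set the allocator option (a minimal sketch; the 128 MiB threshold is just an illustrative value, and the variable has to be set before PyTorch initializes CUDA):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before PyTorch makes its first CUDA
# allocation, so set it before importing torch. 128 is an illustrative
# split threshold in MiB, not a recommendation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported after setting the variable so it takes effect
```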

Since I'm working with a time series that has quite clear seasonality, I really hope I can fit the model with cubic interpolation to significantly improve my forecast.
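For what it's worth, my guess at why cubic is so much heavier (an assumption on my part, not something I verified in the library's source): PyTorch's F.interpolate has no 1D cubic mode, so upsampling the forecast knots to the full horizon presumably goes through bicubic interpolation on a 4D tensor, which materializes much larger intermediate buffers than the 1D linear/nearest paths. A minimal sketch of that kind of operation:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: a batch of windows, each carrying a handful of
# forecast knots that must be upsampled to the full horizon.
batch, n_knots, horizon = 1024, 8, 96
knots = torch.randn(batch, n_knots)

# mode='bicubic' requires a 4D (N, C, H, W) tensor, so the knots are wrapped
# into a fake 1-pixel-high image; linear/nearest only need a 3D tensor.
upsampled = F.interpolate(
    knots[:, None, None, :],  # (batch, 1, 1, n_knots)
    size=(1, horizon),
    mode="bicubic",
    align_corners=False,
)[:, 0, 0, :]                 # back to (batch, horizon)

print(upsampled.shape)  # torch.Size([1024, 96])
```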

Versions / Dependencies

I'm using a Kaggle Notebook with two T4 GPUs, each with 15 GB of memory. Here are the libraries I'm using:

Reproduction script

```python
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

trainer_kwargs = dict(accelerator='gpu', devices=-1, strategy='ddp_notebook')
model = NHITS(h=OUR_HORIZON,
              input_size=OUR_HORIZON * 2,
              start_padding_enabled=True,
              interpolation_mode='cubic',
              **trainer_kwargs)
fcst = NeuralForecast(models=[model], freq='h')
fcst.fit(pd.DataFrame({'unique_id': df_train.unique_id,
                       'ds': df_train.DateTime,
                       'y': df_train.Renewable}),
         val_size=OUR_HORIZON)
```
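In case it helps anyone who hits the same wall: I believe the model also exposes batching parameters such as batch_size and windows_batch_size (check the NHITS signature in your version; defaults may differ), and lowering them should reduce peak GPU memory at the cost of slower training. A sketch with illustrative values:

```python
# Hypothetical workaround sketch: shrink the per-step workload so the
# interpolation buffers fit in GPU memory. Values are illustrative only.
model = NHITS(h=OUR_HORIZON,
              input_size=OUR_HORIZON * 2,
              start_padding_enabled=True,
              interpolation_mode='cubic',
              batch_size=16,            # fewer series per training batch
              windows_batch_size=256,   # fewer sampled windows per batch
              **trainer_kwargs)
```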

Issue Severity

Medium: It is a significant difficulty but I can work around it.

stephanielees commented 2 weeks ago

I somehow solved this by changing the unique_id, so I'll close this issue.