jdb78 / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License

RAM memory getting filled on training #286

Closed pnmartinez closed 3 years ago

pnmartinez commented 3 years ago

Expected behavior

Following the N-beats tutorial on the docs, but with my own dataset.

It's a large dataset (+1.5M rows), but I've trimmed it to 100k for dev purposes.

Actual behavior

RAM fills up over the training epochs, even with gpus=1, and the kernel eventually crashes.

I wonder whether, under the hood, either PyTorch Lightning or PyTorch Forecasting makes the classic mistake of total_loss += loss, which keeps every step's computation graph in memory, instead of the recommended total_loss += loss.item().
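For readers unfamiliar with the pattern mentioned above, here is a minimal sketch in plain PyTorch. It is not taken from PyTorch Lightning or PyTorch Forecasting, and the model, data, and variable names (graph_total, float_total) are made up for illustration.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)

graph_total = 0.0  # will become a tensor attached to the autograd graph
float_total = 0.0  # stays a plain Python float

for _ in range(3):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Accumulating the tensor keeps a reference to every step's graph,
    # so memory grows with the number of steps.
    graph_total = graph_total + loss
    # .item() copies the scalar out, so each step's graph can be freed.
    float_total += loss.item()

print(type(graph_total).__name__, graph_total.requires_grad)
print(type(float_total).__name__)
```

The two totals hold the same number; only the first one keeps the autograd history of every iteration alive.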

jdb78 commented 3 years ago

To narrow down the problem, do you have the same issue when running with log_interval=-1?

pnmartinez commented 3 years ago

Hi,

I am a bit of a newcomer to PyTorch, so could you tell me which method takes this parameter? Is it .fit(log_interval=-1)?

I've done some googling, but can't seem to find it.

Thanks in advance.

jdb78 commented 3 years ago

It is an NBeats init parameter, so pass it to the from_dataset method.

pnmartinez commented 3 years ago

Surprisingly, I haven't had that problem in a new run of the same notebook today.

I used log_interval=-1 when building the net: could that be what fixed it?

I have one more question: does N-BEATS forecast all the prediction steps at once (multi-step prediction), as in the image below, or does it make a single-step prediction at every point and then collect them for the plot?

[image]

jdb78 commented 3 years ago

It is a multi-step architecture.
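To make the distinction concrete, here is a toy sketch contrasting the two strategies. It is not N-BEATS itself; the function names and the simple linear-trend "models" are hypothetical stand-ins.

```python
from typing import Callable, List


def recursive_forecast(
    history: List[float], step: Callable[[List[float]], float], horizon: int
) -> List[float]:
    """Single-step strategy: predict one value, feed it back in, repeat."""
    window = list(history)
    predictions = []
    for _ in range(horizon):
        next_value = step(window)
        predictions.append(next_value)
        window.append(next_value)  # the prediction becomes an input
    return predictions


def direct_forecast(
    history: List[float], model: Callable[[List[float]], List[float]]
) -> List[float]:
    """Multi-step strategy: one call returns the whole horizon."""
    return model(history)


def trend_step(window: List[float]) -> float:
    # Toy one-step model: continue the last observed difference.
    return window[-1] + (window[-1] - window[-2])


def trend_horizon(window: List[float], horizon: int = 3) -> List[float]:
    # Toy multi-step model: emit all horizon values in a single pass.
    slope = window[-1] - window[-2]
    return [window[-1] + slope * (k + 1) for k in range(horizon)]


history = [1.0, 2.0, 3.0]
recursive = recursive_forecast(history, trend_step, horizon=3)
direct = direct_forecast(history, trend_horizon)
print(recursive, direct)
```

In the multi-step case the model never sees its own predictions as inputs, which is the behaviour described above for N-BEATS.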

Question: If you are using log_interval > 0, can you reproduce the memory issue? If yes, this is a bug we need to fix.