Closed MarcBresson closed 5 months ago
Hi @MarcBresson, we already support this. Have a look at this response.
Thank you for your quick answer! I didn't know PyTorch Lightning supported that. I will dive deeper to understand how it does the subsampling.
I stumbled upon #1348 that was mentioned in #2129, and I think my feature request was exactly the same. Should #1348 be closed too, stating that it can be done with PyTorch Lightning?
Note that the samples are obtained from the time series at the dataset level; it is not the number of samples processed during one epoch at the trainer level. That should also answer your question about how the subsampling is performed ;) Closing this one as the original question was answered (hooray for Lightning).
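To make the trainer-level cap concrete: PyTorch Lightning's `Trainer` accepts a `limit_train_batches` argument that stops each epoch after a fixed number of batches. The loop below is an illustrative pure-Python sketch of that behaviour (not Lightning's actual internals), using a hypothetical `train_one_epoch` helper:

```python
from itertools import islice

def train_one_epoch(batches, limit_train_batches):
    """Illustrative sketch, not Lightning internals: process at most
    `limit_train_batches` batches per epoch, mimicking what passing
    `Trainer(limit_train_batches=...)` achieves."""
    samples_seen = 0
    for batch in islice(batches, limit_train_batches):
        samples_seen += len(batch)  # a real loop would run forward/backward here
    return samples_seen

# A toy "epoch" of 10 batches of 4 samples each, capped at 3 batches:
batches = [[0, 1, 2, 3] for _ in range(10)]
print(train_one_epoch(iter(batches), limit_train_batches=3))  # → 12
```

If the cap exceeds the number of available batches, `islice` simply exhausts the iterator and the whole dataset is seen, which matches Lightning's behaviour of treating the limit as an upper bound.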
feature request
Is your feature request related to a current problem? Please describe. I once read an issue on the GluonTS repository about why they use both the batch_size and the number_of_batch_per_epoch (effectively fixing the number of samples per epoch). They argue that, with panel time series, datasets can become extremely large. Fixing both parameters is then a means to avoid overly long training phases by limiting the number of samples seen per epoch.
Describe proposed solution A new parameter number_of_batch_per_epoch could be added as a way to fix the number of batches (and hence samples) per epoch. Alternatively, we could have a number_of_sample_per_epoch parameter, in which case number_of_batch_per_epoch would be computed automatically. This would pose an issue if the requested number of samples is larger than the dataset: should we throw an error, or just a warning while still training on the entire dataset?
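The automatic computation and the open question can be sketched in a few lines. The helper name and the warn-and-clamp policy are hypothetical illustrations of the proposal, not an actual darts or GluonTS API:

```python
import math
import warnings

def batches_per_epoch(number_of_sample_per_epoch, batch_size, dataset_size):
    """Hypothetical helper mirroring the proposed parameters: derive
    number_of_batch_per_epoch from a requested number of samples per epoch."""
    if number_of_sample_per_epoch > dataset_size:
        # One possible answer to the open question: warn and fall back to
        # training on the entire dataset instead of raising an error.
        warnings.warn("requested more samples than the dataset contains; "
                      "training on the entire dataset instead")
        number_of_sample_per_epoch = dataset_size
    return math.ceil(number_of_sample_per_epoch / batch_size)

print(batches_per_epoch(1000, 32, 10_000))  # → 32 (ceil(1000 / 32))
```

The ceiling ensures the requested sample count is reached even when it is not a multiple of the batch size; the last batch of the epoch may simply be partial.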