orion-junkins / river-level-forecasting

Exploring various neural network architectures for river level forecasting
MIT License

OOM with Too Many Coordinates #81

Closed (douglasdennis closed this 1 year ago)

douglasdennis commented 1 year ago

When a catchment has too many coordinates to train on, an OOM happens. This comes from TrainingDataset loading all of the training data up front. We will need to stream training data in, one or two coordinates at a time, instead of loading all of them at once in the TrainingDataset class.
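A minimal sketch of what "one or two coordinates at a time" could look like, written as a plain Python generator. The names `stream_coordinates` and `load_one` are hypothetical and are not part of the TrainingDataset class; the real loader, its return type, and how the model consumes batches would all differ.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Coordinate = Tuple[float, float]  # assumed (lon, lat) pair; illustrative only
T = TypeVar("T")


def stream_coordinates(
    coords: Iterable[Coordinate],
    load_one: Callable[[Coordinate], T],  # hypothetical per-coordinate loader
    batch_size: int = 2,
) -> Iterator[List[T]]:
    """Yield loaded data for at most `batch_size` coordinates at a time."""
    batch: List[T] = []
    for coord in coords:
        # Only the current batch is held in memory, not every coordinate.
        batch.append(load_one(coord))
        if len(batch) == batch_size:
            yield batch
            batch = []  # drop the reference so the old batch can be freed
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch
```

This only bounds memory if the consumer does not accumulate the batches it receives, so whatever trains on the data has to accept it incrementally.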

douglasdennis commented 1 year ago

I'm a goober. Streaming data in won't help, since all of the datasets are merged together into a single large one anyway. Investigating #83 may help with OOMs (it would effectively cut the memory used by the data in half). Another, less fun, option is to run through the pipeline and look for leaks.
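For the leak-hunting option, one possible starting point is the standard-library `tracemalloc` module, which can compare allocations before and after a single pipeline step. This is only a sketch: `top_allocations` and `step` are hypothetical names, and `tracemalloc` only sees allocations made through Python's allocator, so memory held by native extensions may not show up.

```python
import tracemalloc
from typing import Any, Callable, List, Tuple


def top_allocations(
    step: Callable[[], Any],  # hypothetical: any zero-argument pipeline step
    limit: int = 5,
) -> Tuple[Any, List[tracemalloc.StatisticDiff]]:
    """Run `step` and report the source lines whose allocations changed most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = step()  # keep the result alive so its memory is still counted
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Differences are grouped by source line, largest changes first.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]
```

Calling this around each stage in turn and printing the returned entries gives a rough picture of which stage keeps the most memory alive after it returns.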

douglasdennis commented 1 year ago

Actually, I did find a way to stream data in. However, it requires the ensemble model to be aware of the streaming. This is further evidence in favor of writing a custom ensembler.

douglasdennis commented 1 year ago

This was found to no longer be an issue on HPC. Additionally, upstream dependencies appear to have reduced their own memory issues. Closing.