libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Reuse Existing Cuda Streams each Epoch #308

Open warner-benjamin opened 1 year ago

This PR modifies the FFCV Loader and EpochIterator to create one set of CUDA streams and reuse them in every epoch, instead of creating new CUDA streams each epoch.

The current approach of creating new CUDA streams every epoch can cause GPU memory allocation to grow when a GPU transform is used, because in each epoch the transform allocates new memory on the newly created stream. This doesn't cause any errors, since the prior epochs' allocations can be reused. But it does make GPU memory usage harder to track, and it can hide real memory overflow errors.
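To illustrate the difference, here is a minimal sketch of the two strategies. It is not the FFCV implementation: `CountingStream`, `PerEpochStreams`, and `ReusedStreams` are hypothetical names, and `CountingStream` is a stand-in for a real CUDA stream (for example `torch.cuda.Stream`) so the sketch runs without a GPU.

```python
from typing import Callable, List, Optional


class CountingStream:
    """Hypothetical stand-in for a CUDA stream; counts how many were built."""

    created = 0  # total number of streams constructed so far

    def __init__(self) -> None:
        type(self).created += 1


class PerEpochStreams:
    """Models the behavior described as current: new streams every epoch."""

    def __init__(self, num_streams: int,
                 factory: Callable[[], object] = CountingStream) -> None:
        self.num_streams = num_streams
        self.factory = factory

    def start_epoch(self) -> List[object]:
        # A fresh list of streams is built on every call.
        return [self.factory() for _ in range(self.num_streams)]


class ReusedStreams:
    """Models the proposed behavior: build the streams once, then reuse them."""

    def __init__(self, num_streams: int,
                 factory: Callable[[], object] = CountingStream) -> None:
        self.num_streams = num_streams
        self.factory = factory
        self._streams: Optional[List[object]] = None  # built on first epoch

    def start_epoch(self) -> List[object]:
        if self._streams is None:
            self._streams = [self.factory() for _ in range(self.num_streams)]
        # Later epochs get the same stream objects back.
        return self._streams
```

With the per-epoch strategy the number of streams created grows with the number of epochs, while the reuse strategy stays at `num_streams` no matter how many epochs run.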

I don't think this will cause any issues with distributed training, but I am unable to test it.

If wanted, I can add a flag, similar to recompile, that recreates the CUDA streams every epoch to match the current behavior.
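Such a flag could look roughly like the sketch below. The class `StreamCache` and the parameter name `recreate_streams` are hypothetical and not part of FFCV; `factory` is assumed to be whatever builds a stream (in real code presumably `torch.cuda.Stream`).

```python
from typing import Callable, List, Optional


class StreamCache:
    """Hypothetical holder for a loader's streams with an opt-out flag."""

    def __init__(self, num_streams: int, factory: Callable[[], object],
                 recreate_streams: bool = False) -> None:
        self.num_streams = num_streams
        self.factory = factory
        # Hypothetical flag: True restores the create-every-epoch behavior.
        self.recreate_streams = recreate_streams
        self._streams: Optional[List[object]] = None

    def get(self) -> List[object]:
        """Return the streams to use for the epoch that is about to start."""
        if self.recreate_streams or self._streams is None:
            self._streams = [self.factory() for _ in range(self.num_streams)]
        return self._streams
```

Defaulting the flag to `False` would make stream reuse the standard behavior, while anyone relying on fresh streams per epoch could opt back in.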