Closed w4ffl35 closed 2 years ago
A side note on this fix: depending on your video card, it is still possible to run out of memory even with this code in place while running mega, if you do anything else intensive on the GPU (that includes multiple tabs in Firefox or a YouTube video). I am testing locally on an RTX 2080 Super (8 GB of GPU RAM) on Debian, monitoring GPU usage with nvtop and with very few other applications running.
Hmm, right now my solution is adding this in the cell right after the generation call:

```python
display(model.generate_image(text, seed, grid_size))
torch.cuda.empty_cache()
```
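If you generate in a loop, the same idea can be sketched as below. This is only an illustration: `model`, the prompt list, and the `generate_image(text, seed, grid_size)` signature are assumed from the notebook above, not verified against the library.

```python
import gc

import torch

def generate_batch(model, prompts, seed=0, grid_size=3):
    """Generate one image per prompt, clearing the CUDA cache between runs."""
    images = []
    for text in prompts:
        images.append(model.generate_image(text, seed, grid_size))
        # Drop Python references first, then return cached allocator
        # blocks to the driver so fragmentation does not accumulate.
        gc.collect()
        torch.cuda.empty_cache()
    return images
```

Note that `empty_cache()` only releases *unused* cached blocks; tensors still referenced by Python stay allocated, which is why the `gc.collect()` call comes first.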
Thanks for this. They are having issues on replicate too where they have to restart the server every 10 minutes. I wonder if clearing cuda cache would fix that problem
Oh interesting, I'd love to find out if this helps. There are probably complementary solutions involving `PYTORCH_CUDA_ALLOC_CONF`, which is mentioned in the error message that pops up when you run out of memory.
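For reference, that variable is set in the environment before launching the process. The `max_split_size_mb` knob is the one PyTorch's OOM message usually points at; the value below is only an assumption that would need tuning per card:

```shell
# Cap the size of split allocator blocks to reduce fragmentation
# (128 is a commonly tried starting value, not a verified optimum).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
```

This changes allocator behavior rather than freeing memory outright, so it is worth combining with the `empty_cache()` approach rather than replacing it.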
Prior to this fix, the mega model would more often than not fail when run in a loop. Clearing the cache seems to fix the issue.