kuprel / min-dalle

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch
MIT License

Fixes #16 - mega model running out of memory #57

Closed: w4ffl35 closed this 2 years ago

w4ffl35 commented 2 years ago

Prior to this fix, the mega model would (more often than not) run out of memory when running in a loop.

Clearing the CUDA cache seems to fix the issue.
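
For reference, a minimal sketch of what the fix amounts to, assuming a simple generation loop (the MinDalle constructor arguments and prompts here are illustrative, not the exact PR diff):

import torch
from min_dalle import MinDalle

# Illustrative loop: clear the CUDA caching allocator between generations so
# memory cached by the previous iteration is released before the next
# mega-model forward pass.
model = MinDalle(is_mega=True, is_reusable=True)

prompts = ["a comfy chair", "a lighthouse at dusk", "a bowl of ramen"]
for text in prompts:
    image = model.generate_image(text, seed=-1, grid_size=1)
    image.save(text.replace(" ", "_") + ".png")
    torch.cuda.empty_cache()  # release unused cached blocks back to the driver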

w4ffl35 commented 2 years ago

A side note on this fix: depending on your video card, it is still possible to run out of memory while running mega even with this code in place (if you do anything else intensive on the GPU, which includes multiple tabs in Firefox or a video on YouTube). I am testing locally on an RTX 2080S (8 GB of GPU RAM) on Debian, monitoring GPU usage with nvtop and with very few other applications running.

YukiSakuma commented 2 years ago

Hmm, right now my solution is adding this in the cell, after the display call:

display(model.generate_image(text, seed, grid_size))
torch.cuda.empty_cache()
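
For what it's worth, torch.cuda.empty_cache() only releases memory that PyTorch's caching allocator is holding but not currently using; tensors that are still referenced (for example a generated image kept on the GPU) are not freed, so it mainly helps with memory that would otherwise stay cached between generations.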
kuprel commented 2 years ago

Thanks for this. They are having issues on Replicate too, where they have to restart the server every 10 minutes. I wonder if clearing the CUDA cache would fix that problem.

w4ffl35 commented 2 years ago

> Thanks for this. They are having issues on Replicate too, where they have to restart the server every 10 minutes. I wonder if clearing the CUDA cache would fix that problem.

Oh interesting, I'd love to find out if this helps. There are probably solutions that can be combined with this involving PYTORCH_CUDA_ALLOC_CONF, which is mentioned in the error messages that pop up when running out of memory.
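
If anyone wants to experiment with that, a hedged sketch of what such a tweak might look like (the 128 MB value is only an example to try, not a recommendation from this PR):

import os

# PYTORCH_CUDA_ALLOC_CONF is read when the CUDA caching allocator initializes,
# so it needs to be set before the first CUDA allocation (i.e. before loading
# the model). max_split_size_mb limits how large a cached block the allocator
# will split, which can reduce fragmentation-related out-of-memory errors.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"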