GPU memory purging - Githubissues

Indeed, the fact that GPU memory isn't completely freed after a "Generate" run is a huge problem. Even with 24 GB of GPU memory, the GPU is practically fully occupied after a "Generate" run, so it is not possible, for example, to run even a smaller LLM on a local Ollama server to optimize prompts or answer other kinds of questions.

Even if the current plan is to keep as much cached as possible, Forge UI needs a configuration option that lets us force GPU memory to be freed after each "Generate" run.

This comes on top of all the other problems with the current GPU memory management: memory not being freed when it should be, occasional segmentation faults, and even corrupted pycache files after "Generate" runs, even when the preceding run was successful.
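
To make the request concrete, here is a rough sketch of what a "force free after Generate" option could do in a PyTorch-based UI. This is not Forge's actual code: `free_gpu_memory`, `generate_with_cleanup`, and the `force_free_after_generate` flag are hypothetical names. Only `gc.collect()`, `torch.cuda.empty_cache()`, and `torch.cuda.ipc_collect()` are real calls. Note that `empty_cache()` only returns cached blocks that are no longer in use; models that are still referenced would have to be moved to the CPU or dropped first.

```python
import gc

try:
    import torch  # optional; the sketch degrades to a plain gc pass without it
except ImportError:
    torch = None


def free_gpu_memory(force: bool = True) -> bool:
    """Hypothetical helper: release cached GPU memory if `force` is set.

    Returns True if a cleanup pass was run, False if it was skipped.
    """
    if not force:
        return False
    # Drop unreachable Python objects first so their tensors become freeable.
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Hand unused cached blocks back to the driver so that other
        # processes (e.g. a local LLM server) can allocate them.
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
    return True


def generate_with_cleanup(generate_fn, *args,
                          force_free_after_generate: bool = True, **kwargs):
    """Hypothetical wrapper: run one "Generate" call, then clean up.

    `generate_fn` stands in for whatever function performs the run.
    """
    try:
        return generate_fn(*args, **kwargs)
    finally:
        # Runs on success and on failure, so a crashed run also frees memory.
        free_gpu_memory(force_free_after_generate)
```

With something like this, a user who wants to share the GPU with another process would leave the option on, while a user who prefers faster repeated runs would turn it off and keep the cache.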

lllyasviel / stable-diffusion-webui-forge

GPU memory purging #1937