city96 / ComfyUI-GGUF

GGUF Quantization support for native ComfyUI models
Apache License 2.0

Recent OPS refactor PR causes memory issues with --lowvram/--disable-smart-memory #93

Closed: RandomGitUser321 closed this issue 1 week ago

RandomGitUser321 commented 2 weeks ago

The refactor causes the models to consume a lot more system memory. With a basic workflow using Flux Q8_0 and T5 Q5_K_M, I'm sitting at 28.8 GB / 32 GB of system RAM after inference.

More info in https://github.com/city96/ComfyUI-GGUF/pull/92
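For context, the flags named in the issue title are passed on the ComfyUI command line at launch. A minimal sketch, assuming the standard `main.py` entry point (the flag descriptions reflect my understanding of ComfyUI's behavior, not anything stated in this thread):

```shell
# Sketch: launching ComfyUI with the memory-related flags from this issue.
# --lowvram             split model layers and offload aggressively to system RAM
# --disable-smart-memory unload models from VRAM after each run instead of caching them
python main.py --lowvram --disable-smart-memory
```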

RandomGitUser321 commented 2 weeks ago

After both 717a0e1 and c8923a4, the problem seems fixed, and it survived the gauntlet of switching models a couple of times. System memory usage is down roughly 8 GB (from ~29/32 GB before to ~21/32 GB now). This is with an 8 GB GPU and --disable-smart-memory.
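Usage figures like the ones above can be captured programmatically rather than read off Task Manager. A minimal sketch using `psutil` (a generic helper I'm assuming here, not part of this repo or ComfyUI):

```python
import psutil


def system_memory_gb():
    """Return (used, total) system RAM in GiB, as reported by psutil."""
    vm = psutil.virtual_memory()
    gib = 1024 ** 3
    return vm.used / gib, vm.total / gib


# Call once before loading models and once after inference to
# measure the delta discussed in this issue.
used, total = system_memory_gb()
print(f"system RAM: {used:.1f}/{total:.1f} GiB")
```

Logging this before and after a model switch is a quick way to confirm whether a fix like the one in the linked PR actually reduces resident memory.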

city96 commented 1 week ago

Closing this since it's technically fixed, though there's still a spike during the initial load on Windows.

RandomGitUser321 commented 1 week ago

Good call, I forgot to close it after that PR.