Karthik-Dulam opened 1 year ago (status: Open)
You may want to check this out: https://github.com/bes-dev/stable_diffusion.openvino/pull/58/commits/5d8b9ddfef8245b6c829f9f75b3b67025e7c8f5b
I have a solution in my fork that keeps the model in memory between prompts: https://github.com/Drake53/stable_diffusion.openvino/commit/1862800dc4fc220dd81f1f22d6b2bf1ad36eb9b2
What is the recommended way to batch-process prompts, so the model isn't loaded and unloaded from memory for every prompt? Reloading each time is wasteful and time-consuming, especially on devices with less than 10 GB of RAM, where the model spills into swap and writing to disk is terribly slow.
Ideally one would process multiple prompts and generate the images in sequence without freeing the model from memory. I have seen this done in some Colab notebooks, though those were using GPUs.
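The pattern being asked for can be sketched as: construct the engine once, then reuse it in a loop over prompts. This is only an illustrative sketch — `FakeEngine`, `run_batch`, and the `generate` method are hypothetical stand-ins, not the actual API of stable_diffusion.openvino; in practice you would replace `FakeEngine` with the real model class and pass each prompt to its inference call.

```python
class FakeEngine:
    """Hypothetical stand-in for a real diffusion engine: expensive to
    construct (loads gigabytes of weights), cheap to call afterwards."""

    def __init__(self):
        # In the real case this is the slow, memory-heavy step:
        # reading model weights from disk into RAM.
        self.loaded = True

    def generate(self, prompt):
        # Real inference would run the diffusion loop here.
        return f"image for: {prompt}"


def run_batch(prompts, engine_factory=FakeEngine):
    """Load the model once, then generate an image per prompt."""
    engine = engine_factory()  # paid once, not once per prompt
    return [engine.generate(p) for p in prompts]


if __name__ == "__main__":
    for result in run_batch(["a cat in space", "a dog on a skateboard"]):
        print(result)
```

The point of the sketch is purely structural: the expensive constructor sits outside the loop, so only the per-prompt inference cost is repeated.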