LeptonWu opened 3 weeks ago
Model caching reduces the startup time from minutes to a few seconds if a cache for the model already exists.
There is a wiki page for Model Caching: https://github.com/vladmandic/automatic/wiki/OpenVINO#model-caching
i'm open to suggestions, but disabling cache by default is not a likely one. anyhow, converting this to feature request.
I'll come back with more data, since I don't remember noticing unusual latency with the OpenVINO cache disabled. But it seems that whenever you change things like width and height, OpenVINO has to regenerate the model? That could be the issue here.
Are we doing any kind of GC for the cache?
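For illustration, a GC pass could prune the least-recently-used cache blobs until the directory fits under a size budget. This is only a sketch; the function name, the flat-file cache layout, and the budget are my assumptions, not anything the project currently implements:

```python
from pathlib import Path

def prune_cache(cache_dir: str, max_bytes: int) -> None:
    """Delete least-recently-accessed cache files until the total size fits max_bytes."""
    files = [p for p in Path(cache_dir).rglob("*") if p.is_file()]
    # Sort oldest access time first, so recently used blobs survive.
    files.sort(key=lambda p: p.stat().st_atime)
    total = sum(p.stat().st_size for p in files)
    for p in files:
        if total <= max_bytes:
            break
        total -= p.stat().st_size
        p.unlink()
```

Something like this could run at startup, before the cache is written to.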
Or maybe just check the free disk space and skip writing the cache if it's below some reasonable value?
Or just give a warning in the UI to let users know?
I only noticed the disk was full because of an error message saying the generated image couldn't be saved due to a full disk, which was totally unexpected....
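The free-space guard could look something like this. It's only a sketch: the 10 GB threshold, the function name, and checking the current directory's filesystem are illustrative assumptions:

```python
import shutil

MIN_FREE_BYTES = 10 * 1024**3  # assumed threshold: 10 GB

def caching_allowed(cache_dir: str = "cache") -> bool:
    """Return False (and warn) when free disk space is below the threshold."""
    free = shutil.disk_usage(".").free
    if free < MIN_FREE_BYTES:
        print(f"Warning: only {free / 1024**3:.1f} GB free; "
              f"skipping OpenVINO model cache writes to {cache_dir!r}")
        return False
    return True
```

The result could gate the `model_caching` option passed to torch.compile, so a nearly full disk silently falls back to uncached startup instead of failing mid-generation.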
Did a quick test on one machine. Hardware: CPU AMD 3700X, GPU Intel Arc A580.
Fresh start: time ./sd_ov.py
with the cache dir removed and cache set to "False" in the script.
Running it multiple times, it always looks like this:
real 0m19.838s
user 0m54.426s
sys 0m8.498s
Then set cache in the script to "True" and run time ./sd_ov.py
multiple times; it always looks like this:
real 0m14.622s
user 0m43.020s
sys 0m9.025s
The difference is only 5 seconds. Also, the first run with cache on always takes around 24 seconds; I think that's because it needs time to write to disk. The generated cache dir is around 7G for the 2G checkpoint used in this test, so it seems to take about 10 seconds to flush 7G of data to my SSD.
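For reference, the 7G figure can be measured with a small du-style helper (a sketch; the "cache" directory name matches the set_property call in my script):

```python
from pathlib import Path

def dir_size_bytes(path: str) -> int:
    """Total size of all regular files under path, like du -sb."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())

if Path("cache").is_dir():
    print(f"cache dir: {dir_size_bytes('cache') / 1024**3:.1f} G")
```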
#!/home/leptonwu/automatic/venv_openvino/bin/python
import torch
import openvino.torch  # registers the "openvino" torch.compile backend
from openvino.runtime import Core
from openvino import properties
from diffusers import StableDiffusionPipeline

device = 'GPU'
cache = True

# Point the OpenVINO runtime cache at a local "cache" directory.
Core().set_property({properties.cache_dir: "cache"})

pipe = StableDiffusionPipeline.from_single_file(
    "realisticVisionV60B1_v51HyperVAE.safetensors")
pipe.unet = torch.compile(
    pipe.unet,
    backend="openvino",
    options={"device": device, "model_caching": cache},
)

torch.xpu.manual_seed_all(0)
# pipe() returns a pipeline output object; the image is in .images[0]
result = pipe("black cat", guidance_scale=1.5, width=512, height=512,
              num_inference_steps=6)
result.images[0].save("out.jpg")
Issue Description
I am trying CPU diffusion and it seems OpenVINO gives me the best performance: I can generate a 512x512 image with the recommended settings from Realistic Vision V6.0 B1 in around 40 seconds on a 5700G. The only issue is that when I try different models and configurations, the cache directory keeps growing until I run out of disk space (I had around 65G free after installation and the cache grew to 65G). I did some investigation and it seems the disk cache is on by default for OpenVINO. I'm not sure the cache matters that much, since I didn't notice an obvious performance difference after disabling it.
Maybe we should disable the OpenVINO model cache by default. Or at least we should say explicitly somewhere that it can grow rapidly if users keep trying different models, etc.
Version Platform Description
No response
Relevant log output
No response
Backend
Diffusers
UI
Standard
Branch
Master
Model
StableDiffusion 1.5
Acknowledgements