Seegee opened this issue 1 year ago
No, the model needs to be converted from ONNX to OpenVINO, and some code changes have to be made. I don't think it will happen until we can document how to convert custom models from ONNX to IR, like Shadowpower did for wd-1.3.
What if we convert the model ourselves?
Yes, you can. I did it for SD2.1 and it works fine. You can even convert it with dynamic axes to set arbitrary resolution, not only 512x512. But, of course, it is way slower.
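For anyone curious what "convert it with dynamic axes" looks like in practice: the dynamic shapes are declared at ONNX export time. A minimal sketch of such a mapping for the UNet inputs (the input names here are assumptions, not taken from any notebook; check your export's actual input names):

```python
# Hypothetical dynamic_axes mapping for torch.onnx.export, so the exported
# UNet accepts arbitrary batch size and spatial resolution. The input names
# ("sample", "timestep", "encoder_hidden_states") are assumptions.
dynamic_axes = {
    "sample": {0: "batch", 2: "height", 3: "width"},   # latent input
    "timestep": {0: "batch"},
    "encoder_hidden_states": {0: "batch", 1: "sequence"},
}

# torch.onnx.export(unet, example_inputs, "unet.onnx",
#                   input_names=list(dynamic_axes), dynamic_axes=dynamic_axes)

# With dynamic spatial axes, OpenVINO keeps those dimensions symbolic
# instead of freezing them at 512x512 -- hence the arbitrary-resolution
# support, at the cost of some speed.
print(sorted(dynamic_axes))
```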
Yes, you can. I did it for SD2.1 and it works fine.
Mind telling us how to do that?
Sure, I mentioned this already somewhere here. It's an official example from the OpenVINO team: https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/225-stable-diffusion-text-to-image/225-stable-diffusion-text-to-image.ipynb Although you would have to modify it for your model, or add the VAE encoder if you need img2img.
We updated the notebooks, so the converted IR will work directly with this demo.
@RedAndr so SD2.1 can be converted using the conversion steps mentioned in https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/225-stable-diffusion-text-to-image/225-stable-diffusion-text-to-image.ipynb ?
@arisha07 Yes, it's the same. You can change the resolution since SD2.1 native resolution is 768x768. It could be even higher but requires more memory. The only difference with SD1.4 is prediction_type, which must be 'v_prediction' instead of 'epsilon'.
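A rough back-of-the-envelope on why higher resolution costs more memory (my own arithmetic, not from this thread): the SD VAE downsamples by 8x, so the UNet works on a latent grid whose area grows quadratically with resolution.

```python
# Latent grid size for a given output resolution (SD VAE downsamples by 8).
def latent_hw(height, width, factor=8):
    return height // factor, width // factor

# 512x512 -> 64x64 latents; 768x768 -> 96x96 latents.
h512 = latent_hw(512, 512)
h768 = latent_hw(768, 768)

# Latent-area ratio: a rough proxy for how UNet memory/compute grows.
ratio = (h768[0] * h768[1]) / (h512[0] * h512[1])
print(h512, h768, ratio)  # (64, 64) (96, 96) 2.25
```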
Thank you so much for your response. I had a few more questions: did you keep the same tokenizer ('openai/clip-vit-large-patch14') as before? I thought there was a change with SD2/SD2.1. Also, where did you make the prediction_type change?
Yes, I did change the tokenizer to the open_clip one, namely ViT-H-14, but I didn't notice a difference and reverted it. It could just be my case, so you'd better experiment yourself. However, it requires some code changes, as far as I remember.
The prediction_type is a parameter of the scheduler, for example: LMSDiscreteScheduler(..., prediction_type='v_prediction')
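For context on what that flag changes: with 'v_prediction' the model predicts v rather than the noise epsilon. A small numpy sketch of the standard relationship (textbook definition, not code from this thread), showing that x0 and epsilon are both recoverable from the v target:

```python
import numpy as np

# v-prediction parameterization: with x_t = a*x0 + s*eps and a^2 + s^2 = 1,
# the model is trained to predict v = a*eps - s*x0 instead of eps directly.
rng = np.random.default_rng(0)
x0, eps = rng.normal(size=4), rng.normal(size=4)
a, s = np.cos(0.3), np.sin(0.3)          # any point with a^2 + s^2 = 1

x_t = a * x0 + s * eps                   # noised sample
v = a * eps - s * x0                     # v-prediction target

# Both x0 and eps can be reconstructed from (x_t, v), which is why the
# scheduler must know which quantity the network outputs:
x0_rec = a * x_t - s * v
eps_rec = s * x_t + a * v
assert np.allclose(x0_rec, x0) and np.allclose(eps_rec, eps)
```

If the scheduler assumes 'epsilon' while the network actually outputs v (or vice versa), every denoising step decodes the wrong quantity, which is consistent with the corrupted images reported below.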
Okay, I was able to get SD2.1 converted. A few changes I had to make in the notebook from OpenVINO were:
But when adding prediction_type='v_prediction' to the OpenVINO notebook, the generated image looked corrupted. @RedAndr, did you see the same behavior? Also, when you were using open_clip's ViT-H-14 tokenizer, did you make any changes to the following lines?
tokens = self.tokenizer(
prompt,
padding="max_length",
max_length=self.tokenizer.model_max_length,
truncation=True
).input_ids
Thank you once again for your help :)
@arisha07 I didn't use the notebook to generate any images, sorry. No idea why it doesn't work. The code looks fine to me.
Yes, I did change the parameters for the tokenizer since it's different. In fact, I removed all of them:
tokenizer = open_clip.get_tokenizer('ViT-H-14')
tokens = tokenizer(prompt).tolist()[0]
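The practical difference between the two calls: the HF tokenizer call above pads/truncates explicitly to model_max_length (77 for CLIP), while open_clip's tokenizer returns a fixed-length 77-token tensor by default, so those arguments become unnecessary. A pure-Python sketch of the pad-to-max-length behaviour, with made-up token ids (real CLIP tokenizers also add BOS/EOS tokens, omitted here):

```python
# Sketch of what padding="max_length" + truncation=True do, using fake ids.
def pad_to_max_length(ids, max_length=77, pad_id=0):
    ids = ids[:max_length]                           # truncation=True
    return ids + [pad_id] * (max_length - len(ids))  # padding="max_length"

tokens = pad_to_max_length([101, 202, 303])
print(len(tokens), tokens[:4])  # 77 [101, 202, 303, 0]
```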
No problem, glad it helped.
@RedAndr I was just curious to know if you ever tried converting "stabilityai/stable-diffusion-2-depth" to openvino IRs ?
@arisha07 No, I didn't, but I guess it shouldn't be a problem.
I think the UNet weights' shapes change with this model, so it might need some modifications.
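Right: the depth model conditions the UNet on a depth map, so (as I understand it, this is an assumption about the checkpoint, verify against the actual unet/config.json) the first convolution takes 5 input channels (4 latent + 1 depth) instead of 4, which changes the exported input and weight shapes:

```python
# Expected first-conv weight shapes (out_ch, in_ch, k, k). 320 is the first
# block width in SD2 -- an assumption here; check unet/config.json.
def conv_in_weight_shape(latent_ch, extra_cond_ch=0, out_ch=320, k=3):
    return (out_ch, latent_ch + extra_cond_ch, k, k)

print(conv_in_weight_shape(4))                   # plain SD2.1: (320, 4, 3, 3)
print(conv_in_weight_shape(4, extra_cond_ch=1))  # 2-depth:     (320, 5, 3, 3)
```

So a conversion script that hard-codes a 4-channel latent input would need its input shapes (and any dummy inputs used for tracing) adjusted.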
I tried to get this working on Arc GPUs; apparently I don't know what I'm doing with Jupyter notebooks, or just in general, and it didn't work, LOL. I would love for anyone to help me get OpenVINO on Arc working with Stable Diffusion 2.1 768x768 models, with a functional webui. I found this DirectML / ONNX project that actually ran on Arc... but generated garbage output. https://github.com/Amblyopius/Stable-Diffusion-ONNX-FP16
And just to be clear, I'm doing this for benchmarking and journalistic purposes. I wrote this article and I want to update it with "better" Stable Diffusion 2.1 testing. Drop me a note or email if you have working instructions for getting:
A functional web UI similar to Automatic 1111 so I can specify prompt, negative prompt, output resolution, steps, batch size, and batch count
Thanks! —Jarred
Is it possible to drop in the model for Stable Diffusion 1.5 or 2.0?