Closed rnwang04 closed 3 weeks ago
Please have a try with https://github.com/intel/neural-compressor. It can calibrate a model while trying to keep accuracy.
@rnwang04 Do you mind also letting us know which model you were quantizing?
@jgong5 Hi, thanks for the response. I was trying to quantize the UNet model in the Stable Diffusion pipeline. Below is my script for reference if you want to reproduce the issue (you may need to set return_dict=False manually). I also tried replacing input_sample with real inputs, but the result still did not meet my needs.
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert
qconfig = ipex.quantization.default_static_qconfig
from diffusers import StableDiffusionPipeline
import torch
model_id = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionPipeline.from_pretrained(model_id, use_auth_token=True)
prompt = "a photo of an astronaut riding a horse on mars"
original_image = pipe(prompt, guidance_scale=7.5)
original_image[0][0].save("astronaut_rides_horse_original.png")
unet = pipe.unet
user_model = unet
user_model.eval()
sample_latents = torch.randn((1, unet.in_channels, 64, 64),
                             generator=None,
                             device="cpu",
                             dtype=torch.float32)
input_sample = (torch.cat([sample_latents]),
                torch.tensor([980], dtype=torch.long),
                torch.randn((1, 77, 768),
                            generator=None,
                            device="cpu",
                            dtype=torch.float32))
prepared_model = prepare(user_model, qconfig, example_inputs=input_sample, inplace=False)
for x in [input_sample]:
    print(len(x))
    prepared_model(*x)
convert_model = convert(prepared_model)
with torch.no_grad():
    traced_model = torch.jit.trace(convert_model, input_sample)
    traced_model = torch.jit.freeze(traced_model)
# for inference
y = traced_model(*input_sample)
print(y[0].shape)
setattr(traced_model, "in_channels", 4)
setattr(traced_model, "device", torch.device('cpu'))
setattr(pipe, "unet", traced_model)
new_image = pipe(prompt, guidance_scale=7.5)
new_image[0][0].save("astronaut_rides_horse_ipex.png")
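One factor that may contribute to the accuracy loss is that the script calibrates on torch.randn inputs. Static quantization fixes its scales from the ranges seen during calibration, so calibration data that does not resemble real activations can lead to clipping. The sketch below illustrates that effect in pure Python with a simplified symmetric int8 scheme. It is not IPEX's observer implementation: quantize_dequantize, mean_abs_error, and the sample distributions are assumptions made up for illustration.

```python
import random


def quantize_dequantize(values, calib_max):
    """Simulate symmetric per-tensor int8 quantization with a scale fixed at calibration time."""
    scale = calib_max / 127.0
    out = []
    for v in values:
        q = round(v / scale)
        q = max(-128, min(127, q))  # clamp to the int8 range
        out.append(q * scale)
    return out


def mean_abs_error(a, b):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical "real" activations with a wider spread than unit-variance noise.
real = [random.gauss(0.0, 4.0) for _ in range(10_000)]
# Calibration data drawn from a standard normal, similar to what torch.randn produces.
calib_random = [random.gauss(0.0, 1.0) for _ in range(10_000)]

range_from_random = max(abs(v) for v in calib_random)
range_from_real = max(abs(v) for v in real)

err_random = mean_abs_error(real, quantize_dequantize(real, range_from_random))
err_real = mean_abs_error(real, quantize_dequantize(real, range_from_real))
print(f"error with random calibration: {err_random:.4f}")
print(f"error with real calibration:   {err_real:.4f}")
```

In this toy setup the range taken from random data is too narrow, so large values are clipped and the error grows. Whether this is the dominant cause for the UNet here is not established by the thread.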
@jingxu10 Thanks for your quick response! I have actually tried INC quantization with IPEX, but it failed to work. I will also report this issue to INC.
Got it. We will look into it.
I am unable to reproduce this issue from the given code snippet. Even after setting return_dict=False, I am getting the following issue.
This issue is reproducible with an updated code snippet.
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert
qconfig = ipex.quantization.default_static_qconfig
from diffusers import StableDiffusionPipeline
import torch
import functools
model_id = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionPipeline.from_pretrained(model_id, use_auth_token=True)
unet = pipe.unet
unet.forward = functools.partial(unet.forward, return_dict=False) # set return_dict=False as default
prompt = "a photo of an astronaut riding a horse on mars"
original_image = pipe(prompt, guidance_scale=7.5)
original_image[0][0].save("astronaut_rides_horse_original.png")
user_model = unet
user_model.eval()
sample_latents = torch.randn((1, unet.in_channels, 64, 64),
                             generator=None,
                             device="cpu",
                             dtype=torch.float32)
input_sample = (torch.cat([sample_latents]),
                torch.tensor([980], dtype=torch.long),
                torch.randn((1, 77, 768),
                            generator=None,
                            device="cpu",
                            dtype=torch.float32))
prepared_model = prepare(user_model, qconfig, example_inputs=input_sample, inplace=False)
for x in [input_sample]:
    print(len(x))
    prepared_model(*x)
convert_model = convert(prepared_model)
with torch.no_grad():
    traced_model = torch.jit.trace(convert_model, input_sample, strict=False)
    traced_model = torch.jit.freeze(traced_model)
# for inference
y = traced_model(*input_sample)
setattr(traced_model, "in_channels", 4)
setattr(traced_model, "device", torch.device('cpu'))
setattr(pipe, "unet", traced_model)
new_image = pipe(prompt, guidance_scale=7.5, width=512, height=512)
new_image[0][0].save("astronaut_rides_horse_ipex.png")
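The functools.partial line in the script rebinds forward on the instance so that return_dict defaults to False. The sketch below shows the same pattern on a toy class, without torch or diffusers. ToyUNet is a hypothetical stand-in, and its __call__ only mimics how torch.nn.Module dispatches to self.forward.

```python
import functools


class ToyUNet:
    """Hypothetical stand-in for a module whose forward() takes a return_dict flag."""

    def forward(self, sample, timestep, return_dict=True):
        result = sample + timestep
        if return_dict:
            return {"sample": result}
        return (result,)

    def __call__(self, *args, **kwargs):
        # Like torch.nn.Module, dispatch to whatever `forward` resolves to on the instance.
        return self.forward(*args, **kwargs)


unet = ToyUNet()
default_out = unet(1, 2)  # uses the original default, return_dict=True

# Rebind forward on the instance so return_dict defaults to False.
unet.forward = functools.partial(unet.forward, return_dict=False)
patched_out = unet(1, 2)

# A caller can still override the bound keyword explicitly.
explicit_out = unet(1, 2, return_dict=True)

print(default_out, patched_out, explicit_out)
```

Because the partial only changes the default, any caller that passes return_dict explicitly still gets the behavior it asked for.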
You also need to make the following change to the Stable Diffusion pipeline source code in diffusers:
# Change this line in pipeline_stable_diffusion.py
noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
# to
noise_pred = self.unet(latent_model_input, t, text_embeddings)[0]
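The change above is presumably needed because the traced model returns a plain tuple, which has no .sample attribute. As a sketch of an alternative to hard-coding either form, a small helper could accept both output styles. first_output and ToyOutput are hypothetical names for illustration and are not part of diffusers.

```python
from dataclasses import dataclass


@dataclass
class ToyOutput:
    """Hypothetical stand-in for an output object exposing a .sample attribute."""
    sample: object


def first_output(result):
    """Return the prediction from either an object with .sample or a tuple/list."""
    if hasattr(result, "sample"):
        return result.sample
    return result[0]


eager_style = ToyOutput(sample=[0.1, 0.2])  # shape of an eager call with return_dict=True
traced_style = ([0.1, 0.2],)                # shape of a call that returns a tuple

print(first_output(eager_style))
print(first_output(traced_style))
```

With such a helper, the pipeline line could wrap the UNet call instead of indexing or accessing .sample directly, assuming the first tuple element is always the noise prediction.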
cc @leslie-fang-intel @Xia-Weiwen and @XiaobingSuper
Hi, I am trying to use IPEX to quantize a UNet model following https://github.com/intel/intel-extension-for-pytorch/blob/v1.12.0/docs/tutorials/features/int8.md. The model can now be quantized, but the generation results become very poor. Is there any method (e.g. changing the mode or modifying some config) to avoid such low accuracy after quantization with IPEX?
My torch version: 1.12.1
My ipex version: 1.12.100
Thanks!