wxsms opened this issue 3 weeks ago (Open)
@wxsms could you try something like this?
```python
backbone.eval()
with torch.no_grad():
    modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)
```
And also move the other parts to CPU, like the VAE and CLIP. Please let me know if it works.
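A slightly fuller sketch of that suggestion, for anyone landing here: it assumes the standard diffusers SDXL pipeline layout (`pipe.unet` / `pipe.vae` / `pipe.text_encoder` / `pipe.text_encoder_2`) and that `pipe`, `args`, and the example's `modelopt_export_sd` helper are already in scope as in the quantization script; it is not the script's exact code.

```python
import torch

# Move everything except the backbone (UNet) off the GPU so the ONNX export
# has the whole card to itself.
pipe.vae.to("cpu")
pipe.text_encoder.to("cpu")
pipe.text_encoder_2.to("cpu")
torch.cuda.empty_cache()

backbone = pipe.unet.to("cuda")
backbone.eval()

# No autograd during tracing, so no activation buffers are kept alive.
with torch.no_grad():
    modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)
```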
Sadly, it does not work. I managed to export the ONNX model on an A800 and compile it on a 4090.
I'll take a look and get back to you; I have barely tested on the 4090. Just to confirm, can you export the FP16 SDXL model on a 4090?
Thank you, I will try it later.
Has there been any progress on this issue? I encountered the same problem on an RTX 4090. Eventually, I performed the ONNX model conversion on an A800. Using nvidia-smi, I noticed that the ONNX conversion process requires around 30GB of VRAM.
Model: SDXL-1.0
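If it helps to pin that number down, the peak can also be read from PyTorch itself rather than from nvidia-smi. A minimal sketch, assuming the export call from above is already set up:

```python
import torch

torch.cuda.reset_peak_memory_stats()

with torch.no_grad():
    modelopt_export_sd(backbone, f"{str(args.onnx_dir)}", args.model, args.format)

# Peak memory PyTorch actually allocated during the export; nvidia-smi reports
# more because it also counts the CUDA context and cached-but-free blocks.
print(f"Peak allocated: {torch.cuda.max_memory_allocated() / 2**30:.1f} GiB")
```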
I'm building an SDXL model in float16 using 2x RTX 4090s, so the total GPU memory available is ~48GB. However, the script in diffusers/quantization does not seem to be able to use both of them, and it raises an OOM error while exporting the ONNX model. I tried to export the model on the CPU, but it's too slow.
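For context (this is how PyTorch's exporter behaves in general, not something specific to this script): torch.onnx.export traces a single nn.Module on a single device, so two 24GB cards do not pool into 48GB for the export step. The backbone plus the tracing overhead has to fit on one GPU, which is why moving the VAE and text encoders to CPU, or exporting on a larger card, is the usual workaround. A quick way to see the per-card limits:

```python
import torch

# Each CUDA device is reported separately; the exporter can only use one of them.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"cuda:{i}: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
```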