chuanli11 opened 2 years ago
This PR is not ready to be merged because the onnx model does not speed things up on GPU. This is not specific to the sd_image model;
it also affects the reference huggingface model. See issue here.
The onnx model does reduce iteration time by ~35% when running on CPU. However, CPU inference is still too slow to be practical (the GPU can be roughly two orders of magnitude faster).
Model Format | CPU | CUDA RTX 8000 | CUDA RTX 8000 + autocast
---|---|---|---
PyTorch | 10.16 s/it | 4.57 it/s | 8.92 it/s
Onnx | 6.70 s/it | 2.23 it/s | N/A
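As a back-of-the-envelope check on the claims above (a sketch using only the table's numbers, and noting the mixed units: CPU rows are seconds/iteration, GPU rows are iterations/second):

```python
# Numbers taken from the table above.
pytorch_cpu_s_per_it = 10.16
onnx_cpu_s_per_it = 6.70
pytorch_gpu_autocast_it_per_s = 8.92

# ~35% reduction in CPU iteration time with onnx
cpu_reduction = 1 - onnx_cpu_s_per_it / pytorch_cpu_s_per_it
print(f"CPU iteration-time reduction: {cpu_reduction:.0%}")  # ~34%

# GPU vs CPU speedup for the PyTorch model: (s/it) * (it/s) = speedup factor
gpu_speedup = pytorch_cpu_s_per_it * pytorch_gpu_autocast_it_per_s
print(f"GPU+autocast speedup over CPU: {gpu_speedup:.0f}x")  # ~91x, i.e. roughly two orders of magnitude
```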
The current onnx pipeline only uses onnx for the vae and unet, and keeps the other parts of the model as PyTorch ckpts. This is because:

- `get_image_features` is not an `nn.Module`, so it doesn't work with `onnx_export`. Similar discussion found here. Tried replacing it with `CLIPVisionModel`: the export succeeded, but inference then failed with `TypeError: forward() takes 1 positional argument but 2 were given`.
- The `safety_checker` onnx model doesn't work with batch size > 1. Haven't looked into the reason.

However, keeping these modules in PyTorch (instead of onnx) doesn't seem to have much impact on speed, since the most expensive compute is the diffusion step (unet): the numbers for the sd_image
model (table above) match the numbers for the reference huggingface model (table below), even though all modules of the huggingface model can be converted to onnx.
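One common workaround for the `get_image_features` limitation is to wrap the bare method in a thin `nn.Module` so `torch.onnx.export` has a `forward()` to trace. A minimal sketch (not the approach taken in this PR; `ImageFeatureWrapper` is a hypothetical name, and the usage lines assume a CLIP model instance):

```python
import torch
import torch.nn as nn

class ImageFeatureWrapper(nn.Module):
    """Thin nn.Module shim around a bare method such as get_image_features.

    torch.onnx.export traces a Module's forward(), so a plain method on the
    CLIP model can't be exported directly; delegating from forward() can.
    """

    def __init__(self, clip_model):
        super().__init__()
        self.clip_model = clip_model

    def forward(self, pixel_values):
        # Delegate to the non-module method so tracing sees a Module forward().
        return self.clip_model.get_image_features(pixel_values)

# Usage sketch (clip_model and the dummy input are placeholders):
# wrapper = ImageFeatureWrapper(clip_model).eval()
# dummy = torch.randn(1, 3, 224, 224)
# torch.onnx.export(wrapper, dummy, "image_features.onnx",
#                   input_names=["pixel_values"], output_names=["image_embeds"])
```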
Model Format | CPU | CUDA RTX 8000 | CUDA RTX 8000 + autocast
---|---|---|---
PyTorch | 10.16 s/it | 4.56 it/s | 8.78 it/s
Onnx | 6.64 s/it | 2.21 it/s | N/A
This PR adds onnx support to the `sd_image` model.