Open souvikqb opened 1 month ago
Is the slow inference speed due to the diffusion model or to the other components? BTW, could you share your inference code for me to reference?
I'm directly using this code -
import torch
from diffusers import Transformer2DModel, PixArtSigmaPipeline
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weight_dtype = torch.float16
transformer = Transformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    subfolder='transformer',
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
    transformer=transformer,
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe.to(device)
prompt = "A small cactus with a happy face in the Sahara desert."
image = pipe(prompt).images[0]
image.save("./catcus.png")
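To check whether the time goes to the diffusion transformer or to the other components, one option is to wrap each component's entry point with a timer. This is only a rough sketch: `wrap_timed` and the labels are hypothetical names introduced here, and the commented usage assumes the pipeline exposes `text_encoder`, `transformer`, and `vae` attributes as diffusers pipelines normally do.

```python
import time
from collections import defaultdict


def wrap_timed(obj, method_name, totals, label, sync=None):
    """Replace obj.<method_name> with a wrapper that adds its wall time to totals[label]."""
    original = getattr(obj, method_name)

    def timed(*args, **kwargs):
        if sync is not None:
            sync()  # finish pending GPU work before starting the clock
        start = time.perf_counter()
        result = original(*args, **kwargs)
        if sync is not None:
            sync()  # wait for this call's GPU work before stopping the clock
        totals[label] += time.perf_counter() - start
        return result

    setattr(obj, method_name, timed)
    return original  # keep this to restore the method afterwards


totals = defaultdict(float)

# Assumed usage with the pipeline above (not run here):
# sync = torch.cuda.synchronize
# wrap_timed(pipe.text_encoder, "forward", totals, "text_encoder", sync)
# wrap_timed(pipe.transformer, "forward", totals, "transformer", sync)
# wrap_timed(pipe.vae, "decode", totals, "vae_decode", sync)
# pipe(prompt)
# print(dict(totals))
```

The transformer is called once per denoising step, so its total accumulates over the whole run, while the text encoder and VAE decode are typically called once per image.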
Cool. And what about the inference speed?
On an A10, this is the avg inference speed I'm getting for various sample/step combinations -
It's strange since I can generate within 8 seconds even with V100 GPUs. Is the V100 stronger than A10?
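To make numbers from different GPUs comparable, it may help to time the same call the same way on both machines. The following is a minimal sketch, not the project's own benchmark: `benchmark` is a hypothetical helper, and the step counts in the commented usage are arbitrary examples.

```python
import statistics
import time


def benchmark(generate, step_counts, runs=3, warmup=1, sync=None):
    """Return {steps: mean seconds per call} for generate(steps)."""
    results = {}
    for steps in step_counts:
        for _ in range(warmup):
            generate(steps)  # warm-up calls are excluded from the timing
        timings = []
        for _ in range(runs):
            if sync is not None:
                sync()  # make sure earlier GPU work is done
            start = time.perf_counter()
            generate(steps)
            if sync is not None:
                sync()  # wait for this call's GPU work to finish
            timings.append(time.perf_counter() - start)
        results[steps] = statistics.mean(timings)
    return results


# Assumed usage with the pipeline above (not run here):
# results = benchmark(
#     lambda steps: pipe(prompt, num_inference_steps=steps),
#     step_counts=[10, 20, 50],
#     sync=torch.cuda.synchronize,
# )
# print(results)
```

The warm-up call matters because the first generation usually includes one-off costs such as CUDA initialisation, which would otherwise skew the average.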
The inference code is the same as yours, so I suspect the environment differs in some way. I haven't tried DeepCache yet. If possible, I would really appreciate it if you could open a pull request.
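If the environment is the suspect, comparing installed versions on both machines is a cheap first step. Here is a small sketch using only the standard library; `collect_env` is a hypothetical helper and the package list is just a guess at what is relevant for this pipeline.

```python
import platform
from importlib import metadata


def collect_env(packages=("torch", "diffusers", "transformers", "xformers")):
    """Return a dict of Python/platform info and installed package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in collect_env().items():
    print(f"{key}: {value}")

# With torch installed, the CUDA build and GPU name can be added as well:
# import torch
# print(torch.version.cuda, torch.cuda.get_device_name(0))
```

Posting this output from both the A10 and the V100 setups would show whether the two runs use the same library versions.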
Unfortunately, it seems DeepCache supports only Stable Diffusion models, not PixArt-Sigma. I'm still linking it here -
- https://www.reddit.com/r/StableDiffusion/comments/18b40hh/deepcache_accelerating_diffusion_models_for_free/
- https://github.com/horseee/DeepCache
Would definitely appreciate a more optimised version of PixArt-Sigma, because the image quality is really superior.
@lawrence-cj Did you get to try any optimisation techniques?
I haven't.
https://github.com/horseee/learning-to-cache/tree/main this looks pretty interesting.
Hi, the model is amazing to use, but the inference speed is quite slow on an A10 GPU. I saw decent performance on an A100 though.
Is there any optimisation method I can apply to speed it up?