Doubts about geglu implementation

chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

MIT License

1.19k stars, 73 forks

Open. capyun opened this issue 3 months ago

capyun commented 3 months ago

Why is the accelerated implementation used only when the data type is fp16 or bf16, while the PyTorch implementation is used for all other data types?

chengzeyi commented 3 months ago

Because testing float32 with TF32 would take more time.
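
For readers following along, the dtype-based dispatch being discussed can be sketched roughly as below. This is only an illustration, not stable-fast's actual code: `FUSED_DTYPES`, `uses_fused_path`, `fused_geglu`, and `geglu_reference` are hypothetical names, and the "fused" function here simply calls the reference path, since the real kernel is not shown in this thread.

```python
import torch
import torch.nn.functional as F

# Hypothetical: dtypes routed to the accelerated kernel. This mirrors the
# fp16/bf16 condition described in the question, not stable-fast's source.
FUSED_DTYPES = (torch.float16, torch.bfloat16)


def uses_fused_path(dtype: torch.dtype) -> bool:
    """Return True if this dtype would take the accelerated path."""
    return dtype in FUSED_DTYPES


def geglu_reference(x: torch.Tensor) -> torch.Tensor:
    """Plain PyTorch GEGLU: split the last dim in half and gate with GELU."""
    hidden, gate = x.chunk(2, dim=-1)
    return hidden * F.gelu(gate)


def fused_geglu(x: torch.Tensor) -> torch.Tensor:
    """Stand-in for a fused kernel (hypothetical); reuses the reference here."""
    return geglu_reference(x)


def geglu(x: torch.Tensor) -> torch.Tensor:
    """Dispatch on dtype: fp16/bf16 go to the fused path, the rest to PyTorch."""
    if uses_fused_path(x.dtype):
        return fused_geglu(x)
    return geglu_reference(x)
```

With a layout like this, float32 inputs always take the plain PyTorch path, so the fused kernel only has to be validated for the two half-precision types.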