siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.
https://github.com/siliconflow/onediff/wiki
Apache License 2.0
1.4k stars 85 forks

Almost no speed up on L4 #961

Open CuddleSabe opened 1 week ago

strint commented 1 week ago

How much speedup do you get on L4?

We haven't tested on L4 yet.

CuddleSabe commented 1 week ago

On L4 it takes about 13 s before compilation and 11 s after. With the same parameters and image size, I get close to a 2x speedup on a 3090.

strint commented 1 week ago

The L4's memory bandwidth is only 1/3 of the 3090's.
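
For reference, the arithmetic behind the bandwidth comparison and the timings reported above can be sketched as follows. The peak-bandwidth figures (300 GB/s for the L4, 936 GB/s for the RTX 3090) are taken from NVIDIA's published spec sheets and are an assumption added here, not numbers from this thread; the helper names are illustrative only.

```python
# Peak memory bandwidth in GB/s. These come from public spec sheets
# (assumed values, not measured in this thread).
L4_BANDWIDTH_GBPS = 300.0
RTX_3090_BANDWIDTH_GBPS = 936.0


def bandwidth_ratio(slower_gbps: float, faster_gbps: float) -> float:
    """Fraction of the faster card's peak bandwidth that the slower card has."""
    return slower_gbps / faster_gbps


def speedup(before_s: float, after_s: float) -> float:
    """Speedup factor given latency before and after compilation."""
    return before_s / after_s


ratio = bandwidth_ratio(L4_BANDWIDTH_GBPS, RTX_3090_BANDWIDTH_GBPS)
l4_speedup = speedup(13.0, 11.0)  # timings reported in this thread

print(f"L4 / 3090 peak bandwidth: {ratio:.2f}")
print(f"L4 speedup after compilation: {l4_speedup:.2f}x")
```

This only restates the reported numbers; it does not show that bandwidth is the actual bottleneck for this workload.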

Are you using HF diffusers? If so, you could try our new torch compilation backend, which has an auto-tuning feature, and see how it performs on L4: https://github.com/siliconflow/onediff/tree/main/src/onediff/infer_compiler/backends/nexfort