kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

[Feature Request]: xformers support #30

Closed sALTaccount closed 1 year ago

sALTaccount commented 1 year ago

Is there an existing issue for this?

What would your feature do ?

Add support for xformers memory-efficient attention

Proposed workflow

This is something I am working on right now. I'm still ironing out some bugs, but once I get it working I will make a PR.

Additional information

No response

hithereai commented 1 year ago

Thanks for working on this!

sALTaccount commented 1 year ago

Not quite sure why, but I'm not getting the huge speedups that we see in Stable Diffusion when using xformers. It's a speedup for sure, but only around 20%. I'm working with someone on our own repo to base a trainer on, so you can check my commit there (https://github.com/lopho/sd-video/pull/2) and implement it in your code if you would like, @hithereai @kabachuha (I don't really know how to make webui extensions, otherwise I'd make a PR). I'm going to keep looking into this, but it could be that the major slowdowns in text2video come from something other than attention.

I tried changing the xformers op to what Hugging Face uses, to what webui uses, and hardcoding it to flash triton, but wasn't able to get more than the 20% speedup, so I left it at the default options, which give the same 20% speedup.
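For readers who want to try the same swap, here is a minimal sketch of routing attention through xformers with a plain PyTorch fallback. It is not the code from the linked PR. The helper names `pick_backend` and `attention` are made up for illustration, the fallback assumes PyTorch 2.0 or newer (for `scaled_dot_product_attention`), and the xformers path is only taken for CUDA tensors. Passing `op=None` leaves the kernel choice to xformers, which corresponds to the "default options" mentioned above.

```python
import importlib.util


def pick_backend():
    """Report which attention backend is importable (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        return "unavailable"
    if importlib.util.find_spec("xformers") is not None:
        return "xformers"
    return "sdpa"


def attention(q, k, v, backend=None):
    """Attention over tensors shaped [batch, tokens, heads, head_dim].

    Assumes PyTorch >= 2.0 for the fallback path.
    """
    import torch.nn.functional as F

    backend = backend or pick_backend()
    if backend == "xformers" and q.is_cuda:
        import xformers.ops as xops

        # op=None lets xformers pick a kernel for this input on its own.
        return xops.memory_efficient_attention(q, k, v, op=None)

    # PyTorch expects [batch, heads, tokens, head_dim], so swap axes 1 and 2
    # on the way in and swap them back on the way out.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
    )
    return out.transpose(1, 2)
```

Timing `attention` against the model's original attention code on the same inputs is one way to check how much of the total runtime the swap can actually affect.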

kabachuha commented 1 year ago

@sALTaccount oh, thanks! We're going to look into it then. A 20% speedup is better than nothing, anyway. The gain is probably smaller because the model has an additional dimension.

sALTaccount commented 1 year ago

I had somebody else take a look at it, and they said that this is probably all of the extra speed that can be gotten out of xformers. It just isn't a massive speedup for t2v.
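One way to reason about that ceiling is Amdahl's law: if attention takes a fraction f of the runtime and gets s times faster, the whole run gets 1 / ((1 - f) + f / s) times faster. The sketch below is illustrative only. It reads "20% speedup" as a 1.2x overall speedup, and the 2x attention speedup is a made-up number, not a measurement from this thread.

```python
def overall_speedup(attn_fraction, attn_speedup):
    """Amdahl's law: overall speedup when only attention gets faster."""
    return 1.0 / ((1.0 - attn_fraction) + attn_fraction / attn_speedup)


def attention_fraction(observed_speedup, attn_speedup):
    """Fraction of runtime spent in attention implied by an overall speedup."""
    if attn_speedup <= 1.0:
        raise ValueError("attn_speedup must be greater than 1")
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / attn_speedup)


# Hypothetical case: attention itself became 2x faster and the whole run 1.2x.
f = attention_fraction(1.2, 2.0)
print(f"implied attention share of runtime: {f:.2f}")
print(f"ceiling with infinitely fast attention: {overall_speedup(f, float('inf')):.2f}x")
```

Under those assumed numbers attention would be about a third of the runtime, so even a perfect attention kernel could not push the whole run past 1.5x. That fits the suggestion that most of the time goes somewhere other than attention.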