kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

[Feature Request]: Add support to potat1 — edit: link provided #175

wandrzej closed this issue 1 year ago

wandrzej commented 1 year ago

Is there an existing issue for this?

What would your feature do?

There's a new model based on ModelScope, Potat1: https://huggingface.co/camenduru/potat1. I think it should be fairly straightforward to make it an option in the current SD WebUI extension.

Proposed workflow

  1. Go to .... I guess to the Hugging Face page: https://huggingface.co/camenduru/potat1

Additional information

No response

kabachuha commented 1 year ago

If you already have ModelScope running, just download the UNet model from this link and use it to replace ModelScope's file of the same name in its directory:

https://huggingface.co/DREX-Institute/potat1.pth/blob/main/text2video_pytorch_model.pth
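A minimal sketch of that swap, in case it helps. It assumes the checkpoint has already been downloaded somewhere, and the helper names (`blob_to_resolve`, `replace_unet`) are made up for illustration. The ModelScope directory is whatever your install uses; `models/ModelScope/t2v` in the usage note is only an assumption. The old file is kept as a `.bak` so you can switch back.

```python
import shutil
from pathlib import Path

# File name taken from the link above; ModelScope's own UNet file is assumed
# to use the same name, which is why a plain replacement works.
UNET_FILENAME = "text2video_pytorch_model.pth"


def blob_to_resolve(url: str) -> str:
    """Turn a Hugging Face 'blob' page URL into a direct-download 'resolve' URL."""
    return url.replace("/blob/", "/resolve/", 1)


def replace_unet(downloaded: Path, model_dir: Path) -> Path:
    """Copy the downloaded UNet into model_dir, backing up the existing file.

    `model_dir` is the folder that already holds ModelScope's weights
    (hypothetical location, adjust to your install).
    """
    if not downloaded.is_file():
        raise FileNotFoundError(f"downloaded checkpoint not found: {downloaded}")
    if not model_dir.is_dir():
        raise FileNotFoundError(f"ModelScope directory not found: {model_dir}")

    target = model_dir / UNET_FILENAME
    backup = model_dir / (UNET_FILENAME + ".bak")

    # Keep the original ModelScope UNet the first time only, so repeated runs
    # never overwrite the backup with an already-swapped file.
    if target.exists() and not backup.exists():
        shutil.move(str(target), str(backup))

    shutil.copy2(str(downloaded), str(target))
    return target
```

Usage would be something like `replace_unet(Path("~/Downloads/text2video_pytorch_model.pth").expanduser(), Path("models/ModelScope/t2v"))`, run from the webui root. To undo the swap, delete the new file and rename the `.bak` back.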

Text encoder conversion support is coming soon too.

kabachuha commented 1 year ago

I've made progress on the text encoder conversion, but I need someone to test it:

https://github.com/ExponentialML/Text-To-Video-Finetuning/pull/71

kabachuha commented 1 year ago

Here's the link to the fully converted weights: https://huggingface.co/kabachuha/potat1-with-text-encoder-original-format

A test shows that it's working fine:

https://github.com/kabachuha/sd-webui-text2video/assets/14872007/c7be4cb9-d21e-4ef5-87f0-eeaff1ed1fd2