huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

StableVideoDiffusionPipeline cannot use from_single_file #6839

Open Meatfucker opened 9 months ago

Meatfucker commented 9 months ago

Describe the bug

The StableVideoDiffusionPipeline cannot load models in any format other than diffusers, which is problematic as the latest StableVideoDiffusion model has only been released in safetensors. See: https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1

Reproduction

svd_pipe = StableVideoDiffusionPipeline.from_single_file("svdxt.safetensors", torch_dtype=torch.float16, variant="fp16")

Logs

Traceback (most recent call last):
  File "D:\ML\apps\svdeez\svdeez.py", line 170, in <module>
    svdeez = SvdeezGUI()
  File "D:\ML\apps\svdeez\svdeez.py", line 16, in __init__
    self.svd_pipe = StableVideoDiffusionPipeline.from_single_file("svdxt.safetensors", torch_dtype=torch.float16, variant="fp16")
AttributeError: type object 'StableVideoDiffusionPipeline' has no attribute 'from_single_file'

System Info

Who can help?

No response

a-r-r-o-w commented 9 months ago

The method described in this comment might work if you want to perform the weights conversion to diffusers-expected format.

However, there are a few new checkpoints that have been released and having an improved version of the conversion script as well as .from_single_file mixin implementation would be ideal. @patil-suraj @DN6

Meatfucker commented 9 months ago

Thanks, I'll give that script a shot. I was hoping to avoid manually converting things myself, but I really want to try the model, so I appreciate it. Also, thank you for tagging the appropriate people, as I wasn't quite sure who to tag.

Yanting-K commented 9 months ago

You can use this script to convert the model to the Diffusers format:

https://github.com/Yanting-K/diffusers/blob/main/scripts/convert_svd_to_diffusers.py

tin2tin commented 9 months ago

Getting StableVideoDiffusionPipeline.from_single_file surely wouldn't hurt, as this seems to be the only way to import the new SVD 1.1 release with Diffusers without a conversion. For example, from here: https://huggingface.co/vdo/stable-video-diffusion-img2vid-fp16/tree/main

a-r-r-o-w commented 9 months ago

Getting StableVideoDiffusionPipeline.from_single_file surely wouldn't hurt, as this seems to be the only way to import the new SVD 1.1 release with Diffusers without a conversion.

Yes, there are a few checkpoints out now. I'm working on the single-file model import in #6844.

DN6 commented 9 months ago

Diffusers format checkpoints have been merged in the stability model repo. https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1

The model can be loaded using from_pretrained.
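A minimal sketch of that call, assuming access to the gated repo has already been granted. The `load_svd` helper name is hypothetical, the `torch_dtype` and `variant` arguments mirror the ones in the reproduction above, and whether the repo publishes fp16 weight files is an assumption:

```python
REPO_ID = "stabilityai/stable-video-diffusion-img2vid-xt-1-1"


def load_svd(repo_id: str = REPO_ID, use_fp16: bool = True):
    """Load SVD from a Diffusers-format repo (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy deps.
    import torch
    from diffusers import StableVideoDiffusionPipeline

    kwargs = {}
    if use_fp16:
        # Assumption: the repo ships fp16 weight files for variant="fp16".
        kwargs = {"torch_dtype": torch.float16, "variant": "fp16"}
    return StableVideoDiffusionPipeline.from_pretrained(repo_id, **kwargs)
```

Calling `load_svd()` downloads the weights on first use, so it needs network access and an authenticated Hugging Face session.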

tin2tin commented 9 months ago

There is one more hindrance to downloading it through Diffusers: the repo is gated. The error reads: "Repo model stabilityai/stable-video-diffusion-img2vid-xt-1-1 is gated. You must be authenticated to access it."

Meatfucker commented 9 months ago

You can download it via diffusers after authing on the model card page.

tin2tin commented 9 months ago

This works: https://huggingface.co/vdo/stable-video-diffusion-img2vid-xt-1-1

yiyixuxu commented 9 months ago

Can we close this one now that the Diffusers checkpoint has been added?

Meatfucker commented 8 months ago

I think it's likely to come up again the next time a new video model gets released. It would probably be ideal to make from_single_file work in the long term.
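In the meantime, calling code can guard against the AttributeError from the original report. This is only a sketch of one possible workaround, not something the library provides: `load_with_fallback` is a hypothetical helper that uses from_single_file when the pipeline class has it and falls back to from_pretrained otherwise.

```python
def load_with_fallback(pipeline_cls, single_file_path, repo_id, **kwargs):
    """Prefer from_single_file if the class defines it, else from_pretrained.

    Hypothetical helper: `single_file_path` is a local checkpoint file and
    `repo_id` is a Diffusers-format repo used as the fallback source.
    """
    loader = getattr(pipeline_cls, "from_single_file", None)
    if callable(loader):
        return loader(single_file_path, **kwargs)
    # The class has no single-file loader, so use the Diffusers-format repo.
    return pipeline_cls.from_pretrained(repo_id, **kwargs)
```

For example, `load_with_fallback(StableVideoDiffusionPipeline, "svdxt.safetensors", "stabilityai/stable-video-diffusion-img2vid-xt-1-1")` would take the from_pretrained branch on a Diffusers version that raises the error shown in the logs.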

CallMeFrozenBanana commented 8 months ago

You can try running $ huggingface-cli login and entering your access token from Hugging Face.
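The same login can be done from Python with `huggingface_hub.login`. A minimal sketch, assuming the token is stored in an environment variable; the variable name and both helper names are illustrative choices, not part of the thread:

```python
import os


def hf_token_from_env(var_name: str = "HF_TOKEN"):
    """Return the access token stored in an environment variable, or None."""
    token = os.environ.get(var_name, "").strip()
    return token or None


def login_from_env(var_name: str = "HF_TOKEN") -> bool:
    """Log in to the Hugging Face Hub if a token is available.

    Returns True when a login was attempted and False when no token was found.
    """
    token = hf_token_from_env(var_name)
    if token is None:
        return False
    # Imported lazily so the token helper works without huggingface_hub.
    from huggingface_hub import login

    login(token=token)
    return True
```

After a successful login, from_pretrained can download the gated repo, provided the account has accepted the terms on the model card page.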

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.