showlab / Tune-A-Video

[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
https://tuneavideo.github.io
Apache License 2.0
4.15k stars 376 forks

Fix mid block checker to fix custom model loading #13

Open Dango233 opened 1 year ago

Dango233 commented 1 year ago

from_pretrained_2d raises ValueError: unknown mid_block_type: UNetMidBlock2DCrossAttn when the config file of the pretrained UNet specifies "mid_block_type": "UNetMidBlock2DCrossAttn".

This happens with some custom models (model converters could be a cause) but won't happen with the official models.

This PR fixes the problem by allowing UNetMidBlock2DCrossAttn in the type check.
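
As a rough illustration of the idea (not the actual diff), the relaxed check could look like the sketch below. The helper name resolve_mid_block_type and the set ACCEPTED_MID_BLOCK_TYPES are hypothetical, and mapping the 2D name onto the 3D block is an assumption about how the loader would treat it:

```python
# Hypothetical sketch: accept the 2D mid-block name that some converted
# configs carry, and treat it as the 3D cross-attention mid block.
ACCEPTED_MID_BLOCK_TYPES = {
    "UNetMidBlock3DCrossAttn",  # name the 3D UNet expects
    "UNetMidBlock2DCrossAttn",  # name found in some 2D configs
}


def resolve_mid_block_type(mid_block_type: str) -> str:
    """Return the 3D mid-block name for any accepted type, else raise."""
    if mid_block_type in ACCEPTED_MID_BLOCK_TYPES:
        return "UNetMidBlock3DCrossAttn"
    raise ValueError(f"unknown mid_block_type: {mid_block_type}")
```

With this kind of check, a config that names the 2D block no longer trips the ValueError, while genuinely unknown block types are still rejected.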

zhangjiewu commented 1 year ago

just wondering what kind of custom models you are using?

Dango233 commented 1 year ago

I used a model trained under the CompVis/Stable-diffusion format. I got it converted to diffusers format using this conversion script:

https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py

Dango233 commented 1 year ago

It seems that any config saved with diffusers' save_pretrained method will have this problem.
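
To check whether a given UNet config is affected, something like the following sketch may help. The helper name find_mid_block_type and the sample config fragment are made up for illustration; a real config.json contains many more keys:

```python
import json
from typing import Optional


def find_mid_block_type(config_text: str) -> Optional[str]:
    """Return the mid_block_type stored in a UNet config, or None if absent."""
    config = json.loads(config_text)
    return config.get("mid_block_type")


# Illustrative fragment only, not a complete UNet config.
sample_config = '{"_class_name": "UNet2DConditionModel", "mid_block_type": "UNetMidBlock2DCrossAttn"}'
print(find_mid_block_type(sample_config))
```

If the key is present and names the 2D block, loading through from_pretrained_2d without this fix would be expected to hit the error described above.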

kangqiyue commented 1 year ago

Thanks! cool !