xuduo35 / MakeLongVideo

Implementation of long video generation
MIT License
77 stars, 8 forks

ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D #10

Open 9e0 opened 7 months ago

9e0 commented 7 months ago

Hi xuduo35, I am getting the error below when running the infer.py script. I have downloaded all the required files and placed them in the correct paths. Please help me fix it.

Exception has occurred: ValueError
cross_attention_dim must be specified for CrossAttnDownBlock2D
  File "E:\WorkArea\Services\Python\MakeLongVideo\infer.py", line 64, in <module>
    pipeline = MakeLongVideoPipeline.from_pretrained(pretrained_model_path, unet=unet, torch_dtype=torch.float16).to("cuda")
ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D

Thanks

xuduo35 commented 7 months ago

This could be caused by different versions of the pip packages. Try pip install --upgrade diffusers==0.11.1 and pip install --upgrade transformers==4.25.1.

accelerate==0.16.0
bitsandbytes==0.37.1
decord==0.6.0
diffusers==0.11.1
einops==0.6.0
imageio==2.14.1
imageio-ffmpeg==0.4.7
omegaconf==2.1.1
tensorboard==2.12.0
tensorboard-data-server==0.7.0
tensorboard-plugin-wit==1.8.1
torch==2.0.0
torchvision==0.15.1
transformers==4.25.1
xformers==0.0.17rc482
webdataset==0.2.48
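A quick way to confirm that the active interpreter really has these versions is to compare them programmatically. The following is a minimal sketch, not part of the repository: the helper name find_mismatches and the PINNED subset are illustrative assumptions, and only the standard-library importlib.metadata module is used.

```python
from importlib import metadata

# Illustrative subset of the pins listed above; extend as needed.
PINNED = {
    "diffusers": "0.11.1",
    "transformers": "4.25.1",
    "torch": "2.0.0",
    "accelerate": "0.16.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    find_mismatches is a hypothetical helper written for this example.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore local build tags, so "2.0.0+cu118" matches "2.0.0".
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Running this with the same interpreter that launches infer.py helps rule out the case where the packages were installed into a different environment than the one the debugger uses.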

9e0 commented 7 months ago

Hi xuduo35,

Thanks for the quick reply.

I have all the package versions listed above installed, but I still get the error. I tried various solutions mentioned online, and none of them worked.

xuduo35 commented 7 months ago

How about adding some debug lines to track down this issue? If you are using the same packages, I don't know what else is missing. Are you using stable-diffusion-v1-4 pretrained weights?
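One possible set of debug lines, sketched under assumptions: the error message names cross_attention_dim, so it may help to check whether the UNet config on disk actually defines it. The sketch assumes the pretrained directory follows the usual diffusers layout with a unet/config.json file (as stable-diffusion-v1-4 does); the helper name inspect_unet_config and the example path are hypothetical and not taken from infer.py.

```python
import json
from pathlib import Path


def inspect_unet_config(model_dir):
    """Read <model_dir>/unet/config.json and report the fields the error mentions.

    inspect_unet_config is a hypothetical helper written for this example.
    """
    config_path = Path(model_dir) / "unet" / "config.json"
    if not config_path.is_file():
        # A missing file suggests the weights are not where the script expects them.
        return {"config_path": str(config_path), "found": False}
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return {
        "config_path": str(config_path),
        "found": True,
        "cross_attention_dim": config.get("cross_attention_dim"),
        "down_block_types": config.get("down_block_types"),
    }


if __name__ == "__main__":
    # Hypothetical path: replace with the value of pretrained_model_path in infer.py.
    print(inspect_unet_config("./checkpoints/stable-diffusion-v1-4"))
```

If found is False, or cross_attention_dim comes back as None, the downloaded weights are likely incomplete or placed in a different directory than the script reads from. For stable-diffusion-v1-4 the value is 768.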