Picsart-AI-Research / Text2Video-Zero

[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators
https://text2video-zero.github.io/

ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D #67

Closed Note-Liu closed 10 months ago

Note-Liu commented 10 months ago

    /workspace/Text2Video-Zero-main/infer.py:11 in <module>

         8 params = {"t0": 44, "t1": 47 , "motion_field_strength_x" : 12, "motion_field_strength_y"
         9
        10 out_path, fps = f"./text2video_{prompt.replace(' ','_')}.mp4", 4
      ❱ 11 model.process_text2video(prompt, fps = fps, path = out_path, **params)
        12

    /workspace/dText2Video-Zero-main/model.py:458 in process_text2video

        455             unet = UNet2DConditionModel.from_pretrained(
        456                 model_name, subfolder="unet")
        457             print("")
      ❱ 458             self.set_model(ModelType.Text2Video,
        459                            model_id=model_name, unet=unet)
        460             print("")
        461             self.pipe.scheduler = DDIMScheduler.from_config(

    /workspace/Text2Video-Zero-main/model.py:61 in set_model

        58         torch.cuda.empty_cache()
        59         gc.collect()
        60         safety_checker = kwargs.pop('safety_checker', None)
      ❱ 61         self.pipe = self.pipe_dict[model_type].from_pretrained(
        62             model_id, safety_checker=safety_checker, **kwargs).to(self.device).to(self.d
        63         self.model_type = model_type
        64         self.model_name = model_id

    ......

    /lib/python3.10/site-packages/diffusers/models/vae.py:71 in __init__

        68             output_channel = block_out_channels[i]
        69             is_final_block = i == len(block_out_channels) - 1
        70
      ❱ 71             down_block = get_down_block(
        72                 down_block_type,
        73                 num_layers=self.layers_per_block,
        74                 in_channels=input_channel,

    /lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py:94 in get_down_block

        91         )
        92     elif down_block_type == "CrossAttnDownBlock2D":
        93         if cross_attention_dim is None:
      ❱ 94             raise ValueError("cross_attention_dim must be specified for CrossAttnDownBlo
        95         return CrossAttnDownBlock2D(
        96             num_layers=num_layers,
        97             in_channels=in_channels,

ValueError: cross_attention_dim must be specified for CrossAttnDownBlock2D
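
A note on reading the traceback: the ValueError is raised while the VAE encoder is being built (diffusers/models/vae.py), not the UNet. That suggests, as an assumption rather than anything confirmed in this thread, that the config loaded for the VAE lists a "CrossAttnDownBlock2D" block, which a VAE config would not normally contain. Below is a minimal sketch for checking a locally downloaded model. It assumes the usual diffusers folder layout, with a vae/config.json that has down_block_types and up_block_types keys. find_cross_attn_blocks is a hypothetical helper, not part of diffusers or this repo.

```python
import json
from pathlib import Path

# Block types that require cross_attention_dim. A VAE config is not
# expected to list them (assumption based on the traceback above).
CROSS_ATTN_BLOCKS = {"CrossAttnDownBlock2D", "CrossAttnUpBlock2D"}


def find_cross_attn_blocks(config_path):
    """Return the cross-attention block types named in a model config.json.

    Hypothetical helper: reads the JSON file and scans the
    down_block_types / up_block_types lists if they are present.
    """
    config = json.loads(Path(config_path).read_text())
    found = []
    for key in ("down_block_types", "up_block_types"):
        for block in config.get(key, []):
            if block in CROSS_ATTN_BLOCKS:
                found.append(block)
    return found
```

Calling it on the vae/config.json inside the model folder should return an empty list for a well-formed VAE config. A non-empty result would point to a misplaced or mismatched config file, for example a UNet config saved under the vae subfolder.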

zengjie617789 commented 8 months ago

Refer to this.