kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

[Bug]: Exception occurred: 'NoneType' object has no attribute 'name' #122

Closed barleyj21 closed 1 year ago

barleyj21 commented 1 year ago

Is there an existing issue for this?

Are you using the latest version of the extension?

What happened?

Just installed Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023). Trying to run ModelScope (with the large ~5 GB models) for a 24-frame 256*256 video, and I get this every time (I have a 3060 12G):

text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|                                                    | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
  File "E:\automatic/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 145, in process_modelscope
    print("Received an image for inpainting", args.inpainting_image.name)
AttributeError: 'NoneType' object has no attribute 'name'
Exception occurred: 'NoneType' object has no attribute 'name'


If I change the size to 512*512 in ModelScope, I get a different exception:

text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|                                                    | 0/1 [00:00<?, ?it/s]Received an image for inpainting C:\Users\barle\AppData\Local\Temp\8e20ccecaa430e3168e76daa5a09d596b98502be\00005-3882147318.png
Converted the frames to tensor (1, 48, 3, 512, 512)
Computing latents
STARTING VAE ON GPU
VAE HALVED
Traceback (most recent call last):
  File "E:\automatic/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 193, in process_modelscope
    samples, _ = pipe.infer(args.prompt, args.n_prompt, args.steps, args.frames, args.seed + batch if args.seed != -1 else -1, args.cfg_scale,
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 216, in infer
    c, uc = self.preprocess(prompt, n_prompt, steps)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 370, in preprocess
    uc = get_conds_with_caching(prompt_parser.get_learned_conditioning, self.clip_encoder, [n_prompt], steps, cached_uc)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 363, in get_conds_with_caching
    cache[1] = function(model, required_prompts, steps)
  File "E:\automatic\modules\prompt_parser.py", line 137, in get_learned_conditioning
    conds = model.get_learned_conditioning(texts)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 272, in get_learned_conditioning
    return self.encode(text)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 269, in encode
    return self(text)
  File "E:\automatic\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 363, in forward
    batch_chunks, token_count = self.process_texts(texts)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 247, in process_texts
    chunks, current_token_count = self.tokenize_line(line)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 150, in tokenize_line
    tokenized = self.tokenize([text for text, _ in parsed])
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 96, in tokenize
    assert not opts.use_old_emphasis_implementation, 'Old emphasis implementation not supported for Open Clip'
  File "E:\automatic\modules\shared.py", line 484, in __getattr__
    return super(Options, self).__getattribute__(item)
AttributeError: 'Options' object has no attribute 'use_old_emphasis_implementation'
Exception occurred: 'Options' object has no attribute 'use_old_emphasis_implementation'
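The assert that trips here reads the option straight off opts, and this fork's Options object doesn't define that key. A defensive lookup with a default would presumably sidestep it; this is just a sketch of the failure mode, not the extension's actual fix:

class Options:
    pass  # stand-in for the fork's Options, which lacks the key

opts = Options()
# opts.use_old_emphasis_implementation would raise AttributeError, as in the log
use_old = getattr(opts, 'use_old_emphasis_implementation', False)  # default to False
assert not use_old, 'Old emphasis implementation not supported for Open Clip'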


Also, VideoCrafter at least works, but it doesn't take the settings into account and just runs with defaults, except for the prompt (you already have an issue for this, though).

Steps to reproduce the problem

  1. Go to ....
  2. Press ....
  3. ...

What should have happened?

No response

WebUI and Deforum extension Commit IDs

webui commit id - Version: 09a14802 (Mon Apr 17 15:52:26 2023 -0400) (it's a fork, https://github.com/vladmandic/automatic, so not sure if applicable to main)
txt2vid commit id - Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023)

What GPU were you using for launching?

3060 12G

On which platform are you launching the webui backend with the extension?

Local PC setup (Windows)

Settings

[settings screenshot attached: Screenshot (160)]

Console logs

text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 9b79cb8d (Sun Apr 16 12:17:20 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|                                                    | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
  File "E:\automatic/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 145, in process_modelscope
    print("Received an image for inpainting", args.inpainting_image.name)
AttributeError: 'NoneType' object has no attribute 'name'
Exception occurred: 'NoneType' object has no attribute 'name'

Additional information

No response

rbfussell commented 1 year ago

This is a multipart issue, but luckily both parts are on the same line. The first is a problem with img2vid inpainting when inpainting frames > 0 but no image is loaded.

inpainting frames = 0

 text2video extension for auto1111 webui
Git commit: 7a5504b4 (Sat Apr 15 15:03:37 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|                                                                                                                                                                          | 0/1 [00:00<?, ?it/s]latents torch.Size([1, 4, 24, 32, 32]) tensor(0.0027, device='cuda:0') tensor(1.0003, device='cuda:0')
DDIM sampling tensor(1): 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:21<00:00,  1.26s/it]
STARTING VAE ON GPU. 24 CHUNKS TO PROCESS████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:21<00:00,  1.19s/it]
VAE HALVED
DECODING FRAMES
VAE FINISHED
torch.Size([24, 3, 256, 256])
output/mp4s/20230420_070131540593.mp4
text2video finished, saving frames to E:\Projects\AI\sd.webui\webui\outputs/img2img-images\text2video\20230420065941
Got a request to stitch frames to video using FFmpeg.
Frames:
E:\Projects\AI\sd.webui\webui\outputs/img2img-images\text2video\20230420065941\%06d.png
To Video:
E:\Projects\AI\sd.webui\webui\outputs/img2img-images\text2video\20230420065941\vid.mp4
Stitching *video*...
Stitching *video*...
Video stitching done in 0.64 seconds!
t2v complete, result saved at E:\Projects\AI\sd.webui\webui\outputs/img2img-images\text2video\20230420065941

inpainting frames set to >0

text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 7a5504b4 (Sat Apr 15 15:03:37 2023)
Starting text2video
Pipeline setup
device cuda
Working in txt2vid mode
  0%|                                                                                                                                                                          | 0/1 [00:00<?, ?it/s]gir None
Traceback (most recent call last):
  File "E:\Projects\AI\sd.webui\webui\extensions\sd-webui-text2video\scripts\text2vid.py", line 96, in process
    process_modelscope(skip_video_creation, ffmpeg_location, ffmpeg_crf, ffmpeg_preset, fps, add_soundtrack, soundtrack_path, \
  File "E:\Projects\AI\sd.webui\webui\extensions\sd-webui-text2video\scripts\text2vid.py", line 249, in process_modelscope
    print(inpainting_image.name)
AttributeError: 'NoneType' object has no attribute 'name'
Exception occurred: 'NoneType' object has no attribute 'name'

Since there is no image but inpainting_frames > 0, the code assumes args.inpainting_image.name exists; without an image loaded, inpainting_image is None, so the attribute access fails, as sketched below.
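A guard along these lines would avoid the crash (a hypothetical sketch for process_modelscope.py; the real change is in the PR below):

# only touch .name when an image was actually uploaded;
# with no image, args.inpainting_image is None
if args.inpainting_frames > 0 and args.inpainting_image is not None:
    print("Received an image for inpainting", args.inpainting_image.name)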


The second issue was that inpainting_frames should be args.inpainting_frames.
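Roughly a one-token change (illustrative only; see the PR for the actual diff):

if inpainting_frames > 0:       # before: bare name, not defined in this scope
if args.inpainting_frames > 0:  # after: read it from the args namespace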

https://github.com/deforum-art/sd-webui-text2video/pull/127 should fix this once merged.

kabachuha commented 1 year ago

Closing as the PR was merged