kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

Fix[122b] fixes 3rd reported exception found in #122 #129

Closed rbfussell closed 1 year ago

rbfussell commented 1 year ago

clip_hardcode references opts.use_old_emphasis_implementation, which can be missing

Check that the attribute is present before checking its value. Depending on the nature of this option, it may be better to push a fix into the Options class itself, since the attribute is apparently not always defined as either true or false when it perhaps needs to be. I don't know exactly what this option controls, so I'm not sure whether simply disregarding the missing attribute is the preferable fix. I assume it is, since a false and an undefined value should be analogous here, although it is still unexpected behavior for the attribute to be missing at all.
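For reference, a minimal sketch of the guarded check described above (the attribute name and assertion message come from the traceback below; the exact patch applied to clip_hardcode.py may differ):

```python
# Sketch only: treat a missing use_old_emphasis_implementation attribute the
# same as False instead of letting the attribute lookup raise AttributeError.
from modules.shared import opts

# Current behaviour asserts on the option directly:
#     assert not opts.use_old_emphasis_implementation, '...'
# Guarded version, falling back to False when the option is not defined:
use_old_emphasis = getattr(opts, 'use_old_emphasis_implementation', False)
assert not use_old_emphasis, 'Old emphasis implementation not supported for Open Clip'
```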

This is simply a pull request to finalize the closing of #122 (as fix-122b). #127 addressed two of its cases, but on review I realized the reporter listed three distinct errors.

RE: #122 ---

text2video — The model selected is: ModelScope
text2video extension for auto1111 webui
Git commit: https://github.com/deforum-art/sd-webui-text2video/commit/9b79cb8d3ab44de883c5ffafd89dd708f251458a (Sun Apr 16 12:17:20 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
device cuda
Working in txt2vid mode
  0%|          | 0/1 [00:00<?, ?it/s]
Received an image for inpainting C:\Users\barle\AppData\Local\Temp\8e20ccecaa430e3168e76daa5a09d596b98502be\00005-3882147318.png
Converted the frames to tensor (1, 48, 3, 512, 512)
Computing latents
STARTING VAE ON GPU
VAE HALVED
Traceback (most recent call last):
  File "E:\automatic/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 193, in process_modelscope
    samples, _ = pipe.infer(args.prompt, args.n_prompt, args.steps, args.frames, args.seed + batch if args.seed != -1 else -1, args.cfg_scale,
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 216, in infer
    c, uc = self.preprocess(prompt, n_prompt, steps)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 370, in preprocess
    uc = get_conds_with_caching(prompt_parser.get_learned_conditioning, self.clip_encoder, [n_prompt], steps, cached_uc)
  File "E:\automatic/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 363, in get_conds_with_caching
    cache[1] = function(model, required_prompts, steps)
  File "E:\automatic\modules\prompt_parser.py", line 137, in get_learned_conditioning
    conds = model.get_learned_conditioning(texts)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 272, in get_learned_conditioning
    return self.encode(text)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 269, in encode
    return self(text)
  File "E:\automatic\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 363, in forward
    batch_chunks, token_count = self.process_texts(texts)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 247, in process_texts
    chunks, current_token_count = self.tokenize_line(line)
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 150, in tokenize_line
    tokenized = self.tokenize([text for text, _ in parsed])
  File "E:\automatic\extensions\sd-webui-text2video\scripts\modelscope\clip_hardcode.py", line 96, in tokenize
    assert not opts.use_old_emphasis_implementation, 'Old emphasis implementation not supported for Open Clip'
  File "E:\automatic\modules\shared.py", line 484, in __getattr__
    return super(Options, self).__getattribute__(item)
AttributeError: 'Options' object has no attribute 'use_old_emphasis_implementation'
Exception occurred: 'Options' object has no attribute 'use_old_emphasis_implementation'
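For clarity, a hypothetical reduction of the failure mode shown in the traceback: an Options-like class whose `__getattr__` only knows registered settings raises AttributeError for any option that was never defined, while `getattr()` with a default degrades gracefully.

```python
# Hypothetical stand-in for modules.shared.Options, illustrating why the
# assert in clip_hardcode.py blows up when the option was never registered.
class Options:
    data = {}  # registered settings; use_old_emphasis_implementation is absent

    def __getattr__(self, item):
        if item in self.data:
            return self.data[item]
        # Falls through for unknown settings and raises AttributeError,
        # matching the shared.py frame in the traceback above.
        return super(Options, self).__getattribute__(item)

opts = Options()

# opts.use_old_emphasis_implementation  ->  AttributeError (the reported crash)
# getattr() with a default treats the missing option as disabled:
print(getattr(opts, 'use_old_emphasis_implementation', False))  # -> False
```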