kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies
1.29k stars 108 forks

error using text2vid with CPU #82

Open system3600 opened 1 year ago

system3600 commented 1 year ago

Is there an existing issue for this?

Are you using the latest version of the extension?

What happened?

raise RuntimeError('Attempting to deserialize object on a CUDA ' RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. Exception occurred: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

I get this error every time I use the extension. Can anyone help me?
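
For reference, the error comes from `torch.load` trying to restore tensors onto the device they were saved from (CUDA, in this checkpoint). This is a minimal sketch of the call the error message itself recommends; the path is only a placeholder:

    import torch

    # Default behaviour: storages saved from a CUDA device are restored onto CUDA,
    # which fails when torch.cuda.is_available() is False.
    # state = torch.load("text2video_pytorch_model.pth")

    # What the error message recommends: remap every storage to the CPU.
    state = torch.load(
        "text2video_pytorch_model.pth",     # placeholder path
        map_location=torch.device("cpu"),
    )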

Steps to reproduce the problem

  1. open the web UI (with no Nvidia card - CPU)
  2. go to text2vid tab
  3. try to generate a video

What should have happened?

No response

WebUI and Deforum extension Commit IDs

webui commit id - 41d5450a0a559469b354fd56cac0fe1ab3cf2f40
txt2vid commit id - 6e2f2a816ae79ae996379a359414ae2e61222baf

What GPU were you using for launching?

CPU (I only have an AMD card), 12 GB of RAM

On which platform are you launching the webui backend with the extension?

Local PC setup (Windows)

Settings

(settings screenshot)

Console logs

C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml>git pull
Already up to date.
venv "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: 41d5450a0a559469b354fd56cac0fe1ab3cf2f40
Installing requirements for Web UI
Installing None
Installing onnxruntime-gpu...
Installing None
Installing opencv-python...
Installing None
Installing Pillow...

Launching Web UI with arguments: --opt-sub-quad-attention --medvram --api --disable-safe-unpickle --disable-nan-check --lora-dir D:\Lora --ckpt-dir E:\Modelos IA --no-half --precision autocast --administrator
Interrogations are fallen back to cpu. This doesn't affect on image generation. But if you want to use interrogate (CLIP or DeepBooru), check out this issue: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/10
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
2023-03-28 20:16:59.1400392 [E:onnxruntime:Default, provider_bridge_ort.cc:1304 onnxruntime::TryGetProviderInfo_CUDA] D:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1106 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2023-03-28 20:16:59.1471301 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:541 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
Hypernetwork-MonkeyPatch-Extension not found
[text2prompt] Following databases are available:
    all-mpnet-base-v2 : danbooru_strict
Loading weights [546d287d2f] from C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\models\Stable-diffusion\(ANIME - USE ESSE - incrivel - NSFW) fandermixPlus_v14.safetensors
Creating model from config: C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying sub-quadratic cross attention optimization.
Textual inversion embeddings loaded(32): ( plano de fundo de anime) anime-background-style-v2, (adiciona chamas em volta do personagem) flame_surge_style, (adiciona globins) Style-Goblinmode, (adiciona tentaculos eroticos) corneo_tentacle_sex, (adiciona uma bomba nuclear) emb-nuke2, (baixos poligonos) poly-hd, (cria brasões com escudo) logo-with-face-on-shield, (deixa o gozo mais real - NSFW) realcumAI, (deixa tudo em estilo de desenhos antigos com linhas grossas) durer-style, (deixa tudo magico) style-sylvamagic, (estilho desenhado a mão e antigo - INCRIVEL )sd15_journalSketch, (estilo abstrato para historias misticas) fairy-tale-painting-style, (estilo animação abstrata) SamDoesArt1, (estilo de inverno) Style-Winter, (estilo manga preto e branco) rfktr_bwmnga, (estilo mistico e colorido) Style-NebMagic, (estilo morto - necromancia) Style-Necromancy, (estilo sombrio) tarot512, (fungos the last of us) tloustyle, (infograficos) Style-Info, (pintura chinesa tradicional) _shuimo_, (pixel art incrivelmente detalhada) art by EMB_skstest3, (torna os personagens em psicopatas ) Style-Psycho, (transforma animes em vampiros) vmpr, (transforma em anime POP - INCRIVEL) MakeItPopVA, (transforma todos em herois) hro, (transforma tudo em bosses) bsft, (tudo em estilo GTA5) gta5-artwork, (tudo em revistas porno antigas) pervpulp15, (util para ambientes de aventura) advntr, (util para artes conceituais) concept-art, (util para sexo anal) corneo_anal
Textual inversion embeddings skipped(14): (cria naves espaciais) DaveSpaceFour, (deixa tudo de forma que parece que foi desenhado a mão) InkPunk768, (deixa tudo em rascunho) UlukInkSketch2, (estilo apocalipse gore) Apocofy, (estilo de pintura) PaintStyle3, (estilo minjorney - abstrato- INCRIVEL)rzminjourney, (grandes construções futuristicas) Chadeisson, (livros antigos e artes sinistras)ParchArt, (muito bom para monstros e terror) ScaryMonstersV2, (transforma em pixel art) pixelart-1, (tudo dentro de vidraças) kc16-v1-4000, (tudo em estado apocaliptico) Apoc768, (tudo em um coração) HeartArt, (zumbies gore e terror) hellscape768
Model loaded in 8.7s (load weights from disk: 0.5s, create model: 0.6s, apply weights to model: 6.8s, load VAE: 0.2s, load textual inversion embeddings: 0.4s).
[text2prompt] Loading database with name "all-mpnet-base-v2 : danbooru_strict"...
[text2prompt] Database loaded
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 40.8s (import torch: 5.6s, import gradio: 0.9s, import ldm: 0.9s, other imports: 1.3s, list extensions: 3.6s, list SD models: 5.6s, setup codeformer: 0.1s, load scripts: 5.1s, load SD checkpoint: 9.0s, create ui: 8.4s, gradio launch: 0.2s, scripts app_started_callback: 0.1s).
ModelScope text2video extension for auto1111 webui
Git commit: 6e2f2a81 (Tue Mar 28 13:23:42 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\extensions\sd-webui-modelscope-text2video\scripts\modelscope-text2vid.py", line 75, in process
    pipe = setup_pipeline()
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\extensions\sd-webui-modelscope-text2video\scripts\modelscope-text2vid.py", line 30, in setup_pipeline
    return TextToVideoSynthesis(ph.models_path+'/ModelScope/t2v')
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\extensions\sd-webui-modelscope-text2video\scripts\t2v_pipeline.py", line 84, in __init__
    torch.load(
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\modules\safe.py", line 106, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\modules\safe.py", line 151, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\Administrator\Desktop\SD\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Exception occurred: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Additional information

I'm using CPU mode; the only GPU I have is an AMD card, and as far as I know it isn't supported by text2vid.

FreeBlues commented 1 year ago

Same issue here.

Windows 10, GPU: AMD RX 6600 8 GB

In the settings, I selected CPU-only.

Error log:

To create a public link, set `share=True` in `launch()`.
Startup time: 0.8s (load scripts: 0.2s, create ui: 0.5s).
text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 4fea1ada (Sun Apr 23 10:39:51 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 55, in process_modelscope
    pipe = setup_pipeline()
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 26, in setup_pipeline
    return TextToVideoSynthesis(ph.models_path+'/ModelScope/t2v')
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 86, in __init__
    torch.load(
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 106, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 151, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Exception occurred: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
FreeBlues commented 1 year ago

I found that none of the three modes works.

Error logs for GPU (half precision) and GPU:

E:\Github\stable-diffusion-webui-directml>webui.bat
venv "E:\Github\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Commit hash: fd59537df9420bb14c1d6330ec59e30ce870a481
Installing requirements for Web UI

Launching Web UI with arguments: --opt-split-attention-v1 --medvram --disable-nan-check
Interrogations are fallen back to cpu. This doesn't affect on image generation. But if you want to use interrogate (CLIP or DeepBooru), check out this issue: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/10
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [7234b76e42] from E:\Github\stable-diffusion-webui-directml\models\Stable-diffusion\chilloutmix_Ni.safetensors
Creating model from config: E:\Github\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying v1 cross attention optimization.
Textual inversion embeddings loaded(0):
Model loaded in 1.8s (load weights from disk: 0.4s, create model: 0.3s, apply weights to model: 0.5s, apply half(): 0.6s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 7.3s (import torch: 1.3s, import gradio: 0.8s, import ldm: 0.4s, other imports: 0.9s, setup codeformer: 0.2s, load scripts: 1.1s, load SD checkpoint: 2.0s, create ui: 0.6s).
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:24<00:00,  1.23s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [00:19<00:00,  1.01it/s]
text2video — The model selected is:  ModelScope████████████████████████████████████████| 20/20 [00:19<00:00,  1.15it/s]
 text2video extension for auto1111 webui
Git commit: 4fea1ada (Sun Apr 23 10:39:51 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 55, in process_modelscope
    pipe = setup_pipeline()
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 26, in setup_pipeline
    return TextToVideoSynthesis(ph.models_path+'/ModelScope/t2v')
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 86, in __init__
    torch.load(
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 106, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 151, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Exception occurred: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Interrupted with signal 2 in <frame at 0x00000273FBD6A500, file 'E:\\Github\\stable-diffusion-webui-directml\\webui.py', line 209, code wait_on_server>
Terminate batch job (Y/N)? y

E:\Github\stable-diffusion-webui-directml>webui.bat
venv "E:\Github\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec  6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Commit hash: fd59537df9420bb14c1d6330ec59e30ce870a481
Installing requirements for Web UI

Launching Web UI with arguments: --opt-split-attention-v1 --medvram --disable-nan-check
Interrogations are fallen back to cpu. This doesn't affect on image generation. But if you want to use interrogate (CLIP or DeepBooru), check out this issue: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/10
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [7234b76e42] from E:\Github\stable-diffusion-webui-directml\models\Stable-diffusion\chilloutmix_Ni.safetensors
Creating model from config: E:\Github\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying v1 cross attention optimization.
Textual inversion embeddings loaded(0):
Model loaded in 1.7s (load weights from disk: 0.4s, create model: 0.3s, apply weights to model: 0.5s, apply half(): 0.5s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 7.2s (import torch: 1.3s, import gradio: 0.8s, import ldm: 0.4s, other imports: 0.9s, setup codeformer: 0.2s, load scripts: 1.1s, load SD checkpoint: 1.9s, create ui: 0.5s, gradio launch: 0.2s).
text2video — The model selected is:  ModelScope
 text2video extension for auto1111 webui
Git commit: 4fea1ada (Sun Apr 23 10:39:51 2023)
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\t2v_helpers\render.py", line 24, in run
    vids_pack = process_modelscope(args_dict)
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 55, in process_modelscope
    pipe = setup_pipeline()
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\process_modelscope.py", line 26, in setup_pipeline
    return TextToVideoSynthesis(ph.models_path+'/ModelScope/t2v')
  File "E:\Github\stable-diffusion-webui-directml/extensions/sd-webui-text2video/scripts\modelscope\t2v_pipeline.py", line 86, in __init__
    torch.load(
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 106, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\modules\safe.py", line 151, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "E:\Users\admin\AppData\Local\Programs\Python\Python310\lib\pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Exception occurred: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
FreeBlues commented 1 year ago

After the text2video failure, I tried to use txt2img and got an error.

error log:

 running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "E:\Github\stable-diffusion-webui-directml\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "E:\Github\stable-diffusion-webui-directml\modules\call_queue.py", line 15, in f
    res = func(*args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\modules\ui.py", line 265, in update_token_counter
    token_count, max_length = max([model_hijack.get_prompt_lengths(prompt) for prompt in prompts], key=lambda args: args[0])
  File "E:\Github\stable-diffusion-webui-directml\modules\ui.py", line 265, in <listcomp>
    token_count, max_length = max([model_hijack.get_prompt_lengths(prompt) for prompt in prompts], key=lambda args: args[0])
  File "E:\Github\stable-diffusion-webui-directml\modules\sd_hijack.py", line 219, in get_prompt_lengths
    _, token_count = self.clip.process_texts([text])
AttributeError: 'NoneType' object has no attribute 'process_texts'
Error running process: E:\Github\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
  File "E:\Github\stable-diffusion-webui-directml\extensions\sd-webui-controlnet\scripts\controlnet.py", line 614, in process
    unet = p.sd_model.model.diffusion_model
AttributeError: 'NoneType' object has no attribute 'model'

Error completing request
Arguments: ('task(r7navcdy3i407cw)', 'a cat', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, <scripts.external_code.ControlNetUnit object at 0x00000238886C1B70>, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, None, False, 50) {}
Traceback (most recent call last):
  File "E:\Github\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "E:\Github\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "E:\Github\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "E:\Github\stable-diffusion-webui-directml\modules\processing.py", line 503, in process_images
    res = process_images_inner(p)
  File "E:\Github\stable-diffusion-webui-directml\modules\processing.py", line 592, in process_images_inner
    with torch.no_grad(), p.sd_model.ema_scope():
AttributeError: 'NoneType' object has no attribute 'ema_scope'
GoZippy commented 1 year ago

Any update on this? I have an RX 6700 XT AMD GPU and a CPU, and I cannot get it to initialize with either. Wondering if it is related to PyTorch or something else?

FWIW, I am using lshqqytiger's fork of Automatic1111 for AMD: https://github.com/lshqqytiger/stable-diffusion-webui-directml

I need a workaround. I would love to be running text2video natively on AMD RDNA2/3, but CPU initialization should be enough to get started with the DirectML wrapper, right?

TheSloppiestOfJoes commented 1 year ago

I found a workaround to force everything onto the CPU, but I still get an error about half precision (I haven't been able to find where that's happening yet).

I changed the following code in the t2v_pipeline.py file from:

            torch.load(
                osp.join(self.model_dir, self.config.model["model_args"]["ckpt_unet"]),
                map_location='cpu' if devices.has_mps() else None, # default to cpu when macos, else default behaviour
            ),

to:

            torch.load(
                osp.join(self.model_dir, self.config.model["model_args"]["ckpt_unet"]),
                map_location='cpu' # if devices.has_mps() else None, # default to cpu when macos, else default behaviour
            ),

and:

        self.sd_model.eval()
        if not devices.has_mps():
            self.sd_model.float()

to:

        self.sd_model.eval()
        if 1==1 : # not devices.has_mps():
            self.sd_model.float()

I am not a regular Python user, so this code may not be implemented properly. I believe this will still run into the half-precision issue, but let me know if it doesn't for you.
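
A less hard-coded variant of the same idea is sketched below (untested, with placeholder paths, and not the extension's shipped code): it only remaps storages to the CPU when CUDA is genuinely unavailable, so NVIDIA setups keep the default behaviour.

    import torch
    from os import path as osp

    # Untested sketch generalizing the edit above. model_dir and ckpt_name are
    # placeholders for the values t2v_pipeline.py actually uses.
    model_dir = "models/ModelScope/t2v"
    ckpt_name = "text2video_pytorch_model.pth"

    # Remap to CPU only when CUDA is unavailable; otherwise keep torch.load's default.
    map_loc = None if torch.cuda.is_available() else "cpu"
    state_dict = torch.load(osp.join(model_dir, ckpt_name), map_location=map_loc)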

system3600 commented 1 year ago

@TheSloppiestOfJoes Your changes seem to have worked very well. So well, in fact, that on the DirectML build of the webui the CPU load ended up being passed to the GPU, but it apparently only works correctly in the full model. I'll leave the link to the DirectML webui fork below.

Webui fork modified for DirectML (AMD): https://github.com/lshqqytiger/stable-diffusion-webui-directml

I also took the liberty of making a fork with your fixes; I left credits in the t2v_pipeline update and at the beginning of the fork. I did it to make the fix more accessible.

Fork link: https://github.com/system3600/sd-webui-text2video-Directml-cpu

If you want me to remove the fork because it contains your code, just notify me :)

TheSloppiestOfJoes commented 1 year ago

I'm glad that worked! It was a very rough fix, and I left it that way since I wasn't sure it was actually working. I'm working long hours this week, but on Sunday I think I can make a pull request that will fix this without the need for a forked copy. Feel free to use my code :)

TheSloppiestOfJoes commented 1 year ago

@system3600 When attempting to make a fix (targeting the DirectML fork of SD), I ran into a new error:

RuntimeError: mat1 and mat2 must have the same dtype
Exception occurred: mat1 and mat2 must have the same dtype

Is this the issue you were referring to when you said it "apparently only works correctly in the full model"?
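
For illustration only (this is not the extension's code), that error usually means half-precision weights are meeting full-precision inputs somewhere; a tiny sketch of how it can appear and how casting both sides to one dtype resolves it:

    import torch

    weight = torch.randn(4, 4, dtype=torch.float16)  # half-precision weights
    x = torch.randn(1, 4)                            # float32 input

    try:
        y = x @ weight         # mixed dtypes: raises a dtype-mismatch RuntimeError
    except RuntimeError as e:  # (worded like the error above in recent PyTorch builds)
        print(e)

    y = x @ weight.float()     # cast to a common dtype (full precision on CPU) and it works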

system3600 commented 1 year ago

@system3600 In fact, I'm referring to the full version of the model (not the half-precision one); the full 12 GB model works without errors.

TheSloppiestOfJoes commented 1 year ago

@system3600 The owner of the repository merged my code to fix this. Since I am still having problems, can you please verify that it now works without editing the code?

jppmo commented 1 year ago

I'm having the same problems as you guys with the full model. I was getting this error as well:

RuntimeError: mat1 and mat2 must have the same dtype Exception occurred: mat1 and mat2 must have the same dtype

I tried @TheSloppiestOfJoes's changes locally, and now I'm getting a different error:

RuntimeError: input must be 4-dimensional Exception occurred: input must be 4-dimensional

I'm using an RX 6950 XT with a 13th-gen Intel CPU.

chlin501 commented 2 months ago

I tried updating t2v_pipeline.py as described in the comment above, but it doesn't work. The error I observed in the console is:

Applying attention optimization: InvokeAI... done.
Model loaded in 2.2s (load weights from disk: 0.4s, create model: 0.4s, apply weights to model: 0.9s, apply float(): 0.4s).
text2video — The model selected is: <modelscope> (ModelScope-like)
 text2video extension for auto1111 webui
Git commit: 989f5cfe
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/t2v_helpers/render.py", line 30, in run
    vids_pack = process_modelscope(args_dict, args)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/process_modelscope.py", line 66, in process_modelscope
    pipe = setup_pipeline(args.model)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/process_modelscope.py", line 32, in setup_pipeline
    return TextToVideoSynthesis(get_model_location(model_name))
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/t2v_pipeline.py", line 115, in __init__
    self.diffusion = Txt2VideoSampler(self.sd_model, shared.device, betas=betas)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 102, in __init__
    self.sampler = self.get_sampler(sampler_name, betas=self.betas)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 152, in get_sampler
    sampler = Sampler.init_sampler(self.sd_model, betas=betas, device=self.device)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 87, in init_sampler
    return self.Sampler(sd_model, betas=betas, **kwargs)
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/uni_pc/sampler.py", line 12, in __init__
    self.register_buffer('alphas_cumprod', to_torch(model.alphas_cumprod))
  File "stable-diffusion-webui-1.7.0/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/uni_pc/sampler.py", line 17, in register_buffer
    attr = attr.to(torch.device("cuda"))
  File "stable-diffusion-webui-1.7.0/.venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
Exception occurred: No CUDA GPUs are available

I have also set Settings > Text2Video > VAE Mode to CPU (Low VRAM), but the error remains the same: no CUDA GPUs are available.
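
For what it's worth, the traceback shows the UniPC sampler's register_buffer moving tensors to a hard-coded torch.device("cuda"). A hypothetical, untested adjustment along these lines (not the extension's actual code) would fall back to the CPU when CUDA is absent:

    import torch

    # Pick CUDA only when it is actually available; otherwise stay on the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    def register_buffer(obj, name, attr):
        # Same shape as the sampler's helper, but without assuming a CUDA device.
        if isinstance(attr, torch.Tensor):
            attr = attr.to(device)   # instead of attr.to(torch.device("cuda"))
        setattr(obj, name, attr)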

Any suggestions? Thanks