kabachuha / sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

[Bug]: Torch not compiled with CUDA enabled #224

Status: Open. derodz opened this issue 11 months ago

derodz commented 11 months ago

Is there an existing issue for this?

Are you using the latest version of the extension?

What happened?

I completed a basic install of the web extension but cannot produce any video.

Steps to reproduce the problem

  1. Install per instructions
  2. Select ModelScope
  3. Attempt to generate video with any prompt

What should have happened?

The extension should work on Mac and produce videos.

WebUI and Deforum extension Commit IDs

webui commit id - 68f336b
txt2vid commit id - 20ead10

Torch version

2.0.1

What GPU were you using for launching?

M1 Max (used --use-cpu command line argument)

On which platform are you launching the webui backend with the extension?

Local PC setup (Mac)

Settings

[settings screenshot attached]

Console logs

(base) user1@MBP ~ % cd ~/stable-diffusion-webui;./webui.sh

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on user1 user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.12 (main, Jun 20 2023, 19:43:52) [Clang 14.0.3 (clang-1403.0.22.14.1)]
Version: v1.5.1
Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a

Launching Web UI with arguments: --skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
Loading weights [6ce0161689] from /Users/user1/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /Users/user1/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Startup time: 4.0s (launcher: 0.2s, import torch: 1.4s, import gradio: 0.5s, setup paths: 0.5s, other imports: 0.4s, load scripts: 0.4s, reload hypernetworks: 0.1s, create ui: 0.4s).
Applying attention optimization: InvokeAI... done.
Model loaded in 2.4s (load weights from disk: 0.2s, create model: 0.8s, apply weights to model: 0.5s, apply half(): 0.2s, move model to device: 0.5s, calculate empty prompt: 0.1s).
text2video — The model selected is: ModelScope (ModelScope-like)
 text2video extension for auto1111 webui
Git commit: 20ead103
Starting text2video
Pipeline setup
config namespace(framework='pytorch', task='text-to-video-synthesis', model={'type': 'latent-text-to-video-synthesis', 'model_args': {'ckpt_clip': 'open_clip_pytorch_model.bin', 'ckpt_unet': 'text2video_pytorch_model.pth', 'ckpt_autoencoder': 'VQGAN_autoencoder.pth', 'max_frames': 16, 'tiny_gpu': 1}, 'model_cfg': {'unet_in_dim': 4, 'unet_dim': 320, 'unet_y_dim': 768, 'unet_context_dim': 1024, 'unet_out_dim': 4, 'unet_dim_mult': [1, 2, 4, 4], 'unet_num_heads': 8, 'unet_head_dim': 64, 'unet_res_blocks': 2, 'unet_attn_scales': [1, 0.5, 0.25], 'unet_dropout': 0.1, 'temporal_attention': 'True', 'num_timesteps': 1000, 'mean_type': 'eps', 'var_type': 'fixed_small', 'loss_type': 'mse'}}, pipeline={'type': 'latent-text-to-video-synthesis'})
Traceback (most recent call last):
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/t2v_helpers/render.py", line 30, in run
    vids_pack = process_modelscope(args_dict, args)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/process_modelscope.py", line 65, in process_modelscope
    pipe = setup_pipeline(args.model)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/process_modelscope.py", line 31, in setup_pipeline
    return TextToVideoSynthesis(get_model_location(model_name))
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/t2v_pipeline.py", line 113, in __init__
    self.diffusion = Txt2VideoSampler(self.sd_model, shared.device, betas=betas)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 102, in __init__
    self.sampler = self.get_sampler(sampler_name, betas=self.betas)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 152, in get_sampler
    sampler = Sampler.init_sampler(self.sd_model, betas=betas, device=self.device)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/samplers_common.py", line 87, in init_sampler
    return self.Sampler(sd_model, betas=betas, **kwargs)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/uni_pc/sampler.py", line 12, in __init__
    self.register_buffer('alphas_cumprod', to_torch(model.alphas_cumprod))
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/uni_pc/sampler.py", line 17, in register_buffer
    attr = attr.to(torch.device("cuda"))
  File "/Users/user1/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Exception occurred: Torch not compiled with CUDA enabled

Additional information

No response

kabachuha commented 11 months ago

Are you able to use normal webui's functions like text2image?

kabachuha commented 11 months ago

If you're running on CPU, try selecting CPU for that selection above as well

derodz commented 11 months ago

> Are you able to use normal webui's functions like text2image?

Yes, both txt2img and img2img work great

derodz commented 11 months ago

> If you're running on CPU, try selecting CPU for that selection above as well

What commands do I run? I tried the following and I received the same error:

--use-cpu all --precision full --no-half --skip-torch-cuda-test

kabachuha commented 11 months ago

Oh, well then. I recall there are unconditional 'cuda' references in the code, so I'll need to look through them
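For reference, a minimal sketch of what a device-aware version of that buffer registration could look like, assuming the hard-coded reference is the one in scripts/samplers/uni_pc/sampler.py shown in the traceback above (the pick_device helper is hypothetical, not part of the extension):

import torch

def pick_device():
    # Prefer CUDA, fall back to Apple's MPS backend, then CPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

def register_buffer(self, name, attr):
    # Move tensors to whichever device is actually available
    # instead of hard-coding torch.device("cuda").
    if isinstance(attr, torch.Tensor):
        attr = attr.to(pick_device())
    setattr(self, name, attr)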

derodz commented 11 months ago

> Oh, well then. I recall there are unconditional 'cuda' references in the code, so I'll need to look through them

I tried to modify the following line:

File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/samplers/uni_pc/sampler.py", line 17, in register_buffer
    attr = attr.to(torch.device("cuda"))

to

attr = attr.to(torch.device("mps"))

and

attr = attr.to(torch.device("cpu"))

and instead I get the following error:

Traceback (most recent call last):
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/t2v_helpers/render.py", line 30, in run
    vids_pack = process_modelscope(args_dict, args)
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/process_modelscope.py", line 220, in process_modelscope
    samples, _ = pipe.infer(args.prompt, args.n_prompt, args.steps, args.frames, args.seed + batch if args.seed != -1 else -1, args.cfg_scale,
  File "/Users/user1/stable-diffusion-webui/extensions/sd-webui-text2video/scripts/modelscope/t2v_pipeline.py", line 275, in infer
    self.sd_model.to(self.device)
  File "/Users/user1/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/Users/user1/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 844, in _apply
    self._buffers[key] = fn(buf)
  File "/Users/user1/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
Exception occurred: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
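A minimal sketch of a possible workaround for this second error, assuming the failure comes from float64 buffers inside the model being moved in t2v_pipeline.py (the helper name below is made up for illustration): MPS has no float64 support, so the model would need to be downcast to float32 before being moved to the device.

import torch

def move_model_mps_safe(model: torch.nn.Module, device: torch.device) -> torch.nn.Module:
    # MPS cannot hold float64 tensors, so convert floating-point
    # parameters and buffers to float32 before moving the module over.
    if device.type == "mps":
        model = model.to(torch.float32)
    return model.to(device)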

mengxun commented 9 months ago

Same error

peterschmidler commented 8 months ago

Same error on M2

ManuelW77 commented 7 months ago

MPS doesn't work with float64, as described here: https://github.com/DLR-RM/stable-baselines3/issues/914
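A tiny repro of that limitation, as I understand it from the linked issue (run on an Apple Silicon machine with the MPS backend available; not specific to this extension):

import torch

t = torch.zeros(2, dtype=torch.float64)
# t.to("mps")      # raises: Cannot convert a MPS Tensor to float64 dtype
t = t.float()      # downcast to float32 first
t = t.to("mps")    # now the move succeeds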