patientx / ComfyUI-Zluda

The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface. Now ZLUDA enhanced for better AMD GPU performance.
GNU General Public License v3.0

Error: No module named 'safetensors' #3

Closed deepfold9118 closed 6 months ago

deepfold9118 commented 6 months ago

When running the install.bat, everything proceeds up to the point when running main.py. The following error is generated:

Traceback (most recent call last):
  File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\main.py", line 73, in <module>
    import comfy.utils
  File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\comfy\utils.py", line 5, in <module>
    import safetensors.torch
ModuleNotFoundError: No module named 'safetensors'

I also had additional errors with the following packages: yaml (PyYAML), psutil, einops, and transformers.

After installing these packages manually, ComfyUI runs.

Neku222 commented 6 months ago

I've manually entered venv and installed missing modules. Not only safetensors is missing. Yaml and psutil too. After that I still got: "RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx" with RX 5700XT.

If you want to enter that venv, open cmd in \ComfyUI and type:
venv\scripts\activate
After that you can install stuff using pip, for example:
pip install safetensors
If you want to install yaml, type:
pip install pyyaml
Dunno if you need matching versions to get ComfyUI to work.
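To find every missing module in one pass instead of hitting the errors one by one, a small check along these lines could be run with the venv's Python. This is only a sketch: the mapping lists just the modules named in this thread, not the project's full requirements, and `missing_packages` is a made-up helper name.

```python
import importlib.util

# Import name -> pip package name. Only the modules named in this thread;
# this is an illustrative list, not the project's full requirements.
REQUIRED = {
    "safetensors": "safetensors",
    "yaml": "pyyaml",
    "psutil": "psutil",
    "einops": "einops",
    "transformers": "transformers",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of modules this interpreter cannot find."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("pip install " + " ".join(missing))
    else:
        print("all listed modules are importable")
```

Run it after venv\scripts\activate so that it inspects the venv and not the system Python.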

Neku222 commented 6 months ago

Oh, so it worked for you. Guess in my case Zluda won't launch with my GPU. What is your GPU?

deepfold9118 commented 6 months ago

My GPU is XFX Speedster Merc 319 6800XT / 16GB

I was able to get ComfyUI to run, but not to generate an image, see here: https://github.com/comfyanonymous/ComfyUI/pull/2829#issuecomment-2096622000

Neku222 commented 6 months ago

I see. I only managed to get ROCm with ComfyUI working on Linux with my RX 5700XT, and I had to use version 5.1. On Windows nothing worked at all. DirectML of course worked, but very slowly.

patientx commented 6 months ago

For both of you :

MAKE SURE AMD HIP IS INSTALLED AND THAT IT IS IN THE PATH.

First delete the zluda folder inside the main folder, then try running install.bat again. It should install the missing packages and install the correct torch and torchvision over them, uninstalling the CPU ones.

If there are still NVIDIA errors, make sure these three files, with these file sizes, are inside venv\Lib\site-packages\torch\lib\ :

nvrtc64_112_0.dll 125 KB
cusparse64_11.dll 194 KB
cublas64_11.dll 196 KB

There should be a zluda folder inside the comfyui-zluda folder. Copy cublas.dll, cusparse.dll and nvrtc.dll from it to that torch lib folder, renaming them to the names above. If there are already files with larger sizes there, torch isn't correctly patched, and that is what gives the NVIDIA driver and CUDA errors.
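The size check can also be scripted. The sketch below uses the three sizes quoted in this comment; the 2 KB tolerance is my own assumption (Explorer rounds sizes, and another ZLUDA build may differ), and `check_patched` is a hypothetical helper, not part of the repo.

```python
from pathlib import Path

# Sizes quoted in this thread, in KB. They belong to the ZLUDA build
# discussed here; another release may ship different sizes.
EXPECTED_KB = {
    "nvrtc64_112_0.dll": 125,
    "cusparse64_11.dll": 194,
    "cublas64_11.dll": 196,
}


def check_patched(lib_dir, expected_kb=EXPECTED_KB, tolerance_kb=2):
    """Map each DLL name to 'ok', 'missing', or 'size N KB' on a mismatch."""
    report = {}
    for name, want in expected_kb.items():
        path = Path(lib_dir) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        size_kb = round(path.stat().st_size / 1024)
        if abs(size_kb - want) <= tolerance_kb:
            report[name] = "ok"
        else:
            report[name] = f"size {size_kb} KB"
    return report
```

For example, `check_patched(r"venv\Lib\site-packages\torch\lib")` run from the comfyui-zluda folder; any entry that is not "ok" points at a file that was not replaced.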

Also, we just installed the whole thing for a friend with a 6800 XT and everything worked correctly. Try not to miss a step, please.

The first part of install.bat installs the standard requirements from ComfyUI proper, THEN uninstalls torch and torchvision, WHICH are the CPU builds at that point, THEN installs torch & torchvision with rocm support. AFTER that it copies the 3 DLLs I mentioned above to that torch lib directory, so that the system thinks there is an NVIDIA GPU that supports torch, while ZLUDA actually converts the calls to be used by ROCm.

If we have the correct torch, patched with those 3 DLLs, AND the modified py files that this release has, ComfyUI will automatically start using ZLUDA.
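The copy-and-rename step could be written roughly as follows. This is not what install.bat or patchzluda.bat actually contain; it is a sketch of the renaming described in this thread, and `patch_torch_lib` is a hypothetical name.

```python
import shutil
from pathlib import Path

# Name in the ZLUDA folder -> name torch loads from torch\lib,
# following the renaming described in this thread.
RENAMES = {
    "cublas.dll": "cublas64_11.dll",
    "cusparse.dll": "cusparse64_11.dll",
    "nvrtc.dll": "nvrtc64_112_0.dll",
}


def patch_torch_lib(zluda_dir, torch_lib_dir, renames=RENAMES):
    """Copy each ZLUDA DLL over its CUDA-named counterpart; return the targets."""
    zluda_dir, torch_lib_dir = Path(zluda_dir), Path(torch_lib_dir)
    # Fail before copying anything if a source file is absent.
    missing = [src for src in renames if not (zluda_dir / src).is_file()]
    if missing:
        raise FileNotFoundError(f"not found in {zluda_dir}: {', '.join(missing)}")
    copied = []
    for src, dst in renames.items():
        target = torch_lib_dir / dst
        shutil.copyfile(zluda_dir / src, target)  # overwrites an existing file
        copied.append(target)
    return copied
```

Checking for all three source files first means a wrong source name stops the patch before anything in torch\lib is overwritten.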


Neku222 commented 6 months ago

I did everything from the beginning, and after running install.bat I still get:

Traceback (most recent call last):
  File "M:\stable diffiusion\ComfyUI-Zluda\main.py", line 73, in <module>
    import comfy.utils
  File "M:\stable diffiusion\ComfyUI-Zluda\comfy\utils.py", line 5, in <module>
    import safetensors.torch
ModuleNotFoundError: No module named 'safetensors'

AMD HIP is installed and it's in Path. Restarted the PC for good measure and it's the same. File sizes are:

nvrtc64_112_0.dll 125 KB
cusparse64_11.dll 193 KB (?)
cublas64_11.dll 196 KB

Also, should I be getting this kind of error while the requirements are installing?

ERROR: Could not find a version that satisfies the requirement torchsde (from versions: none)
ERROR: No matching distribution found for torchsde

patientx commented 6 months ago

Install manually, then. Open cmd, go to the comfyui-zluda folder and run:

python -m venv venv

Activate the venv with:

venv\scripts\activate

Then:

pip install -r requirements.txt
pip uninstall torch torchvision -y
pip install torch==2.2.0 torchvision --index-url https://download.pytorch.org/whl/cu118

Then delete the zluda folder if it is there and run patchzluda.bat.

Check that the three DLLs with the correct sizes are there inside the torch lib folder.

If everything worked until this point, you can run
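The manual pip steps above could also be driven from a script, which makes it obvious which step fails. A sketch under one assumption of mine: it calls the venv's own interpreter (venv\Scripts\python.exe -m pip) instead of activating the venv, which has the same effect for pip. The function names are hypothetical, and patchzluda.bat still has to be run afterwards.

```python
import subprocess
from pathlib import Path


def manual_install_commands(repo_dir, python="python"):
    """Return the argv lists for the manual steps, in order."""
    repo = Path(repo_dir).resolve()
    # Windows venv layout; pip runs through the venv's interpreter.
    pip = [str(repo / "venv" / "Scripts" / "python.exe"), "-m", "pip"]
    return [
        [python, "-m", "venv", "venv"],
        pip + ["install", "-r", "requirements.txt"],
        pip + ["uninstall", "torch", "torchvision", "-y"],
        pip + ["install", "torch==2.2.0", "torchvision",
               "--index-url", "https://download.pytorch.org/whl/cu118"],
    ]


def run_all(repo_dir):
    """Run each step inside repo_dir, stopping at the first failure."""
    repo = Path(repo_dir).resolve()
    for argv in manual_install_commands(repo):
        print(">", " ".join(argv))
        subprocess.run(argv, cwd=repo, check=True)
```

With `check=True`, a failing step such as the torchsde error raises immediately instead of being buried in later output.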

Neku222 commented 6 months ago

Everything is correct. Still:

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

At this point I'm pretty sure that the RX 5700XT (GFX1010) can't run ZLUDA with ROCm 5.7.

Neku222 commented 6 months ago

Hmm. But with newest zluda i managed to run Geekbench 5 and it shows: "Device Name | AMD Radeon RX 5700 XT [ZLUDA]". Dunno what's wrong.

Neku222 commented 6 months ago

Ok. So I managed to get it to run. I dragged start.bat onto zluda.exe to run it with ZLUDA (stupid, but it works). It seems that installing requirements.txt only installs the 2.7 GB torch for AMD and ignores the rest of the modules. After dragging start.bat I had to manually install about 8 modules, one after another. After installing them all, ComfyUI starts and I can go to the web interface.

`Total VRAM 8176 MB, total RAM 49111 MB
Set vram state to: NORMAL_VRAM
--------------------------------ZLUDA------------------------------------
Detected ZLUDA, support for it is experimental and comfy may not work properly.
Disabling cuDNN because ZLUDA does currently not support it.
Disabling flash because ZLUDA does currently not support it.
Enabling math_sdp.
Disabling mem_efficient_sdp because ZLUDA does currently not support it.
-------------------------------------------------------------------------
Device: cuda:0 AMD Radeon RX 5700 XT [ZLUDA] : native
VAE dtype: torch.bfloat16
Using pytorch cross attention

Import times for custom nodes: 0.0 seconds: M:\stablediffiusion\ComfyUI-Zluda\custom_nodes\websocket_image_save.py

Starting server

To see the GUI go to: http://127.0.0.1:8188`

After that I still have this error (a UserWarning):

M:\stablediffiusion\ComfyUI-Zluda\comfy\ldm\modules\attention.py:345: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
  out = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0, is_causal=False)

Neku222 commented 6 months ago

Wow, it works now. But there is a problem after the first generation, which took about 10 minutes, as expected:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total capacity of 7.98 GiB of which 5.57 GiB is free. Of the allocated memory 2.15 GiB is allocated by PyTorch, and 63.34 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

So I have to use --lowvram then, if I want to generate 1024x1024?
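The numbers in the out-of-memory message are worth comparing directly. The sketch below pulls them out with a regular expression; the pattern only covers this exact wording with GiB units (other PyTorch versions may phrase it differently), and `parse_oom` is a hypothetical helper.

```python
import re

_GIB = r"(\d+(?:\.\d+)?) GiB"
_PATTERN = re.compile(
    rf"Tried to allocate {_GIB}\. GPU \d+ has a total capacity of {_GIB} "
    rf"of which {_GIB} is free"
)


def parse_oom(message):
    """Return (requested, capacity, free) in GiB, or None if the text does not match."""
    match = _PATTERN.search(message)
    return tuple(float(g) for g in match.groups()) if match else None


MESSAGE = (
    "CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0 has a total "
    "capacity of 7.98 GiB of which 5.57 GiB is free."
)

requested, capacity, free = parse_oom(MESSAGE)
print(f"requested {requested} GiB, capacity {capacity} GiB, free {free} GiB")
print("request exceeds total capacity:", requested > capacity)
```

Here the single 8.00 GiB request is larger than the card's whole 7.98 GiB, so freeing memory alone cannot satisfy it; the allocation itself would have to become smaller.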

patientx commented 6 months ago

what is your gpu ?

patientx commented 6 months ago

If it is still not working: I have completely redone the requirements.txt part, so if you can, please reinstall from zero. That means deleting the comfyui-zluda folder and starting again from cloning, if possible.

Neku222 commented 6 months ago

It works the way I've done it, but I can't get it to generate at higher resolutions. At 512x512 it's fast: ~1.5 it/s with the default checkpoint. I guess now it's all about memory optimization. With ROCm on Linux I can generate even 2048x2048, even if it's a bit slow. At 1024x1024 with ZLUDA I still get "torch.cuda.OutOfMemoryError: CUDA out of memory.".

patientx commented 6 months ago

what is your gpu

Neku222 commented 6 months ago

I typed my gpu model like 5 times. RX 5700XT(GFX1010) Navi10 architecture.

patientx commented 6 months ago

Have you done the library part from the readme ...

Neku222 commented 6 months ago

???????? You even read what I said before?

patientx commented 6 months ago

Sorry, I was answering questions on Discord, so I only just saw that you have a 5700 XT. As I wrote in the readme, I didn't try this on cards below the 6000 series, but others might have. The only solution I have for you is to use this as the library: https://github.com/brknsoul/ROCmLibs/blob/main/ROCmLibs_Testing.7z . If you still get those errors even with this, I don't have a solution. Sorry again...

Neku222 commented 6 months ago

But, as I said before, IT'S WORKING FINE if I don't set the resolution too high. ComfyUI is working on my RX 5700XT with ZLUDA. The error only occurs at higher resolutions like 1024x1024. At 512x512 it's fast. I've been using that ROCmLibs_Testing from the very beginning.

patientx commented 6 months ago

Always use comfyui-manager for ease of use. From there, disable previews (preview: none), use tiled VAE decode instead of VAE decode with the tile size set to 512, and don't keep too many apps open in the background. That's all I can say. I have a dedicated browser for AI stuff and don't use other browsers, or I just open one tab at most to copy some prompts :).

And when starting ComfyUI you can try the --lowvram option, for example, to make it use system memory a bit more. I needed that on DirectML, but not now. I have an 8 GB RX 6600. Our GPUs are similar on paper, but that library probably isn't on par with the 6000+ GPUs when it comes to memory management.

deepfold9118 commented 6 months ago

I have deleted the ComfyUI folder, created a new environment, ensured that AMD HIP is installed and in the PATH, and I followed your recommendations precisely. And running with --lowvram produced the same persistent error:

Error occurred when executing CheckpointLoaderSimple:

CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\nodes.py", line 516, in load_checkpoint
out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\comfy\sd.py", line 473, in load_checkpoint_guess_config
model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\comfy\supported_models_base.py", line 60, in get_model
out = model_base.BaseModel(self, model_type=self.model_type(state_dict, prefix), device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\comfy\model_base.py", line 62, in __init__
self.diffusion_model = unet_model(**unet_config, device=device, operations=operations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\comfy\ldm\modules\diffusionmodules\openaimodel.py", line 491, in __init__
operations.Linear(model_channels, time_embed_dim, dtype=self.dtype, device=device),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Files\stable_diffusion\releases\ComfyUI-Zluda\venv\Lib\site-packages\torch\nn\modules\linear.py", line 98, in __init__
self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For clarity, this is the console output after running python main.py.

C:\Files\stable_diffusion\releases\ComfyUI-Zluda>python main.py
Total VRAM 16368 MB, total RAM 65447 MB
Set vram state to: NORMAL_VRAM
***--------------------------------ZLUDA------------------------------------***
Detected ZLUDA, support for it is experimental and comfy may not work properly.
Disabling cuDNN because ZLUDA does currently not support it.
Disabling flash because ZLUDA does currently not support it.
Enabling math_sdp.
Disabling mem_efficient_sdp because ZLUDA does currently not support it.
***-------------------------------------------------------------------------***
Device: cuda:0 AMD Radeon RX 6800 XT [ZLUDA] : cudaMallocAsync
VAE dtype: torch.bfloat16
Using pytorch cross attention
Adding extra search path checkpoints C:\Files\stable_diffusion\models/checkpoints/
Adding extra search path clip C:\Files\stable_diffusion\models/clip/
Adding extra search path clip_vision C:\Files\stable_diffusion\models/clip_vision/
Adding extra search path configs C:\Files\stable_diffusion\models/configs/
Adding extra search path controlnet C:\Files\stable_diffusion\models/controlnet/
Adding extra search path embeddings C:\Files\stable_diffusion\models/embeddings/
Adding extra search path loras C:\Files\stable_diffusion\models/loras/
Adding extra search path upscale_models C:\Files\stable_diffusion\models/upscale_models/
Adding extra search path vae C:\Files\stable_diffusion\models/vae/

Import times for custom nodes:
   0.0 seconds: C:\Files\stable_diffusion\releases\ComfyUI-Zluda\custom_nodes\websocket_image_save.py

Starting server

To see the GUI go to: http://127.0.0.1:8188
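
The "[ZLUDA]" tag on the "Device:" line in logs like the one above is a quick way to tell whether torch is really going through ZLUDA. A minimal sketch, assuming it is run inside the ComfyUI-Zluda venv so that `import torch` picks up the patched torch; both function names are made up for illustration.

```python
def is_zluda_device(device_name):
    """True if a device name ends with the '[ZLUDA]' tag seen in the logs here."""
    return device_name.strip().endswith("[ZLUDA]")


def describe_torch_device():
    """Report what torch sees; torch is imported lazily so the helper above stays standalone."""
    import torch  # assumed: the torch installed in the ComfyUI-Zluda venv

    if not torch.cuda.is_available():
        return "torch reports no CUDA device"
    name = torch.cuda.get_device_name(0)
    tag = "ZLUDA" if is_zluda_device(name) else "not ZLUDA"
    return f"{name} ({tag})"
```

Calling `print(describe_torch_device())` from the venv gives a one-line answer without starting the whole server.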

patientx commented 6 months ago

This is from the first time you tried to generate something, right? What model have you tried it with? With your GPU there shouldn't be a memory problem at all ... Looking back at normal ComfyUI, "--disable-cuda-malloc" used to be suggested for NVIDIA users. Maybe try that: "python main.py --disable-cuda-malloc", or add it after "set COMMANDLINE_ARGS= " in start.bat. I have never encountered this error before, so I don't know anything about it.

deepfold9118 commented 6 months ago

I have tried 1.5 pruned and SD XL 1.0; both produce the same error.

--disable-cuda-malloc just causes comfy to crash.

Kademo15 commented 6 months ago

I had the same issue and fixed it. Basically, @patientx, your patch for the DLL files is broken/wrong.

I don't get how this, in your install.bat:

curl -s -L https://github.com/lshqqytiger/ZLUDA/releases/download/rel.2804604c29b5fa36deca9ece219d3970b61d4c27/ZLUDA-windows-amd64.zip > zluda.zip
tar -xf zluda.zip
del zluda.zip
copy zluda\cublas64_11.dll venv\Lib\site-packages\torch\lib\ /y
copy zluda\cusparse64_11.dll venv\Lib\site-packages\torch\lib\ /y
copy zluda\nvrtc64_112_0.dll venv\Lib\site-packages\torch\lib\ /y

is patching the ZLUDA files, because inside the release you download there aren't any 64_11 variants of the DLL files. Somehow the files are present inside the torch lib folder after the install finishes. But when I compared them to the originals from the ZLUDA repo, I found that they are way too big, so something isn't right.

After downloading ZLUDA myself, making copies of the cublas and cusparse DLL files, renaming them and placing them inside the torch lib folder, everything worked.
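The comparison against the originals can be done byte for byte instead of by size. A sketch using SHA-256 hashes; the function names and the `renames` argument are hypothetical, and the mapping shown in the note covers only the two files named in this comment.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large DLLs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_with_release(zluda_dir, torch_lib_dir, renames):
    """Map each target DLL name to True if it is byte-identical to its ZLUDA source."""
    result = {}
    for src, dst in renames.items():
        source = Path(zluda_dir) / src
        target = Path(torch_lib_dir) / dst
        result[dst] = (
            source.is_file()
            and target.is_file()
            and sha256_of(source) == sha256_of(target)
        )
    return result
```

For example, with `renames = {"cublas.dll": "cublas64_11.dll", "cusparse.dll": "cusparse64_11.dll"}`, a False entry means the file in torch\lib is not the one from the ZLUDA download.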

patientx commented 6 months ago

Sorry, big sorry. It seems I messed up the target file names when I last changed the file, 8 hours ago. The process is now fixed in both install.bat and patchzluda.bat. I wiped and reinstalled from zero, and everything is working as it should now.

unclemusclez commented 6 months ago

I have the same issue: if I drag and drop the bat file it works, but otherwise it doesn't, and I get this error:

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

I also used the link https://github.com/lshqqytiger/ZLUDA/releases/latest/download/ZLUDA-windows-amd64.zip and it launched. This is ZLUDA version 3.7.