LarryJane491 / Lora-Training-in-Comfy

This custom node lets you train LoRA directly in ComfyUI!

python cant open train_network.py [errno2] no such file or directory #7

Open Wreckitchin opened 8 months ago

Wreckitchin commented 8 months ago

got prompt [] The following values were not passed to accelerate launch and had defaults used instead: --num_processes was set to a value of 1 --num_machines was set to a value of 1 --mixed_precision was set to a value of 'no' --dynamo_backend was set to a value of 'no' To avoid this warning pass in values for each of the problematic parameters or run accelerate config. C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\python.exe: can't open file 'H:\AI Art\ComfyUI_windows_portable\ComfyUI_windows_portable\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py': [Errno 2] No such file or directory Traceback (most recent call last): File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 996, in main() File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 992, in main launch_command(args) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command simple_launcher(args) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\python.exe', 'custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=H:\AI', 'Art\ComfyUI_windows_portable\ComfyUI_windows_portable\ComfyUI\models\checkpoints\v1-5-pruned-emaonly.ckpt', 
'--train_data_dir=C:/Database/5_test', '--output_dir=models/loras', '--logging_dir=./logs', '--log_prefix=Test', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=40', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=Test', '--train_batch_size=1', '--save_every_n_epochs=5', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=27', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 2. Train finished Prompt executed in 4.16 seconds

romainoir commented 8 months ago

similar error: The following values were not passed to accelerate launch and had defaults used instead: --num_processes was set to a value of 1 --num_machines was set to a value of 1 --mixed_precision was set to a value of 'no' --dynamo_backend was set to a value of 'no' To avoid this warning pass in values for each of the problematic parameters or run accelerate config. C:\Users\Calcul\AppData\Local\Programs\Python\Python310\python.exe: can't open file 'D:\ComfyUI_windows_portable\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py': [Errno 2] No such file or directory Traceback (most recent call last): File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1033, in main() File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1029, in main launch_command(args) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1023, in launch_command simple_launcher(args) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 643, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Calcul\AppData\Local\Programs\Python\Python310\python.exe', 'custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:\ComfyUI_windows_portable\ComfyUI\models\checkpoints\nextphoto_v10.safetensors', '--train_data_dir=D:/Lora/textures/images/20_textures', '--output_dir=D:\ComfyUI_windows_portable\ComfyUI\models\loras', 
'--logging_dir=./logs', '--log_prefix=textures', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=10', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=textures', '--train_batch_size=1', '--save_every_n_epochs=10', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=8', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=3', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 2. Train finished Prompt executed in 3.42 seconds

LarryJane491 commented 8 months ago

I see in both cases that you included the full path to the images. Don't. ^^' As I explained in the tutorial and the main page, data_path must be the path to the folder containing that folder.

@romainoir : remove /20_textures from the path

@Wreckitchin : Your path is the one I wrote as an example. Is this the real path to your dataset? Assuming it is, remove the last part too.

I wish the user could select the path to the images directly, sadly it's not possible. Every other Lora training tool does the same, because the scripts we all use require the images to be in a folder containing a folder named [number]_[something].
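The folder rule above can be checked before launching a training run. This is only a sketch, not part of the node: the function names are made up for illustration, and the pattern simply encodes the "[number]_[something]" naming described above.

```python
import re
from pathlib import Path

# Naming convention described above: [number]_[something], e.g. "5_test".
SUBFOLDER_PATTERN = re.compile(r"^\d+_.+$")


def find_image_subfolders(data_path):
    """Return the names of subfolders of data_path named [number]_[something]."""
    root = Path(data_path)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and SUBFOLDER_PATTERN.match(child.name)
    )


def looks_like_image_folder(data_path):
    """True if data_path itself is named like an image folder (a likely mistake)."""
    return bool(SUBFOLDER_PATTERN.match(Path(data_path).name))
```

With the paths from this thread, `looks_like_image_folder("C:/Database/5_test")` would flag the first attempt, while `C:/Database` is the value the node expects, provided it contains a folder such as `5_test`.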

Wreckitchin commented 8 months ago

image image2

Does this look correct?

I'm still getting this error:

The following values were not passed to accelerate launch and had defaults used instead: --num_processes was set to a value of 1 --num_machines was set to a value of 1 --mixed_precision was set to a value of 'no' --dynamo_backend was set to a value of 'no' To avoid this warning pass in values for each of the problematic parameters or run accelerate config. C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\python.exe: can't open file 'H:\AI Art\ComfyUI_windows_portable\ComfyUI_windows_portable\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py': [Errno 2] No such file or directory Traceback (most recent call last): File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 996, in main() File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 992, in main launch_command(args) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command simple_launcher(args) File "C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\MRIRONGOLD\AppData\Local\Programs\Python\Python310\python.exe', 'custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=H:\AI', 'Art\ComfyUI_windows_portable\ComfyUI_windows_portable\ComfyUI\models\checkpoints\v1-5-pruned-emaonly.ckpt', '--train_data_dir=C:/Database', 
'--output_dir=models/loras', '--logging_dir=./logs', '--log_prefix=Test', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=40', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=Test', '--train_batch_size=1', '--save_every_n_epochs=5', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=14', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 2. Train finished Prompt executed in 5.00 seconds

LarryJane491 commented 8 months ago

Hmm, now it's correct indeed. I'm not confident enough to reply, I'd need more information:

Did you download from github using the "Download as ZIP" button? If you did, the folder would be called "Lora-Training-in-Comfy-main". Remove the "-main".

Does your ComfyUI folder have a venv folder? Or a python_embedded folder? I think that could be the issue. Did you use the one-click install for ComfyUI? I did too, but then ditched it due to Python dependency issues. I think that could be the problem here. I'll dig a bit deeper here.

Wreckitchin commented 8 months ago

image3 image4

I downloaded it through the ComfyUI Manager by just pasting in the GitHub link. Everything seems to be in the right place, though.

I am using the portable version of ComfyUI, so maybe that's the issue.

romainoir commented 8 months ago

@LarryJane491 Thank you for your quick reply. Same as @Wreckitchin: it's a portable version of ComfyUI and I've installed your repo via the Manager.

Since there was that "accelerate error", I installed it locally (without a venv) with Python 3.10, but it's still not working. I also tried this: PS D:\ComfyUI_windows_portable\python_embeded> .\python.exe -m pip install accelerate but it seems it's already there: Requirement already satisfied: accelerate in d:\comfyui_windows_portable\python_embeded\lib\site-packages (0.24.1)

LarryJane491 commented 8 months ago

Got you guys. Even Google says there is no easy way to do it, so I'll need you to follow me on this one. Take your time if needed, but follow all my instructions.

When running the portable version, we can't install packages easily because we need to install pip first. The good thing is the developer of the one-click install did almost all the work. The bad thing is... they stopped midway x). They added everything to install pip BUT didn't add pip.

Once this final download is ready, you can launch ComfyUI and hope that this worked.

If it didn't, I'll recommend to install Comfy without its one-click install. I'll make a tutorial for it. This method will guarantee success, but I figure you may not want to reinstall Comfy just for my node ^^.

Wreckitchin commented 8 months ago

Collecting ffmpy (from gradio==3.41.2->-r H:\AI Art\ComfyUI_windows_portable\ComfyUI_windows_portable\ComfyUI\custom_nodes\Lora-Training-in-Comfy\requirements_win.txt (line 23)) Using cached ffmpy-0.3.1.tar.gz (5.5 kB) Preparing metadata (setup.py) ... error error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully. │ exit code: 1 ╰─> [6 lines of output] Traceback (most recent call last): File "", line 2, in File "", line 34, in File "C:\Users\MRIRONGOLD\AppData\Local\Temp\pip-install-4xwcigf4\ffmpy_c287d7f6b053497380308a0bba3dcec3\setup.py", line 4, in from ffmpy import version ModuleNotFoundError: No module named 'ffmpy' [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. error: metadata-generation-failed

× Encountered error while generating package metadata. ╰─> See above for output.

note: This is an issue with the package mentioned above, not pip. hint: See above for details.

I'm getting this error when it's installing ffmpy.

romainoir commented 8 months ago

On my side, I followed your instructions to install the specific requirements, with no errors so far. After restarting the server and launching the training, I still get an error:

The following values were not passed to accelerate launch and had defaults used instead: --num_processes was set to a value of 1 --num_machines was set to a value of 1 --mixed_precision was set to a value of 'no' --dynamo_backend was set to a value of 'no' To avoid this warning pass in values for each of the problematic parameters or run accelerate config. C:\Users\Calcul\AppData\Local\Programs\Python\Python310\python.exe: can't open file 'D:\ComfyUI_windows_portable\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py': [Errno 2] No such file or directory Traceback (most recent call last): File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1033, in main() File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1029, in main launch_command(args) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 1023, in launch_command simple_launcher(args) File "C:\Users\Calcul\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\commands\launch.py", line 643, in simple_launcher raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['C:\Users\Calcul\AppData\Local\Programs\Python\Python310\python.exe', 'custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:\ComfyUI_windows_portable\ComfyUI\models\checkpoints\nextphoto_v10.safetensors', '--train_data_dir=D:/Lora/textures/images', '--output_dir=models/loras', '--logging_dir=./logs', '--log_prefix=TexturA', 
'--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=10', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=TexturA', '--train_batch_size=1', '--save_every_n_epochs=10', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=28', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 2. Train finished Prompt executed in 4.02 seconds

Path from the error: 'D:\ComfyUI_windows_portable\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py'

Path from Windows Explorer: "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py"

LarryJane491 commented 8 months ago

Okay, this is the point where I have to recommend a fresh install of Comfy. You don't have to uninstall yours! What matters is you don't use the one click install this time.

Good thing I had been preparing a Python guide. I finished it and just posted it on Reddit:

https://www.reddit.com/r/comfyui/comments/1995whb/guide_learn_to_deal_with_python_programs/

Although it's general, I use Comfy and my custom node as an example. Follow it up until you can run the Training node. Then put the model you want to use for LoRA training in the checkpoint folder of that new install. Refresh in Comfy, fill the Training node, and launch training.

In the worst-case scenario, the last thing I can think of is installing Python 3.11. I don't see why 3.10 wouldn't work, but it's my last resort ^^'.

Wreckitchin commented 8 months ago

Tried your solution and it worked. It's strange that I couldn't get it to work with the portable version, but oh well, I'm probably better off with this version anyway.

Thanks for the guide and helping us out.

LarryJane491 commented 8 months ago

Oof, glad it worked out ^^. I couldn't get it to work on the portable version either, so it's not just you, don't worry.

alexbofa commented 8 months ago

the same problem

OliviaOliveiira commented 8 months ago

Any thoughts about making it work on portable?)

LarryJane491 commented 8 months ago

@OliviaOliveiira : I would if I could x). The thing is the problem doesn't come from the node. It comes from how Python works. Even though it didn't work for the others, I suggest trying this method :

Got you guys. Even Google says there is no easy way to do it, so I'll need you to follow me on this one. Take your time if needed, but follow all my instructions.

When running the portable version, we can't install packages easily because we need to install pip first. The good thing is the developer of the one-click install did almost all the work. The bad thing is... they stopped midway x). They added everything to install pip BUT didn't add pip.

  • Go in the python_embeded folder. Open a command prompt here (if you don't know how: click on the path, erase it all, type cmd and press Enter).
  • In the folder, do you see "get-pip.py"? It's a file to install pip inside the embedded python. In order to do that: type python get-pip.py in the command prompt and press Enter. This launches the download of pip.
  • After the download, type this in the prompt: .\Scripts\pip install -r [WITH A SPACE AFTER -r and don't press Enter yet!]
  • After this, go in the Lora-Training-in-Comfy folder. Grab the requirements file (requirements_win.txt if you're on windows) and drag it in the command prompt. This will write that file path in the prompt, after the -r. Again, there should be a space between the r and the path. Press Enter now. This installs the requirements INSIDE the embedded Python.
  • Go to https://pytorch.org/. On this website you'll see an Install Pytorch section, where you have several options (for OS, for Pytorch Build, ...). Select the right options for your situation. If you have an RTX GPU, you must choose CUDA 12.1. This section gives you a line of code. Copy it.
  • Go back to the command prompt, type .\Scripts\ then paste the code the site gave you. In my case the final code looks like this: .\Scripts\pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121.
  • Then press Enter. It launches yet another download.

Once this final download is ready, you can launch ComfyUI and hope that this worked.

If it didn't, I'll recommend to install Comfy without its one-click install. I'll make a tutorial for it. This method will guarantee success, but I figure you may not want to reinstall Comfy just for my node ^^.

Obviously it's not guaranteed to work, but that's all I can think of. Using get-pip to install packages inside the embedded python is the standard solution to that kind of issue. Honestly though, I warmly recommend installing a "base" version of Python, as I explained above. You don't need to erase your current folder.
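The point of the steps above is that packages must land inside the embedded Python rather than a system-wide one. One way to make that explicit is to call pip through the interpreter itself (`python.exe -m pip`), as was already done earlier in the thread for accelerate. The sketch below only illustrates that idea; the function names are hypothetical, it is not part of the node, and it assumes pip has already been installed into the embedded Python with get-pip.py.

```python
import subprocess


def build_pip_install_command(python_exe, requirements_file):
    """Build a pip command tied to one specific interpreter.

    Running pip as "<python_exe> -m pip" installs into that interpreter's
    site-packages, whatever a bare "pip" on PATH happens to point to.
    """
    return [str(python_exe), "-m", "pip", "install", "-r", str(requirements_file)]


def install_requirements(python_exe, requirements_file):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(build_pip_install_command(python_exe, requirements_file), check=True)
```

For a portable install, `python_exe` would be the `python.exe` inside the python_embeded folder and `requirements_file` the node's `requirements_win.txt`. Passing the command as a list also keeps paths containing spaces in one piece.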

@alexbofa : I can't help without more information ^^'. But for now try the solutions above. As a reminder:

OliviaOliveiira commented 8 months ago

We've found a way to fix it in the meantime: it was all caused by a wrong path. You have to modify train.py and change the path to train_network.py at the end to ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py

LarryJane491 commented 8 months ago

Ahhhhh, nice! It makes sense, the working directory is one step above (compared to the base version). I hadn't realized it before. The launcher is one step above indeed...
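To illustrate why the working directory matters, here is a small sketch that resolves the relative script path seen in the error logs against two different starting folders. The function name is made up for the example; the paths are the ones reported earlier in this thread.

```python
from pathlib import PureWindowsPath

# Relative path passed to the launcher, as it appears in the error logs.
RELATIVE_SCRIPT = "custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py"


def resolve_from(working_dir, relative_path=RELATIVE_SCRIPT):
    """Show where a relative path lands when the process starts in working_dir."""
    return PureWindowsPath(working_dir) / relative_path


# Portable build: the launcher starts one level above the ComfyUI folder,
# so the path points at a custom_nodes folder that does not exist there.
print(resolve_from(r"D:\ComfyUI_windows_portable"))

# Started from inside the ComfyUI folder, the same relative path is valid.
print(resolve_from(r"D:\ComfyUI_windows_portable\ComfyUI"))
```

The first result matches the path in the error message, the second matches the location of the file in Windows Explorer.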

I am rewriting this line right now (like, literally right now). This requirement was too strict anyway. I'll see whether that makes it work in both versions at once.

LarryJane491 commented 8 months ago

I've pushed an update. Aside from adding new nodes, I made the folder structure for the node less annoying. The name and number of folders between the program and the training script don't matter anymore. Therefore, it SHOULD work with all versions of Comfy, whether it's the portable version or not, as long as you install the requirements in the right folder.
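A common way to get this kind of independence from the folder layout is to locate the training script relative to the node's own file instead of the working directory. The sketch below shows that general technique only; it is an assumption about how such a lookup can be written, not the code that was actually pushed, and the function name is hypothetical.

```python
from pathlib import Path


def locate_training_script(node_file, relative=("sd-scripts", "train_network.py")):
    """Resolve the script next to the node's own file, ignoring the working directory."""
    return Path(node_file).resolve().parent.joinpath(*relative)
```

Inside the node this would be called as `locate_training_script(__file__)`, so the result stays the same whether ComfyUI is started from the portable launcher or from inside the ComfyUI folder.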

.... But I have a cuDNN error trying to run it in the portable version x). Looking it up online, I'm pretty sure I just don't have the right version of cuDNN in the right place.

Anyway, can somebody try the new version in a portable ComfyUI? You can git pull or redownload, now it doesn't matter how you get the node.

OliviaOliveiira commented 8 months ago

Forgot to mention that we used Kohya's venv to run it all; you'd need a better list of requirements to make it all work!)

alexbofa commented 8 months ago

yes, using venv and requirements.txt for installing from this repository helps https://github.com/serpotapov/Kohya_ss-GUI-LoRA-Portable

LarryJane491 commented 8 months ago

@OliviaOliveiira : I made it work without using Kohya's requirements, so no. There's probably a requirements conflict that gets solved when you use Kohya's requirements, but the node doesn't need it to work on its own.

uraniumcrystalsmaster commented 8 months ago

I downloaded the GitHub zip file on January 29th and had the same issue as the first report: https://github.com/LarryJane491/Lora-Training-in-Comfy/issues/7#issue-2086381525, but with --num_processes set to 0. I tried to fix it by completing every procedure provided in this issue. After completing this step: https://github.com/LarryJane491/Lora-Training-in-Comfy/issues/7#issuecomment-1896260499, ComfyUI no longer opens and gives the following error:

C:\Users\urani\ComfyUI_windows_portable>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --force-fp16
** ComfyUI startup time: 2024-02-01 00:41:50.696105
** Platform: Windows
** Python version: 3.11.6 (tags/v3.11.6:8b6ee5b, Oct  2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]
** Python executable: C:\Users\urani\ComfyUI_windows_portable\python_embeded\python.exe
** Log path: C:\Users\urani\ComfyUI_windows_portable\comfyui.log

Prestartup times for custom nodes:
   0.0 seconds: C:\Users\urani\ComfyUI_windows_portable\ComfyUI\custom_nodes\rgthree-comfy
   0.1 seconds: C:\Users\urani\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Manager

Traceback (most recent call last):
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\main.py", line 76, in <module>
    import execution
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\execution.py", line 11, in <module>
    import nodes
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\nodes.py", line 20, in <module>
    import comfy.diffusers_load
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\comfy\diffusers_load.py", line 3, in <module>
    import comfy.sd
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 3, in <module>
    from comfy import model_management
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 118, in <module>
    total_vram = get_total_memory(get_torch_device()) / (1024 * 1024)
                                  ^^^^^^^^^^^^^^^^^^
  File "C:\Users\urani\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 87, in get_torch_device
    return torch.device(torch.cuda.current_device())
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\urani\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\__init__.py", line 674, in current_device
    _lazy_init()
  File "C:\Users\urani\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

It will take a long time to cut and paste all the large files over to another ComfyUI installation, so is there a fix for this, or a way to undo the process you explained (https://github.com/LarryJane491/Lora-Training-in-Comfy/issues/7#issuecomment-1896260499)?

LarryJane491 commented 8 months ago

@uraniumcrystalsmaster : Hey buddy. I know the problem. Torch isn't using CUDA, and ComfyUI won't open without it using CUDA. The solution is simple: go to Pytorch.org, the page has a tab that gives you a code line. Tick the right options for your situation, it gives you a line of code. Then use that code line in a command prompt to install it. Since you're using the portable version, make sure to install it in the embedded python folder.

You haven't lost anything and don't need to delete/move anything. Just make sure to install "Torch with CUDA".

uraniumcrystalsmaster commented 8 months ago

As stated, I installed it from Pytorch.org into the python embedded folder before. Installing it again did not fix the error. I set the package to pip, but was I supposed to choose libTorch or Source? I installed Cuda 11.8 because I have Nvidia GTX 1060 6GB, and not an RTX graphics card. Here's what I did in the cmd prompt: ''' Microsoft Windows [Version 10.0.19045.3930] (c) Microsoft Corporation. All rights reserved.

C:\Users\urani\ComfyUI_windows_portable\python_embeded>pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 Looking in indexes: https://download.pytorch.org/whl/cu118 Requirement already satisfied: torch in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (1.13.1) Requirement already satisfied: torchvision in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (0.14.1) Collecting torchaudio Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.2.0%2Bcu118-cp310-cp310-win_amd64.whl (4.0 MB) ---------------------------------------- 4.0/4.0 MB 8.2 MB/s eta 0:00:00 Requirement already satisfied: typing-extensions in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torch) (4.5.0) Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torchvision) (9.4.0) Requirement already satisfied: requests in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torchvision) (2.28.2) Requirement already satisfied: numpy in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torchvision) (1.24.2) Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.1.2%2Bcu118-cp310-cp310-win_amd64.whl (3.9 MB) ---------------------------------------- 3.9/3.9 MB 13.9 MB/s eta 0:00:00 Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.1.1%2Bcu118-cp310-cp310-win_amd64.whl (3.9 MB) ---------------------------------------- 3.9/3.9 MB 11.9 MB/s eta 0:00:00 Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.1.0%2Bcu118-cp310-cp310-win_amd64.whl (3.9 MB) ---------------------------------------- 3.9/3.9 MB 10.4 MB/s eta 0:00:00 Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.0.2%2Bcu118-cp310-cp310-win_amd64.whl (2.5 MB) ---------------------------------------- 2.5/2.5 MB 10.4 MB/s eta 0:00:00 
  Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.0.1%2Bcu118-cp310-cp310-win_amd64.whl (2.5 MB)
     ---------------------------------------- 2.5/2.5 MB 17.4 MB/s eta 0:00:00
  Downloading https://download.pytorch.org/whl/cu118/torchaudio-2.0.0%2Bcu118-cp310-cp310-win_amd64.whl (2.5 MB)
     ---------------------------------------- 2.5/2.5 MB 17.5 MB/s eta 0:00:00
INFO: pip is looking at multiple versions of torchvision to determine which version is compatible with other requirements. This could take a while.
Collecting torchvision
  Downloading https://download.pytorch.org/whl/cu118/torchvision-0.17.0%2Bcu118-cp310-cp310-win_amd64.whl (4.9 MB)
     ---------------------------------------- 4.9/4.9 MB 18.5 MB/s eta 0:00:00
Collecting torch
  Downloading https://download.pytorch.org/whl/cu118/torch-2.2.0%2Bcu118-cp310-cp310-win_amd64.whl (2704.3 MB)
     ---------------------------------------- 2.7/2.7 GB ? eta 0:00:00
Collecting typing-extensions
  Downloading https://download.pytorch.org/whl/typing_extensions-4.8.0-py3-none-any.whl (31 kB)
Collecting sympy
  Using cached https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
Requirement already satisfied: filelock in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torch) (3.9.1)
Collecting networkx
  Downloading https://download.pytorch.org/whl/networkx-3.2.1-py3-none-any.whl (1.6 MB)
     ---------------------------------------- 1.6/1.6 MB 20.7 MB/s eta 0:00:00
Collecting jinja2
  Using cached https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Requirement already satisfied: fsspec in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from torch) (2023.12.2)
Collecting MarkupSafe>=2.0
  Using cached https://download.pytorch.org/whl/MarkupSafe-2.1.3-cp310-cp310-win_amd64.whl (17 kB)
Requirement already satisfied: idna<4,>=2.5 in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from requests->torchvision) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from requests->torchvision) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from requests->torchvision) (2022.12.7)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages (from requests->torchvision) (3.1.0)
Collecting mpmath>=0.19
  Using cached https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, typing-extensions, sympy, networkx, MarkupSafe, jinja2, torch, torchvision, torchaudio
  Attempting uninstall: typing-extensions
    Found existing installation: typing_extensions 4.5.0
    Uninstalling typing_extensions-4.5.0:
      Successfully uninstalled typing_extensions-4.5.0
  Attempting uninstall: torch
    Found existing installation: torch 1.13.1
    Uninstalling torch-1.13.1:
      Successfully uninstalled torch-1.13.1
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.14.1
    Uninstalling torchvision-0.14.1:
      Successfully uninstalled torchvision-0.14.1
Successfully installed MarkupSafe-2.1.3 jinja2-3.1.2 mpmath-1.3.0 networkx-3.2.1 sympy-1.12 torch-2.2.0+cu118 torchaudio-2.2.0+cu118 torchvision-0.17.0+cu118 typing-extensions-4.8.0

[notice] A new release of pip available: 22.2.1 -> 23.3.2
[notice] To update, run: C:\Users\urani\AppData\Local\Programs\Python\Python310\python.exe -m pip install --upgrade pip

C:\Users\urani\ComfyUI_windows_portable\python_embeded> '''

LarryJane491 commented 8 months ago

Pip and CUDA 11.8 are the right options, no problem here. Are you sure it was actually installed in the embedded Python folder? Can you look into that folder, go where it installed the packages, and check whether there are folders with "torch" in their names? Make sure they were modified/created when you tried that fix. If the date doesn't fit, then it was installed elsewhere.
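
As an alternative to browsing folders by hand, here is a minimal sketch that asks one interpreter where it would import a package from. It uses only the standard library; `locate_package` is a hypothetical helper name, not something from ComfyUI or this node. Run it with the interpreter you care about (for example the python.exe inside python_embeded) to see which copy of torch that interpreter actually sees.

```python
import importlib.util
import sys
from pathlib import Path


def locate_package(name):
    """Return the folder a top-level package would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)  # looks the package up without importing it
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


if __name__ == "__main__":
    # sys.executable is the interpreter running this script.
    print("interpreter:", sys.executable)
    print("torch found in:", locate_package("torch"))
```

If the printed folder is under appdata rather than python_embeded, the install went to the system Python and not to the embedded one.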

This error when launching ComfyUI happens because Torch with CUDA is not installed, I guarantee you that. I ran into it several times during many tests, and every time the answer was to install Torch with CUDA. ComfyUI itself isn't broken, don't worry ^^. It's just that it isn't able to use the GPU.
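
To check this directly, a small sketch along these lines reports whether the torch visible to the current interpreter is a CUDA build. `torch_cuda_report` is a hypothetical helper name; `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()` are the PyTorch attributes it relies on.

```python
def torch_cuda_report():
    """Summarise whether torch is importable here and whether it is a CUDA build."""
    try:
        import torch
    except ImportError:
        # torch is not installed for the interpreter running this code.
        return {"installed": False, "version": None, "cuda_build": None, "gpu_available": False}
    return {
        "installed": True,
        "version": torch.__version__,       # CUDA wheels carry a suffix such as "+cu118"
        "cuda_build": torch.version.cuda,   # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

A `cuda_build` of None would mean a CPU-only torch was picked up, which matches the "Torch not compiled with CUDA enabled" situation.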

Otherwise, I can only recommend downloading a new version of ComfyUI, not the portable one, and following the process from the link I posted above:

https://www.reddit.com/r/comfyui/comments/1995whb/guide_learn_to_deal_with_python_programs/

If you don't want to move the files, there is a way to "link" the program to external folders. I'll link another github post with the solution:

https://github.com/comfyanonymous/ComfyUI/discussions/72

It will allow you to use the paths to the folders from your first ComfyUI into the second one. It's a bit of a Frankenstein mess, but it should work.

uraniumcrystalsmaster commented 8 months ago

I noticed torch was missing from my extensions folder, but not torch-2.2.0. I thought the torch folder I needed was being installed in c:\users\urani\appdata\local\programs\python\python310\lib\site-packages, but when I copied and pasted the torch folder from there to C:\Users\urani\ComfyUI_windows_portable\python_embeded, it made me replace existing files in site-packages, and now ComfyUI displays the same error here, but doesn't display an error: https://github.com/LarryJane491/Lora-Training-in-Comfy/issues/7#issuecomment-1920596827

I don't understand why what I think are duplicate program files are being stored in appdata, or why, when installing from the PyTorch website, it says it's already installed in appdata when I specified the directory as C:\Users\urani\ComfyUI_windows_portable\python_embeded. One of the suggestions on this stackexchange was to enable long_paths in PowerShell (admin), but I tried this and it did not fix the issue. What is the command to uninstall pip?

My next plan, if you can't figure it out, is to download ComfyUI portable again and replace the torch file first, then, if that doesn't work, replace all files in C:\Users\urani\ComfyUI_windows_portable\python_embeded

uraniumcrystalsmaster commented 7 months ago

> Pip and CUDA 11.8 are the right options, no problem here. Are you sure it was actually installed in the embedded Python folder? Can you look into that folder, go where it installed the packages, and check whether there are folders with "torch" in their names? Make sure they were modified/created when you tried that fix. If the date doesn't fit, then it was installed elsewhere.
>
> This error when launching ComfyUI happens because Torch with CUDA is not installed, I guarantee you that. I ran into it several times during many tests, and every time the answer was to install Torch with CUDA. ComfyUI itself isn't broken, don't worry ^^. It's just that it isn't able to use the GPU.
>
> Otherwise, I can only recommend downloading a new version of ComfyUI, not the portable one, and following the process from the link I posted above:
>
> https://www.reddit.com/r/comfyui/comments/1995whb/guide_learn_to_deal_with_python_programs/
>
> If you don't want to move the files, there is a way to "link" the program to external folders. I'll link another github post with the solution:
>
> comfyanonymous/ComfyUI#72
>
> It will allow you to use the paths to the folders from your first ComfyUI into the second one. It's a bit of a Frankenstein mess, but it should work.

I cannot get ComfyUI to launch no matter what I try. I tried to download ComfyUI manually, but it still gives me the same error. I installed torch in the ComfyUI folder because the ComfyUI installation guide does not provide a location to install it:

C:\Users\urani\ComfyUI_manual_install\ComfyUI>python main.py --force-fp16
Traceback (most recent call last):
  File "C:\Users\urani\ComfyUI_manual_install\ComfyUI\main.py", line 73, in <module>
    import comfy.utils
  File "C:\Users\urani\ComfyUI_manual_install\ComfyUI\comfy\utils.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

LarryJane491 commented 7 months ago

> I don't understand why what I think are duplicate program files are being stored in appdata, or why, when installing from the PyTorch website, it says it's already installed in appdata when I specified the directory as C:\Users\urani\ComfyUI_windows_portable\python_embeded

Welcome to Python ^^. Honestly, all I can recommend at this point is to download the "default" version of ComfyUI and create a venv for it, like I showed in the guide I posted earlier. When you install dependencies in a virtual environment, you are absolutely certain of the folder they're installed in.
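
For reference, a venv can also be created from Python itself with the standard-library `venv` module. This is only a sketch of the idea, not the exact steps from the linked guide; `make_venv` is a hypothetical helper name and the target folder is whatever you choose.

```python
import venv
from pathlib import Path


def make_venv(target, with_pip=True):
    """Create a virtual environment at `target` and return the path to its Python executable."""
    target = Path(target)
    venv.create(target, with_pip=with_pip)
    # Windows puts the interpreter in Scripts\, other platforms in bin/.
    windows_python = target / "Scripts" / "python.exe"
    return windows_python if windows_python.exists() else target / "bin" / "python"
```

Calling `make_venv("venv")` from inside the ComfyUI folder would create a `venv` subfolder there. Anything installed with the returned interpreter's own pip then goes into that folder and nowhere else.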

uraniumcrystalsmaster commented 7 months ago

> > I don't understand why what I think are duplicate program files are being stored in appdata, or why, when installing from the PyTorch website, it says it's already installed in appdata when I specified the directory as C:\Users\urani\ComfyUI_windows_portable\python_embeded
>
> Welcome to Python ^^. Honestly, all I can recommend at this point is to download the "default" version of ComfyUI and create a venv for it, like I showed in the guide I posted earlier. When you install dependencies in a virtual environment, you are absolutely certain of the folder they're installed in.

Good news! I fixed the first problem by deleting and copying over the ComfyUI portable venv directory known as the python embedded folder. I also fixed the 'torch missing' error by removing torch-2.2.0+cu118.dist-info, and reinstalling torch in the appdata directory where my python is located.

Bad news: I'm back to the original issue at the start of this post, which means the issue has not been solved with the update.

I did a git pull and it says your addon is up to date. Reinstalling the addon messes up torch with cuda in either c:\users\urani\appdata\local\programs\python\python310\lib\site-packages folder, C:\Users\urani\ComfyUI_windows_portable\python_embeded folder, or both; and that is what was giving me the Cuda missing error before. If you can, disable auto-installation of pytorch.

LarryJane491 commented 7 months ago

Be careful: embedded Python is not technically a virtual environment. Installing something in the embedded Python folder takes a few more steps: you have to run get-pip.py to install pip in that folder, then install the requirements using this line in that same folder: .\Scripts\pip install -r [path to requirements]
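
A related way to be sure which Python receives the packages is to call pip through a specific interpreter with `-m pip`. The sketch below only builds and runs that command; both function names are hypothetical, and it assumes pip has already been installed for that interpreter (for the embedded Python, via get-pip.py as described above).

```python
import subprocess


def pip_install_command(python_exe, requirements):
    """Build a pip command bound to one specific interpreter."""
    # "-m pip" runs the pip that belongs to python_exe, so packages land in that
    # interpreter's site-packages rather than wherever "pip" on PATH points.
    return [str(python_exe), "-m", "pip", "install", "-r", str(requirements)]


def install_requirements(python_exe, requirements):
    """Run the command and raise CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(python_exe, requirements), check=True)
```

Passing the command as a list also avoids problems with spaces in paths, since each argument is handed to the process separately.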

If you still have the error "Torch not compiled with CUDA enabled", it is still the same problem: Torch with CUDA wasn't installed in the right folder ^^.

uraniumcrystalsmaster commented 7 months ago

I solved this issue by deleting and reinstalling torch and torch-sd 2.2.0+cu118.dist-info in both the appdata and python embedded folder. I'm concerned that the ComfyUI manager incorrectly installs this addon, which gives the torch with cuda missing error, so please warn others not to install this addon with ComfyUI manager.

Can you please create a video on how to install ComfyUI manually?

laitianwen commented 6 months ago

> We've found the way to fix it in the meantime, it all was because of paths fault. You have to modify train.py and change the path to train_network.py in the end to ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py

I see two places with paths. Can you be more specific? I don't know where to change it.
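
For context, the error at the top of this issue comes from a relative path (custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py) that only resolves when ComfyUI is started from one particular working directory. The quoted fix swaps in a different relative path. A different approach, sketched below, is to build an absolute path from the node's own folder. This is not the code in train.py, whose variable names do not appear in this thread; `train_script_path` and `node_dir` are hypothetical names.

```python
from pathlib import Path


def train_script_path(node_dir):
    """Return the absolute path to sd-scripts/train_network.py under the custom node folder."""
    path = Path(node_dir).resolve() / "sd-scripts" / "train_network.py"
    if not path.is_file():
        # Fail early with the full path so a wrong folder is easy to spot.
        raise FileNotFoundError(f"train_network.py not found at {path}")
    return path
```

Inside a custom node, `node_dir` could be `Path(__file__).parent`, which does not depend on the directory ComfyUI was launched from.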