karaposu / comfyui-on-cloud

Gcloud does not use GPU #3

Open suanow opened 3 weeks ago

suanow commented 3 weeks ago

Hi! First of all, thanks for the repo, it helped me a lot. I followed all the steps and managed to run ComfyUI on GCloud, but when I use it, nvidia-smi shows that the process is running with Type="C" (CPU). Is there any chance you know how to fix it? I tried these two solutions, but they don't seem to work for me:

  1. Go to ComfyUI/comfy/cli_args.py, search for "--cuda-device" and replace "default=None" with "default=0" (my device is 0; roughly as sketched below)
  2. Add --cuda-device 0 after main.py in '.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build'
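
For reference, the edit in step 1 amounts to changing the argparse default, something like this (a sketch; the exact line in ComfyUI/comfy/cli_args.py may differ between versions):

# in ComfyUI/comfy/cli_args.py -- changed default=None to default=0
parser.add_argument("--cuda-device", type=int, default=0, metavar="DEVICE_ID",
                    help="Set the id of the cuda device this instance will use.")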

thanks in advance!

karaposu commented 3 weeks ago

Hi @suanow, let's check these:

  1. Is CUDA visible to PyTorch?
  2. Has GPU access been taken away by Google due to peak-hour demand?

Let's check 1 first. Go to any directory and run this line: pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118. Then run nano test_cuda.py (this will open an editor; paste the code below inside, press Ctrl+X to save and exit, and then y to confirm the save).

import torch

# Expect True if PyTorch can see the GPU
print(torch.cuda.is_available())
# Expect the GPU model name, e.g. "Tesla T4"
print(torch.cuda.get_device_name(0))

and then let's run our test_cuda.py file with

python test_cuda.py

suanow commented 2 weeks ago

Hey! Sorry for the late response. I've tested this and it shows that the GPU is available. The prints are: True, Tesla T4.

Also, when I run python main.py from the ComfyUI dir, I get "Set cuda device to: 0". I guess it starts the process using the GPU (because it prints "Total VRAM 14931 MB, total RAM 7177 MB"), but when I tried to SSH into localhost using gcloud compute ssh --project [PROJECT_ID] --zone [ZONE] [VM_NAME] -- -L 7860:localhost:7860 (not sure this is right, but ComfyUI opened), nothing changed.

And finally, what might also help to understand the problem: the initial issue is that when I run large models (IDM-VTON in my case, via the Load IDM-VTON Pipeline node), I get a "Reconnecting" window, which drops the process after reloading.

GPU-server commented 6 days ago

Is it possible that using IDM-VTON takes a lot of VRAM? So when it reaches the 16 GB of VRAM it says there is no more VRAM available, hence the CUDA-related error messages? Try a normal workflow and see if you don't get the same error. Also @karaposu, is there a way to run something that shows the GPU consumption inside the CLI? Something similar to what Windows has (Task Manager > Performance).

karaposu commented 6 days ago

Yup, VTON is quite GPU hungry.
You can run nvidia-smi and see the consumption. If you want to watch it constantly, do nvidia-smi -l 1.
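
If you'd rather watch it from inside Python, here is a minimal sketch (assumes PyTorch with CUDA; note it only counts memory allocated by the current PyTorch process, whereas nvidia-smi shows all processes):

import time
import torch

# Print CUDA memory stats for device 0 once per second (Ctrl+C to stop)
while True:
    alloc = torch.cuda.memory_allocated(0) / 1024**2    # MB held by live tensors
    reserved = torch.cuda.memory_reserved(0) / 1024**2  # MB reserved by the caching allocator
    print(f"allocated: {alloc:.0f} MB, reserved: {reserved:.0f} MB")
    time.sleep(1)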