linyq2117 / CLIP-ES

MIT License
175 stars 9 forks

Running Problems #13

Closed HYTHYThythyt closed 10 months ago

HYTHYThythyt commented 10 months ago

Hello, I have problems when running the code, like this: "RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.". I have tried many torch versions. Could you tell me what environment you used in your experiments? Thank you very much, and I hope you have a good day.

linyq2117 commented 10 months ago

Hi, thanks for your interest.

It seems that your CUDA memory is not enough. You can check the free memory of the GPU you are using. Besides, you can follow the default command (--num_workers 1) in the README.
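A quick way to check per-GPU free memory from Python is to parse `nvidia-smi` output; this is a minimal sketch, assuming `nvidia-smi` is on the PATH (it returns an empty list otherwise):

```python
import subprocess

def gpu_free_memory_mib():
    """Return free memory (MiB) per visible GPU, or [] if nvidia-smi is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []  # no NVIDIA driver/tooling found on this machine
    return [int(v) for v in out.stdout.split() if v.strip()]

print(gpu_free_memory_mib())
```

If a GPU shows far less free memory than expected, another process is likely holding it.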

HYTHYThythyt commented 10 months ago

Thank you for your answer. I have checked my GPU memory; it is 48G, and I followed your recommendation to set --num_workers=1, but the error still occurs when I run the code on an A6000 server. Could you tell me your experiment environment? It is very important for me to check whether the A6000 can run your work. Thank you!

linyq2117 commented 10 months ago

The initial environment I used is shown in the Requirement section of the README (run on a TITAN RTX). I have also successfully run this code on an A100 and a 3090 with PyTorch 1.9.0+cu111.
Another potential reason is that you set CUDA_VISIBLE_DEVICES=0 in the command but the remaining memory on device 0 is not enough (maybe some other code is running on it).
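If device 0 is busy, you can point the process at a different GPU by setting the environment variable before CUDA is initialized (in practice, before importing torch in the script). A minimal sketch; GPU index 1 here is just an example:

```python
import os

# Hide the busy GPU 0: the process will only see physical GPU 1,
# which it then addresses as cuda:0. This must be set before
# torch's first CUDA call to take effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
print(os.environ["CUDA_VISIBLE_DEVICES"])  # -> 1
```

Equivalently, prefix the training command in the shell with `CUDA_VISIBLE_DEVICES=1`.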

HYTHYThythyt commented 10 months ago

You're right, there is some other code running on GPU0, but it still leaves 40G of free memory. Could you tell me how much memory is occupied when you run CLIP-ES? Thank you for your answers.

linyq2117 commented 10 months ago

It's about 3G.

HYTHYThythyt commented 10 months ago

OK, thank you for your help.