Naver-AI-Hackathon / cs492I


Wrong GPU count #52

Closed seonghyeonye closed 3 years ago

seonghyeonye commented 3 years ago

For our class, the number of GPUs is limited to 8 per team. We are currently running 3 sessions with 2 GPUs each (expected total: 6 GPUs). However, when we try to start another session with 2 GPUs, the system says that 7 GPUs are already in use. We cannot tell where the one additional GPU comes from. Could this be checked? Thank you.

frednam93 commented 3 years ago

I have the same problem. I am currently running 6 sessions with 1 GPU each, and the GPU usage is reported as 8.


The same thing happened last week: when I used only 7 GPUs, the system said 8 were being used. Could you please check whether this problem can be fixed? Thank you.

bluebrush commented 3 years ago

I deleted the zombie sessions listed below. If you check again, you should see the correct number of GPUs. The deleted sessions are as follows.

nsml rm -f kaist002/korquad-open-ldbd3/126 
Done
nsml rm -f kaist006/korquad-open-ldbd3/67
Done
nsml rm -f kaist0013/korquad-open-ldbd3/46
Done
nsml rm -f kaist007/korquad-open-ldbd3/154
Done
nsml rm -f kaist0013/korquad-open-ldbd3/84
Done
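
For anyone who wants to double-check their own quota, the arithmetic from the two reports in this thread can be sketched as below. This is only an illustration in plain Python: the session names and the `gpu_discrepancy` helper are hypothetical and are not part of the nsml tool. A positive result means the system is counting more GPUs than your known sessions account for, which is what a leftover zombie session would cause.

```python
def gpu_discrepancy(sessions, reported_total):
    """Return how many reported GPUs are not explained by the known sessions.

    sessions: dict mapping a session name (hypothetical labels here)
              to the number of GPUs that session requested.
    reported_total: the GPU count the system says is in use.
    """
    expected = sum(sessions.values())
    return reported_total - expected


# First report: 3 sessions with 2 GPUs each, system reports 7 in use.
first = gpu_discrepancy({"session_a": 2, "session_b": 2, "session_c": 2}, 7)

# Second report: 6 sessions with 1 GPU each, system reports 8 in use.
second = gpu_discrepancy({f"session_{i}": 1 for i in range(6)}, 8)

print(first, second)
```

If the result is greater than zero after you have listed all of your running sessions, the extra GPUs are probably held by a session that no longer shows up as active.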