DaRinker closed this issue 1 year ago:
I may have solved this. Originally I built my plmblast environment on a CPU node.
This time I rebuilt the conda environment while logged onto a GPU node, and when using this new environment, the above command seems to work correctly.
(While this doesn't 100% make sense to me, there are many aspects of working on a shared cluster that don't.)
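In case it helps anyone else: my guess (an assumption on my part, not something I verified in the pLM-BLAST code) is that building the environment on a CPU node pulled in a CPU-only PyTorch build from conda. That's easy to check from inside the environment:

```python
# Check which PyTorch build the conda environment actually resolved to.
# A CPU-only build reports torch.version.cuda as None, no matter which node it runs on.
import torch

print("torch version:  ", torch.__version__)
print("built with CUDA:", torch.version.cuda)   # None => CPU-only build
```

If `torch.version.cuda` prints `None`, the environment has the CPU-only build and no GPU will ever be detected, regardless of the node.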
Command I'm trying to run is:
```
python $PLMBLAST_PATH/embeddings.py merged.proteomes.csv merged.proteomes -embedder pt -cname sequence --gpu -bs 0 --asdir
```
Error I'm getting:
```
Traceback (most recent call last):
  File "/bin/pLM-BLAST/embeddings.py", line 13, in <module>
    df = validate_args(args, verbose=True)
  File "/bin/pLM-BLAST/embedders/parser.py", line 118, in validate_args
    raise ValueError('gpu is not available, but device is set to gpu and what now?')
ValueError: gpu is not available, but device is set to gpu and what now?
```
Node info (NVIDIA GPU):
```
NodeName=gpu0042 Arch=x86_64 CoresPerSocket=12
   CPUAlloc=5 CPUEfctv=24 CPUTot=24 CPULoad=3.14
   Gres=gpu:4
   NodeAddr=gpu0042 NodeHostName=gpu0042 Version=22.05.7
   OS=Linux 3.10.0-1160.71.1.el7.x86_64 #1 SMP Tue Jun 28 15:37:28 UTC 2022
   RealMemory=385400 AllocMem=49152 FreeMem=325347 Sockets=2 Boards=1
   MemSpecLimit=5120
   State=MIXED ThreadsPerCore=2 TmpDisk=0 Weight=50 Owner=N/A MCS_label=N/A
   Partitions=turing
   BootTime=2023-08-10T09:54:27 SlurmdStartTime=2023-09-11T19:07:47
   LastBusyTime=2023-09-11T19:07:05
   CfgTRES=cpu=24,mem=385400M,billing=24,gres/gpu=4
   AllocTRES=cpu=5,mem=48G,gres/gpu=4
   CapWatts=n/a CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
```
EDIT: I tried both the "gpu" and the "cuda" options. Neither seems to recognize my GPU.
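For anyone landing here with the same error, here is a minimal diagnostic sketch (my own, not part of pLM-BLAST) to tell a CPU-only torch build apart from a job that simply wasn't allocated a GPU by Slurm:

```python
import os
import torch

# Slurm normally sets CUDA_VISIBLE_DEVICES inside jobs that were granted a GPU;
# if it is unset or empty here, the job never received a GPU, regardless of the torch build.
print("CUDA_VISIBLE_DEVICES:", os.environ.get("CUDA_VISIBLE_DEVICES"))
print("torch built with CUDA:", torch.version.cuda)             # None => CPU-only build
print("torch.cuda.is_available():", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device 0:", torch.cuda.get_device_name(0))
```

On most Slurm setups, CUDA_VISIBLE_DEVICES is only populated inside jobs that actually requested a GPU (e.g. --gres=gpu:1), so an empty value points at the allocation rather than the PyTorch build.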