zhenzhenyang-psu closed this issue 5 years ago
It turns out that checking a qsub job with "nvidia-smi" the way I did is not correct. Because I did not run the job locally, the GPU usage I saw did not reflect the job. The administrator checked my job with "nvidia-smi" and showed that GPU-Util is good.
OK, I'll close the issue then.
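For anyone hitting the same confusion: one way to avoid it is to sample GPU usage from inside the job itself, so the reading comes from the machine where training actually runs. Below is a minimal sketch, assuming `nvidia-smi` is on the PATH of the node running the job. The helper names `parse_gpu_csv` and `sample_gpus` are made up for illustration and are not part of DeepPATH or TensorFlow.

```python
import subprocess

# Fields requested from nvidia-smi; these are standard --query-gpu properties.
QUERY = "index,utilization.gpu,memory.used"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes every field is numeric; some devices report "[N/A]" for
    utilization, which this sketch does not handle.
    """
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = [field.strip() for field in line.split(",")]
        rows.append(
            {
                "index": int(index),
                "util_percent": int(util),
                "memory_mib": int(mem),
            }
        )
    return rows


def sample_gpus():
    """Run nvidia-smi on the current machine and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `sample_gpus()` periodically from a small script started inside the qsub job, and writing the results to a log file, would show the utilization on the node that is doing the training rather than on the login node.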
Hello Nicolas, I am running training with tensorflow-gpu. I don't think there is a problem with my GPU setup, as can be seen in the first attachment. However, when I use nvidia-smi to check GPU usage, it shows 0 (see the second attachment). Does that mean the GPU is not being utilized? I also found the following post: https://stackoverflow.com/questions/56271551/tensorflow-not-utilizing-gpu
The following is my command:
module load compiler/cuda/7/9.0
export CUDA_VISIBLE_DEVICES=0
~/tools/DeepPATH-master/DeepPATH_code/01_training/xClasses/bazel-bin/inception/imagenet_train \
  --num_gpus=1 \
  --batch_size=64 \
  --train_dir='/public/home/yangzhzh/projects/imaging_bladder/3_training/r1_results' \
  --data_dir='/public/home/yangzhzh/projects/imaging_bladder/2_preprocessing/3_convert2TFRecord/r1_TFRecord_train' \
  --ClassNumber=2 \
  --mode='0_softmax' \
  --NbrOfImages=173000 \
  --save_step_for_chekcpoint=2300 \
  --max_steps=230001
Does this matter? I am concerned that this is unexpected. If you could share your thoughts, that would be great! Looking forward to your reply.
Thanks, Zhenzhen