ncoudray / DeepPATH

Classification of Lung cancer slide images using deep-learning

GPU Util is 0 while training with tensorflow-gpu #50

Closed zhenzhenyang-psu closed 5 years ago

zhenzhenyang-psu commented 5 years ago

Hello Nicolas, I am running training with tensorflow-gpu. I think there is no problem with my GPU setup, as can be seen in the first attachment. However, when I use nvidia-smi to check GPU usage, it shows 0 (see the second attachment). Does that mean the GPU is not being utilized? I also saw the following post: https://stackoverflow.com/questions/56271551/tensorflow-not-utilizing-gpu

[Attachment 1: Screen Shot 2019-09-22 at 11 51 53 AM]

[Attachment 2: Screen Shot 2019-09-22 at 11 52 55 AM]

The following is my command:

module load compiler/cuda/7/9.0
export CUDA_VISIBLE_DEVICES=0

~/tools/DeepPATH-master/DeepPATH_code/01_training/xClasses/bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=64 --train_dir='/public/home/yangzhzh/projects/imaging_bladder/3_training/r1_results' --data_dir='/public/home/yangzhzh/projects/imaging_bladder/2_preprocessing/3_convert2TFRecord/r1_TFRecord_train' --ClassNumber=2 --mode='0_softmax' --NbrOfImages=173000 --save_step_for_chekcpoint=2300 --max_steps=230001

Does this matter? I am concerned that this behavior is unexpected. If you could share your thoughts, that would be great. Looking forward to your reply.

Thanks, Zhenzhen

zhenzhenyang-psu commented 5 years ago

It turns out that running "nvidia-smi" myself is not the correct way to check a qsub job. Because the job did not run on my local machine, its GPU usage could not be reflected properly there. The administrator checked my job with "nvidia-smi" and showed that GPU Util is good.
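For anyone hitting the same confusion: nvidia-smi only reports the GPUs of the machine it is run on, so for a batch job it needs to be called on the compute node that executes the job, for example from inside the job script. The following is a minimal Python sketch of such a check, not part of DeepPATH. The function names `parse_gpu_stats` and `query_gpu_stats` are hypothetical, and it assumes nvidia-smi is on the PATH of the compute node and supports the `--query-gpu` CSV output.

```python
import csv
import io
import shutil
import subprocess

# Fields requested from nvidia-smi, in the order they will appear in each CSV row.
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def _to_int(text):
    """Convert a field to int; return None for values such as '[N/A]'."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_gpu_stats(csv_text):
    """Parse the output of
    nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total
               --format=csv,noheader,nounits
    into a list of dicts, one per GPU."""
    stats = []
    for row in csv.reader(io.StringIO(csv_text)):
        values = [field.strip() for field in row]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        index, util, mem_used, mem_total = (_to_int(v) for v in values)
        stats.append({
            "index": index,
            "utilization_pct": util,
            "memory_used_mib": mem_used,
            "memory_total_mib": mem_total,
        })
    return stats


def query_gpu_stats():
    """Run nvidia-smi on the current machine and return the parsed stats.

    Returns an empty list if nvidia-smi is missing or exits with an error.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_stats(result.stdout)


if __name__ == "__main__":
    # Run this from inside the job (on the compute node), not on the login node.
    for gpu in query_gpu_stats():
        print(gpu)
```

Calling this script periodically from within the submitted job and writing the output to the job log gives a view of utilization on the node that is actually doing the training. Run on a login node, it would only describe that node's GPUs, if it has any.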

ncoudray commented 5 years ago

OK, I'll close the issue then.