Zardinality / WGAN-tensorflow

a tensorflow implementation of WGAN

device = '/gpu:0' doesn't work #11

Open Arthurzhangsheng opened 6 years ago

Arthurzhangsheng commented 6 years ago

I run WGAN.ipynb on my laptop, which has dual graphics (an integrated GPU plus a GTX 980M), using the default configuration (device = '/gpu:0'). During the run the GPU stays idle and only the CPU runs at full load. Does anyone know what's wrong? The GPU works correctly with other deep learning (CNN) code.

Zardinality commented 6 years ago

@Arthurzhangsheng How about removing the following two lines from the beginning of WGAN.ipynb?

    os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"   # see issue #152
    os.environ["CUDA_VISIBLE_DEVICES"]="0"
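For context, `CUDA_VISIBLE_DEVICES` restricts which GPUs a CUDA process can see, and it is read when CUDA initializes. A minimal sketch of clearing both overrides at the top of the notebook, assuming nothing else in the environment sets them; the helper name `clear_cuda_overrides` is hypothetical and not part of the repository:

```python
import os

# The two environment variables set at the top of WGAN.ipynb.
CUDA_OVERRIDES = ("CUDA_DEVICE_ORDER", "CUDA_VISIBLE_DEVICES")


def clear_cuda_overrides(environ=os.environ):
    """Remove the CUDA overrides and return the values that were removed."""
    removed = {}
    for name in CUDA_OVERRIDES:
        if name in environ:
            removed[name] = environ.pop(name)
    return removed


# The safest place for this call is before the first `import tensorflow`;
# once CUDA has been initialized in the kernel, changing these variables
# no longer affects which devices are visible.
removed = clear_cuda_overrides()
print("removed overrides:", removed)
```

In a notebook that has already run, restarting the kernel before trying this is likely necessary, since the variables only matter at initialization.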

Arthurzhangsheng commented 6 years ago

I tried, but the result is still the same o(╥﹏╥)o. Master Lu (鲁大师) shows: CPU 98%, GPU 0%.

Zardinality commented 6 years ago

@Arthurzhangsheng Sorry, I am not sure what happened. What does it show about device placement in the output log of jupyter notebook? What is the output of nvidia-smi?
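The device placement check can be sketched in a few lines, assuming the TensorFlow 1.x API that this repository targets. `device_lib.list_local_devices()` and `tf.ConfigProto(log_device_placement=True)` are existing TensorFlow 1.x calls; the helper names are hypothetical. The placement log is written by the TensorFlow runtime, so with Jupyter it typically appears in the terminal that started the notebook server rather than in the cell output.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_gpus():
    """List GPUs visible to TensorFlow, or None if TensorFlow is missing."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return gpu_device_names(device_lib.list_local_devices())


def make_logging_session():
    """Create a TF 1.x session that logs the device assigned to each op."""
    import tensorflow as tf  # assumes TensorFlow 1.x, e.g. 1.4.1
    config = tf.ConfigProto(log_device_placement=True)
    return tf.Session(config=config)


if __name__ == "__main__":
    # An empty list means TensorFlow is installed but sees no GPU.
    print("GPUs visible to TensorFlow:", list_local_gpus())
```

If `list_local_gpus()` returns an empty list, the problem lies in the installation or driver setup rather than in the notebook itself.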

Arthurzhangsheng commented 6 years ago

When I run the code in Jupyter Notebook, it only shows the following output:

    Extracting MNIST_data\train-images-idx3-ubyte.gz
    Extracting MNIST_data\train-labels-idx1-ubyte.gz
    Extracting MNIST_data\t10k-images-idx3-ubyte.gz
    Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
    generator/Conv2d_transpose_3/Tanh:0

While the code is running, nvidia-smi shows the following:

    C:\Program Files\NVIDIA Corporation\NVSMI>nvidia-smi
    Thu Feb 22 10:24:27 2018
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 382.05                 Driver Version: 382.05                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 980M   WDDM  | 0000:01:00.0     Off |                  N/A |
    | N/A   36C    P8     6W /  N/A |     32MiB /  4096MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |    0       500   C+G  Insufficient Permissions                   N/A        |
    |    0      4264   C+G  ....1301.0_x648wekyb3d8bbwe\Calculator.exe N/A        |
    |    0      5312   C+G  ...8wekyb3d8bbwe\Microsoft.StickyNotes.exe N/A        |
    |    0     13344   C+G  C:\Program Files (x86)\Hotkey\HkeyTray.exe N/A        |
    +-----------------------------------------------------------------------------+
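A single snapshot can miss short bursts of GPU activity, so polling utilization a few times while the notebook runs may give a clearer picture. The sketch below relies on the `--query-gpu` and `--format=csv` options of `nvidia-smi` and assumes the executable is on `PATH` (in the session above it lives under `C:\Program Files\NVIDIA Corporation\NVSMI`). The function names are hypothetical, and fields the driver does not report are returned as `None`.

```python
import shutil
import subprocess

QUERY = ["--query-gpu=index,utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"]


def _to_int(field):
    """Convert a CSV field to int; unsupported values become None."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_gpu_query(text):
    """Parse 'index, util, mem' lines into (index, util_pct, mem_mib)."""
    rows = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) == 3:
            rows.append(tuple(_to_int(f) for f in fields))
    return rows


def query_gpus(executable="nvidia-smi"):
    """Run nvidia-smi once; return None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    out = subprocess.run([path] + QUERY, stdout=subprocess.PIPE,
                         universal_newlines=True, check=True).stdout
    return parse_gpu_query(out)


if __name__ == "__main__":
    print(query_gpus())
```

If memory usage stays near the idle value (32MiB in the output above) throughout training, no TensorFlow process has allocated memory on the GPU.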

Zardinality commented 6 years ago

@Arthurzhangsheng Sorry, I cannot reproduce your bug since I do not own a Windows PC with a GPU. My test on a Linux machine with a Quadro M5000 and tensorflow-gpu==1.4.1 works just fine. May I confirm one more time: did you install TensorFlow with GPU support? If you installed it with the command pip install tensorflow, it runs on the CPU only by default.
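One way to answer this is to ask the installed package metadata which TensorFlow distribution is present. A minimal sketch, assuming Python 3.8 or newer for `importlib.metadata`; the list of distribution names to probe is an assumption, and `installed_tensorflow` is a hypothetical helper:

```python
from importlib import metadata

# Distribution names to probe; for releases such as 1.4.1 the GPU build
# was published separately as "tensorflow-gpu".
CANDIDATES = ("tensorflow", "tensorflow-gpu")


def installed_tensorflow(candidates=CANDIDATES):
    """Map each installed candidate distribution to its version string."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


print(installed_tensorflow())
```

An empty result means neither distribution is installed in the environment the kernel uses, which can differ from the environment where other GPU code runs.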

Arthurzhangsheng commented 6 years ago

Sorry to bother you again, but I have another question: when does the code finish? It has been running for about an hour and still shows only the following output:

    Extracting MNIST_data\train-images-idx3-ubyte.gz
    Extracting MNIST_data\train-labels-idx1-ubyte.gz
    Extracting MNIST_data\t10k-images-idx3-ubyte.gz
    Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
    generator/Conv2d_transpose_3/Tanh:0

I also can't find any ckpt file in the folder.

Zardinality commented 6 years ago

@Arthurzhangsheng The checkpoints are saved in the newly created ./log_wgan folder, and you can change that location at the beginning of the notebook. If you cannot find any ckpt file, training has not really progressed at all; at the least, it is still looping in the first 99 steps.
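To check whether training has reached its first save, one can look for checkpoint files in the log folder. This sketch assumes the default `./log_wgan` location mentioned above and assumes that the saved file names contain `.ckpt`, as is common with `tf.train.Saver`; the exact prefix used by the notebook is not confirmed here, and `find_checkpoints` is a hypothetical helper:

```python
import glob
import os


def find_checkpoints(log_dir="./log_wgan"):
    """Return sorted paths in log_dir whose file name contains '.ckpt'."""
    if not os.path.isdir(log_dir):
        return []
    pattern = os.path.join(glob.escape(log_dir), "*.ckpt*")
    return sorted(glob.glob(pattern))


files = find_checkpoints()
print(len(files), "checkpoint file(s) found")
```

A count of zero after a long run is consistent with training being stuck in its early steps, for example because it is running on the CPU.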