Open jamiis opened 8 years ago
You're simply running out of memory. You see, the Titan X has 12 GB of physical memory, split across 2 logical devices of 6 GB each. If you run nvidia-smi you will see 2 graphics cards, each with 6 GB of total memory. So when you specify gpu 0, you are using just 6 GB. On the other hand, with image_size 512, each batch element requires around 2.2 GB of GPU memory: about 2.2 GB for b=1, 4.4 GB for b=2, and so on.
I hope this helps.
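To make the estimate above concrete, here is a minimal sketch of the arithmetic. The 2.2 GB per batch element figure comes from the comment; the assumption that memory grows linearly with batch size (with no fixed overhead) and the function names `max_batch_size` and `estimated_usage_gb` are illustrative only, not part of any tool discussed here.

```python
import math


def estimated_usage_gb(batch_size: int, per_element_gb: float = 2.2) -> float:
    """Rough GPU memory estimate, assuming usage scales linearly with batch size."""
    return batch_size * per_element_gb


def max_batch_size(available_gb: float, per_element_gb: float = 2.2) -> int:
    """Largest batch size whose estimated usage fits in the available memory."""
    if per_element_gb <= 0:
        raise ValueError("per_element_gb must be positive")
    return max(0, math.floor(available_gb / per_element_gb))


if __name__ == "__main__":
    # Compare a 6 GB budget with a 12 GB budget (both figures appear in the thread).
    for budget_gb in (6.0, 12.0):
        b = max_batch_size(budget_gb)
        print(f"{budget_gb:.0f} GB budget: batch size {b}, "
              f"about {estimated_usage_gb(b):.1f} GB used")
```

Real usage will differ from this estimate, since frameworks allocate workspace and cache memory on top of the per-element cost, so treat the result as an upper bound to try first rather than a guarantee.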
That definitely helps, thanks! But I have 2 Titan X's in my workstation, and nvidia-smi shows each as having 12 GB available:
$ nvidia-smi
Tue Aug 9 11:41:32 2016
+------------------------------------------------------+
| NVIDIA-SMI 361.42     Driver Version: 361.42         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 0000:6D:00.0     Off |                  N/A |
| 22%   32C    P8    14W / 250W |     24MiB / 12287MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 0000:6E:00.0     Off |                  N/A |
| 22%   43C    P8    31W / 250W |    911MiB / 12285MiB |     29%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1      4207    G   /usr/lib/xorg/Xorg                             332MiB |
|    1      5307    G   compiz                                         461MiB |
|    1     10126    G   ...inFlow --disable-features=DocumentWriteEv    87MiB |
+-----------------------------------------------------------------------------+
(Typically the second GPU is also sitting at 0% utilization.)
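For checking per-device memory without reading the table by eye, the following is a rough sketch that parses nvidia-smi's CSV query output. It assumes nvidia-smi is on the PATH and supports `--query-gpu` with `--format=csv,noheader,nounits`; the helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example, and the sample string simply reuses the totals and usage figures shown in the output above.

```python
import subprocess

# Ask nvidia-smi for one CSV line per GPU: index, total MiB, used MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> list:
    """Turn lines like '0, 12287, 24' into dicts with total, used and free MiB."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, total, used = (int(field.strip()) for field in line.split(","))
        gpus.append({
            "index": index,
            "total_mib": total,
            "used_mib": used,
            "free_mib": total - used,
        })
    return gpus


def query_gpu_memory() -> list:
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    # Sample text built from the figures in the nvidia-smi output above.
    sample = "0, 12287, 24\n1, 12285, 911\n"
    for gpu in parse_gpu_memory(sample):
        print(f"GPU {gpu['index']}: {gpu['free_mib']} MiB free "
              f"of {gpu['total_mib']} MiB")
```

On a machine with the driver installed, calling `query_gpu_memory()` instead of parsing the sample would report the live values.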
Try a smaller batch size, say 2, and see how much memory it takes.
If I remove --image_size 512 then I no longer get this error. Running on a Titan X (12 GB of video memory). It says that I don't have cuDNN installed, but I do: