Has anyone successfully used a single GPU with Docker?

Using the Dockerfile in the scripts folder, I managed to start train1.py:

[1010 11:14:00 @base.py:158] Setup callbacks graph ...
[1010 11:14:00 @summary.py:34] Maintain moving average summary of 0 tensors.
[1010 11:14:02 @base.py:174] Creating the session ...
2019-10-10 11:14:02.528831: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-10 11:14:02.660072: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-10-10 11:14:02.661434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:01:00.0
totalMemory: 11.91GiB freeMemory: 11.29GiB
2019-10-10 11:14:02.661929: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-10-10 11:14:03.584727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-10-10 11:14:03.584761: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2019-10-10 11:14:03.584779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2019-10-10 11:14:03.585635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 12072 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:01:00.0, compute capability: 6.1)
[1010 11:14:04 @base.py:182] Initializing the session ...
[1010 11:14:04 @base.py:189] Graph Finalized.
2019-10-10 11:14:06.635234: W tensorflow/core/kernels/queue_base.cc:285] _0_QueueInput/input_queue: Skipping cancelled dequeue attempt with queue not closed
[1010 11:14:06 @concurrency.py:36] Starting EnqueueThread QueueInput/input_queue ...
[1010 11:14:06 @graph.py:70] Running Op sync_variables_from_main_tower ...
[1010 11:14:07 @base.py:209] Start Epoch 1 ...
 12%|########2                                                            |12/100[00:32<01:59, 0.73it/s]

However, as the progress bar shows, training is extremely slow (about 0.73 it/s). Even though my GPU is recognised and memory is allocated on it, nvidia-smi reports 0% GPU utilisation:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.14       Driver Version: 430.14       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 00000000:01:00.0  On |                  N/A |
| 26%   49C    P2    55W / 250W |   1884MiB / 12194MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
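
A single nvidia-smi snapshot can land between kernel launches, so sampling utilisation over several seconds gives a fairer picture. Below is a minimal sketch of how that could be done, assuming nvidia-smi is on the PATH inside the container and that the first listed GPU is the one in use. The function names `parse_gpu_sample` and `sample_gpu` are hypothetical helpers, not part of this repository.

```python
import subprocess
import time

# Standard nvidia-smi query flags: GPU utilisation (%) and used memory (MiB),
# printed as bare CSV with no header and no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_sample(line):
    """Parse one CSV line such as '0, 1884' into (util_percent, memory_mib).

    Assumes both fields are numeric; some GPUs report '[N/A]' instead.
    """
    util, mem = (field.strip() for field in line.split(","))
    return int(util), int(mem)


def sample_gpu(n_samples=10, interval_s=1.0):
    """Poll nvidia-smi n_samples times and return a list of (util, mem) tuples."""
    samples = []
    for _ in range(n_samples):
        out = subprocess.check_output(QUERY, universal_newlines=True)
        # nvidia-smi prints one line per GPU; only the first GPU is read here.
        samples.append(parse_gpu_sample(out.strip().splitlines()[0]))
        time.sleep(interval_s)
    return samples
```

Calling `sample_gpu()` while train1.py is running returns ten readings taken one second apart. If every reading stays at or near 0%, the GPU really is idle most of the time and the bottleneck is probably elsewhere, for example in the input pipeline.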

Any ideas what could be causing this?

andabi / deep-voice-conversion

Has anyone successfully used a single GPU with Docker? #119