I am running a Linux machine with a GTX 1070, CUDA 9.0, and the linux414-nvidia 1:390.25-9 package.
An identical model in TensorFlow runs just fine on the GPU:
2018-03-15 15:02:56.178412: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-03-15 15:02:56.291383: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:895] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-03-15 15:02:56.291661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.695
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.60GiB
2018-03-15 15:02:56.291676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
>>> Start test loss: 320.5631351470947
Epoch: 1
> lr update: 0.0497500005
> Train loss: 54.85749292001128
> Valid loss: 7.513679893687367
> Best valid loss so far: 320.5631351470947
> Stopping in (35) epochs if no new minima!
! New local minima found, saving the model...
Figured it out. I had been debugging the model and inserted some padding into the DataLoader class, which I then successfully forgot about :man_facepalming:
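A sanity check along these lines can catch leftover debugging changes such as stray padding. It is only a sketch. The original error is not shown in this thread, so it assumes the padding changed the per-sample shape of the batches. The helper names `shape_of` and `find_unexpected_batches` and the example data are hypothetical. Objects exposing `.shape` (such as tensors) and plain nested lists are both handled.

```python
def shape_of(batch):
    """Return the shape of a tensor-like object or a nested list as a tuple."""
    if hasattr(batch, "shape"):
        return tuple(batch.shape)  # tensors and arrays expose .shape
    dims = []
    while isinstance(batch, (list, tuple)):
        dims.append(len(batch))
        if not batch:
            break
        batch = batch[0]
    return tuple(dims)


def find_unexpected_batches(loader, expected_sample_shape):
    """Return (index, shape) for each batch whose per-sample shape differs.

    The leading (batch) dimension is ignored, since the last batch may be smaller.
    """
    expected = tuple(expected_sample_shape)
    mismatches = []
    for index, batch in enumerate(loader):
        shape = shape_of(batch)
        if shape[1:] != expected:
            mismatches.append((index, shape))
    return mismatches


# Hypothetical data: two batches of 2 samples with 3 features each,
# the second one accidentally padded to 5 features.
clean = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
padded = [[1.0, 2.0, 3.0, 0.0, 0.0], [4.0, 5.0, 6.0, 0.0, 0.0]]
mismatches = find_unexpected_batches([clean, padded], expected_sample_shape=(3,))
print(mismatches)
```

Running a check like this once over the loader before training makes a forgotten transformation show up as a list of offending batch indices rather than as an error deep inside the model.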
Just a few days ago I was able to train and run my model on the GPU. After a recent update I am getting the following error: