Closed ProGamerGov closed 7 years ago
I solved the issue: I ran `pip install tensorflow` by accident instead of `pip install tensorflow-gpu`. Uninstalling the default CPU TensorFlow and installing the GPU version with `pip install tensorflow-gpu` solved the issue.
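A quick way to see which of the two packages is actually installed is to look at the installed distribution names. The sketch below is only an illustration: it assumes the `tensorflow` / `tensorflow-gpu` package split described in this thread, the helper names are made up for the example, and it uses `importlib.metadata`, which needs Python 3.8 or newer (the thread itself ran Python 2.7).

```python
from importlib import metadata  # Python 3.8+; the thread itself ran Python 2.7


def installed_distribution_names():
    """Return normalized names of every distribution visible to this interpreter."""
    names = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        if name:
            names.add(name.lower().replace("_", "-"))
    return names


def classify_tensorflow_install(names):
    """Classify an install as 'gpu', 'cpu', 'both', or 'none' (hypothetical helper)."""
    has_cpu = "tensorflow" in names
    has_gpu = "tensorflow-gpu" in names
    if has_cpu and has_gpu:
        return "both"
    if has_gpu:
        return "gpu"
    if has_cpu:
        return "cpu"
    return "none"


if __name__ == "__main__":
    print(classify_tensorflow_install(installed_distribution_names()))
```

A result of "cpu" or "both" would match the situation described above, where the CPU package had to be uninstalled before the GPU build was picked up.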
New error:
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
---- RENDERING SINGLE IMAGE ----
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5105 (compatibility version 5100). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
ubuntu@ip-Address:~/neural-style-tf$
Neural-Style and plenty of other programs, including those that require TensorFlow, work with my installed CUDA and cuDNN.
@cysmith I tried reinstalling everything, including CUDA and cuDNN, but I get the same error.
I am using an Amazon AMI. The exact same error occurs every time.
Edit: I think cuDNN v5 is the problem.
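The two numbers in the error line (5005 and 5105) can be decoded into ordinary version strings. The sketch below assumes the cuDNN 5.x convention of encoding the version as major * 1000 + minor * 100 + patch; the function names are invented for this example and are not part of TensorFlow or cuDNN.

```python
import re

# Matches the "Loaded runtime CuDNN library" error line quoted in this thread.
MISMATCH_RE = re.compile(
    r"Loaded runtime CuDNN library: (\d+) .*?compiled with (\d+)"
)


def decode_cudnn_version(code):
    """Split an integer such as 5105 into (major, minor, patch).

    Assumes the cuDNN 5.x encoding: major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def parse_cudnn_mismatch(line):
    """Return ((runtime version), (compiled version)) or None if no match."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    runtime = decode_cudnn_version(int(match.group(1)))
    compiled = decode_cudnn_version(int(match.group(2)))
    return runtime, compiled


if __name__ == "__main__":
    error_line = (
        "E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime "
        "CuDNN library: 5005 (compatibility version 5000) but source was "
        "compiled with 5105 (compatibility version 5100)."
    )
    print(parse_cudnn_mismatch(error_line))
```

Read this way, the log says the library loaded at runtime is cuDNN 5.0.5 while the TensorFlow binary was built against 5.1.5, which is consistent with the later fix of installing cuDNN v5.1.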
Hi ProGamerGov,
Hmm. I have a TensorFlow that I built from source a few months ago. Kind of busy this week but I'll look into it as soon as I can.
Cameron
Ok, so following this guide here:
seems to install correctly on Ubuntu 16.04 with Cudnn v5.1 and Cuda 8.0. But Tensorflow still cannot detect the GPU when running a simple python script like:
import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
The output of the script:
ubuntu@ip-Address:~/neural-style-tf$ python tensor_test.py
Device mapping: no known devices.
I tensorflow/core/common_runtime/direct_session.cc:252] Device mapping:
MatMul: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul: /job:localhost/replica:0/task:0/cpu:0
b: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] a: /job:localhost/replica:0/task:0/cpu:0
[[ 22. 28.]
[ 49. 64.]]
ubuntu@ip-Address:~/neural-style-tf$
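When the placement log gets long, it can help to reduce it to a simple "op -> device" table and check whether anything landed on a GPU. This is a rough sketch that parses the two line formats appearing in this thread; the regular expression and helper names are assumptions for illustration and may not cover other TensorFlow versions.

```python
import re

# Matches lines such as
#   "MatMul: /job:localhost/replica:0/task:0/cpu:0"
#   "MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0"
# with or without the "I tensorflow/...cc:NNN] " log prefix.
PLACEMENT_RE = re.compile(
    r"(?P<op>[\w/]+): (?:\(\w+\):?\s*)?/job:\S*/(?P<device>(?:cpu|gpu):\d+)\s*$"
)


def parse_placements(log_text):
    """Return a dict mapping op name to its device, e.g. {'MatMul': 'cpu:0'}."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT_RE.search(line)
        if match:
            placements[match.group("op")] = match.group("device")
    return placements


def uses_gpu(placements):
    """True if at least one op was placed on a GPU device."""
    return any(device.startswith("gpu") for device in placements.values())


if __name__ == "__main__":
    sample = (
        "Device mapping: no known devices.\n"
        "MatMul: /job:localhost/replica:0/task:0/cpu:0\n"
        "b: /job:localhost/replica:0/task:0/cpu:0\n"
        "a: /job:localhost/replica:0/task:0/cpu:0\n"
    )
    table = parse_placements(sample)
    print(table, "GPU in use:", uses_gpu(table))
```

For the output above, every op maps to `cpu:0`, so the check reports that no GPU is in use.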
This issue here: https://github.com/tensorflow/tensorflow/issues/1066 seems to describe a compiler issue that might play a role, or maybe it's something like a bad install?
Checking my g++ version:
ubuntu@ip-Address:~$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
ubuntu@ip-Address:~$
And gcc version:
ubuntu@ip-Address:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
ubuntu@ip-Address:~$
Not sure if the issue I linked was related.
Output of nvidia-smi:
ubuntu@ip-Address:~$ nvidia-smi
Thu Jan 5 05:34:41 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57 Driver Version: 367.57 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 0000:00:1E.0 Off | 0 |
| N/A 23C P0 72W / 149W | 0MiB / 11439MiB | 100% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
ubuntu@ip-Address:~$
The output of nvcc --version:
ubuntu@ip-Address:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
ubuntu@ip-Address:~$
Exact Ubuntu Version:
Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-38-generic x86_64)
My output of nvcc --version is more recent.
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
I don't recall ever installing Cuda 7.5 on my current AMI, and even if I somehow accidentally installed it, shouldn't it have been purged when I purged everything Nvidia to start over again?
I guess I need to start from scratch to try again? Is installing everything on Ubuntu 16.04 still as painful as it was when CUDA 8.0RC was the latest version back in July-October of 2016? Because I had to use various g++ and gcc compiler tricks so that stuff would even compile.
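To compare the toolkit release that `nvcc` reports against the one you expect, the "release X.Y" field of its output can be parsed directly. The following is a small sketch with made-up helper names; it only reads the text that `nvcc --version` prints and returns None if `nvcc` is not on the PATH.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from 'nvcc --version' text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_release_on_path():
    """Run the nvcc found on PATH and return its release, or None if missing."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = nvcc_release_on_path()
    if release is None:
        print("nvcc not found on PATH")
    elif release < (8, 0):
        print("nvcc reports CUDA %d.%d, older than 8.0" % release)
    else:
        print("nvcc reports CUDA %d.%d" % release)
```

Applied to the two outputs quoted above, this gives (7, 5) for the AMI and (8, 0) for the other machine.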
Yeah I would purge and get CUDA 8.0. I don't know but I understand your pain. Let me know if that is the issue.
So I started with a fresh install of Ubuntu 16.04 LTS on AWS and followed the guide here: https://alliseesolutions.wordpress.com/2016/09/08/install-gpu-tensorflow-from-sources-w-ubuntu-16-04-and-cuda-8-0-rc/
I used the latest cuDNN v5.1 and CUDA 8.0, and TensorFlow seems to work properly now.
sudo apt-get remove --purge nvidia-*
https://alliseesolutions.wordpress.com/2016/09/08/install-gpu-tensorflow-from-sources-w-ubuntu-16-04-and-cuda-8-0-rc/
Installation Instructions:
`sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb`
`sudo apt-get update`
`sudo apt-get install cuda`
sudo tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
source ~/.bashrc
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install bazel
sudo apt-get upgrade bazel
sudo reboot
cd ~
git clone https://github.com/tensorflow/tensorflow
cd ~/tensorflow
./configure
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow
sudo pip install /tmp/tensorflow_pkg/tensorflow-0.12.1-cp27-cp27mu-linux_x86_64.whl
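After copying the cuDNN files, it may be worth confirming which version the installed header declares before building. This sketch assumes the cuDNN 5.x header layout, where `cudnn.h` defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` (other releases may keep these macros in a different file), and it uses the `/usr/local/cuda/include/cudnn.h` path from the commands above. The function name is invented for the example.

```python
import re

# Path taken from the copy commands in this thread.
CUDNN_HEADER = "/usr/local/cuda/include/cudnn.h"


def read_cudnn_header_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if not found."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    try:
        with open(CUDNN_HEADER) as handle:
            print(read_cudnn_header_version(handle.read()))
    except IOError:
        print("cudnn.h not found at", CUDNN_HEADER)
```

If the tuple printed here is lower than the version TensorFlow was compiled against, the runtime mismatch error from earlier in the thread is the likely result.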
Then I ran the script: https://gist.github.com/ProGamerGov/b1550e5f6b7bf6e032378597262b88d8
ubuntu@ip-Address:~$ python tensor_test.py
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0
I tensorflow/core/common_runtime/direct_session.cc:256] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0
MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] a: (Const)/job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
ubuntu@ip-Address:~$
And now Tensorflow sees the GPU instead of just the CPU!
I think it should be working now?
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
---- RENDERING SINGLE IMAGE ----
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:628: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
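The deprecation warnings above name the replacements (`tf.stack` for `pack`, `tf.global_variables_initializer` for `initialize_all_variables`). One possible way to keep a script working across both old and new TensorFlow releases is to pick whichever name exists at import time. The helper below is a hypothetical sketch, not code from neural_style.py; the stand-in objects exist only so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def first_available(module, *names):
    """Return the first attribute of `module` that exists among `names`."""
    for name in names:
        attr = getattr(module, name, None)
        if attr is not None:
            return attr
    raise AttributeError("none of %r found" % (names,))


# With TensorFlow installed, usage would look roughly like:
#   import tensorflow as tf
#   stack = first_available(tf, "stack", "pack")
#   init_op = first_available(
#       tf, "global_variables_initializer", "initialize_all_variables")()

if __name__ == "__main__":
    # Stand-ins for an old and a new TensorFlow module (illustration only).
    old_tf = SimpleNamespace(pack=lambda values: ("pack", values))
    new_tf = SimpleNamespace(stack=lambda values: ("stack", values))
    print(first_available(old_tf, "stack", "pack")([1, 2]))
    print(first_available(new_tf, "stack", "pack")([1, 2]))
```

Listing the new name first means the deprecated one is only used as a fallback.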
Though nvidia-smi shows the same usage values regardless of settings:
ubuntu@ip-Address:~$ nvidia-smi
Thu Jan 5 23:59:42 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57 Driver Version: 367.57 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 0000:00:1E.0 Off | 0 |
| N/A 46C P0 145W / 149W | 10941MiB / 11439MiB | 98% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2440 C python 10937MiB |
+-----------------------------------------------------------------------------+
ubuntu@ip-Address:~$
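As far as I know, TensorFlow by default reserves nearly all GPU memory when a session is created (the `allow_growth` field of `tf.GPUOptions` changes that), so a constant memory figure in nvidia-smi would be expected regardless of image size. The sketch below just extracts the used/total numbers from an nvidia-smi row so they can be compared between runs; the helper name is made up for this example.

```python
import re

MEMORY_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def parse_gpu_memory(smi_text):
    """Return (used_mib, total_mib, fraction_used) from nvidia-smi text."""
    match = MEMORY_RE.search(smi_text)
    if match is None:
        return None
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, used / float(total)


if __name__ == "__main__":
    row = "| N/A 46C P0 145W / 149W | 10941MiB / 11439MiB | 98% Default |"
    used, total, fraction = parse_gpu_memory(row)
    print("%d of %d MiB in use (%.1f%%)" % (used, total, fraction * 100))
```

For the row above, roughly 96% of the card's memory is held by the Python process.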
Edit: Never mind, it seems to work:
Single image elapsed time: 91.7761621475
But the `--print_iterations 50` parameter does nothing. And it would be nice to have an equivalent of Neural-Style's `-save_iter` command. Also, what is the equivalent of `-normalize_gradients`?
Hi,
You need to include the `--verbose` flag for printing iterations. I agree but unfortunately, I cannot have the `--save_iter` functionality with LBFGS so I chose to remove it.
Cam
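To illustrate why `--print_iterations` can appear to do nothing on its own, here is a minimal sketch of two flags where the reporting interval only takes effect when verbose output is enabled. This is not the actual neural_style.py code; the parser, the default of 50, and the `should_report` helper are assumptions made for the example.

```python
import argparse


def build_parser():
    """Hypothetical subset of the script's command-line flags."""
    parser = argparse.ArgumentParser(description="illustrative flags only")
    parser.add_argument("--verbose", action="store_true",
                        help="enable progress output")
    parser.add_argument("--print_iterations", type=int, default=50,
                        help="report every N iterations (assumed default)")
    return parser


def should_report(iteration, args):
    """Report only when verbose is on and the iteration hits the interval."""
    return args.verbose and iteration % args.print_iterations == 0


if __name__ == "__main__":
    quiet = build_parser().parse_args(["--print_iterations", "50"])
    loud = build_parser().parse_args(["--print_iterations", "50", "--verbose"])
    print(should_report(50, quiet), should_report(50, loud))
```

With this kind of gating, passing the interval without `--verbose` is silently ignored, which matches the behaviour reported above.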
I was just about to say that `--max_size` wasn't working either, but `--verbose` seems to have fixed that.
I don't think any of the commands are working. Even manually changing the neural_style.py seemed to have no effect.
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
---- RENDERING SINGLE IMAGE ----
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
BUILDING VGG-19 NETWORK
loading model weights...
constructing layers...
LAYER GROUP 1
--conv1_1 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 3, 64)
--relu1_1 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--conv1_2 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 64, 64)
--relu1_2 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--pool1 | shape=(1, 200, 200, 64)
LAYER GROUP 2
--conv2_1 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 64, 128)
--relu2_1 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--conv2_2 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 128, 128)
--relu2_2 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--pool2 | shape=(1, 100, 100, 128)
LAYER GROUP 3
--conv3_1 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 128, 256)
--relu3_1 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_2 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_2 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_3 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_3 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_4 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_4 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--pool3 | shape=(1, 50, 50, 256)
LAYER GROUP 4
--conv4_1 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 256, 512)
--relu4_1 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_2 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_2 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_3 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_3 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_4 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_4 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--pool4 | shape=(1, 25, 25, 512)
LAYER GROUP 5
--conv5_1 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_1 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_2 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_2 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_3 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_3 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_4 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_4 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--pool5 | shape=(1, 13, 13, 512)
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
MINIMIZING LOSS USING: ADAM OPTIMIZER
WARNING:tensorflow:From neural_style.py:628: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
At iterate 0 f= 5.04182E+10
At iterate 50 f= 8.23411E+09
At iterate 100 f= 6.94835E+09
At iterate 150 f= 6.53249E+09
At iterate 200 f= 6.31361E+09
At iterate 250 f= 6.16693E+09
At iterate 300 f= 6.06346E+09
At iterate 350 f= 5.98807E+09
At iterate 400 f= 5.93253E+09
At iterate 450 f= 5.88094E+09
At iterate 500 f= 5.84276E+09
At iterate 550 f= 5.81210E+09
At iterate 600 f= 5.78642E+09
At iterate 650 f= 5.77015E+09
At iterate 700 f= 5.75785E+09
At iterate 750 f= 5.77885E+09
At iterate 800 f= 5.77774E+09
At iterate 850 f= 5.73198E+09
At iterate 900 f= 5.70126E+09
At iterate 950 f= 5.69114E+09
Single image elapsed time: 329.488039017
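The "At iterate" lines in the log above can be parsed to see how the loss evolves. The sketch below is only an illustration with invented helper names; it reads the format printed in this log and lists the iterations where the loss went up compared with the previous report, which can happen with Adam since it does not guarantee a monotonic decrease.

```python
import re

ITERATE_RE = re.compile(r"At iterate\s+(\d+)\s+f=\s*([0-9.]+E[+-]\d+)")


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in the log text."""
    return [(int(i), float(f)) for i, f in ITERATE_RE.findall(log_text)]


def iterations_where_loss_rose(history):
    """Return iterations whose loss is higher than the previous report."""
    rose = []
    for (_, previous), (iteration, current) in zip(history, history[1:]):
        if current > previous:
            rose.append(iteration)
    return rose


if __name__ == "__main__":
    # Excerpt copied from the log in this thread.
    excerpt = """
    At iterate 650 f= 5.77015E+09
    At iterate 700 f= 5.75785E+09
    At iterate 750 f= 5.77885E+09
    At iterate 800 f= 5.77774E+09
    At iterate 850 f= 5.73198E+09
    """
    history = parse_losses(excerpt)
    print(iterations_where_loss_rose(history))
```

On this excerpt the only increase is at iteration 750, after which the loss resumes falling.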
Ok. You are using a more recent Tensorflow (0.12). How did you manually change the values? I don't understand what you mean because Adam is not a default and masking is not a default.
Here's the output when using L-BFGS:
ubuntu@ip-Address:~/neural-style-tf$ python neural_style.py --content_img content.jpg --style_imgs style.jpg --verbose
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
---- RENDERING SINGLE IMAGE ----
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
BUILDING VGG-19 NETWORK
loading model weights...
constructing layers...
LAYER GROUP 1
--conv1_1 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 3, 64)
--relu1_1 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--conv1_2 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 64, 64)
--relu1_2 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--pool1 | shape=(1, 200, 200, 64)
LAYER GROUP 2
--conv2_1 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 64, 128)
--relu2_1 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--conv2_2 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 128, 128)
--relu2_2 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--pool2 | shape=(1, 100, 100, 128)
LAYER GROUP 3
--conv3_1 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 128, 256)
--relu3_1 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_2 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_2 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_3 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_3 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_4 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_4 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--pool3 | shape=(1, 50, 50, 256)
LAYER GROUP 4
--conv4_1 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 256, 512)
--relu4_1 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_2 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_2 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_3 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_3 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_4 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_4 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--pool4 | shape=(1, 25, 25, 512)
LAYER GROUP 5
--conv5_1 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_1 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_2 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_2 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_3 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_3 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_4 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_4 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--pool5 | shape=(1, 13, 13, 512)
MINIMIZING LOSS USING: L-BFGS OPTIMIZER
WARNING:tensorflow:From neural_style.py:620: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 480000 M = 10
This problem is unconstrained.
At X0 0 variables are exactly at the bounds
At iterate 0 f= 7.85852D+11 |proj g|= 3.89032D+06
The output above is missing the warning messages that appear when using ADAM:
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
These warnings say that tf.pack is deprecated and will be (or already has been) removed?
Lines 383-384 of neural_style.py are:
mask = tf.pack(tensors, axis=2)
mask = tf.pack(mask, axis=0)
They are part of the following function:
def mask_style_layer(a, x, mask_img):
    _, h, w, d = a.get_shape()
    mask = get_mask_image(mask_img, w.value, h.value)
    mask = tf.convert_to_tensor(mask)
    tensors = []
    for _ in range(d.value):
        tensors.append(mask)
    mask = tf.pack(tensors, axis=2)  # line 383
    mask = tf.pack(mask, axis=0)     # line 384
    mask = tf.expand_dims(mask, 0)
    a = tf.mul(a, mask)
    x = tf.mul(x, mask)
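For anyone hitting the same warnings, here is a minimal sketch of a rename pass over the script's source text. The tf.pack and tf.initialize_all_variables replacements are the ones named in the warnings above. The tf.mul to tf.multiply rename is the TensorFlow 1.0 rename and does not appear in this log, so treat it as an assumption. The helper name `modernize` is made up for illustration, and this is not necessarily what the linked pull request does.

```python
import re

# Deprecated TensorFlow name -> replacement.
# tf.pack and tf.initialize_all_variables come from the warnings in the log;
# tf.mul -> tf.multiply is assumed from the TensorFlow 1.0 renames.
RENAMES = {
    "tf.pack": "tf.stack",
    "tf.mul": "tf.multiply",
    "tf.initialize_all_variables": "tf.global_variables_initializer",
}

# Match a deprecated name only when it is directly followed by "(",
# so already-updated names such as tf.multiply are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMES) + r")(?=\s*\()"
)


def modernize(source):
    """Return `source` with the deprecated calls renamed (hypothetical helper)."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)


if __name__ == "__main__":
    old = "mask = tf.pack(tensors, axis=2)\na = tf.mul(a, mask)\n"
    print(modernize(old))
```

A text-level rename only silences the deprecation warnings. It does not by itself show that the masked loss is computed correctly, so the loss values still need to be checked after the change.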
Both ADAM and L-BFGS have the issue, so it looks related to the masked style transfer feature. We can also see that something is wrong from the loss values of a run using both ADAM and masked style transfer, which flatten out near 5.7E+09 and even rise at iterate 750:
At iterate 0 f= 5.04182E+10
At iterate 50 f= 8.23411E+09
At iterate 100 f= 6.94835E+09
At iterate 150 f= 6.53249E+09
At iterate 200 f= 6.31361E+09
At iterate 250 f= 6.16693E+09
At iterate 300 f= 6.06346E+09
At iterate 350 f= 5.98807E+09
At iterate 400 f= 5.93253E+09
At iterate 450 f= 5.88094E+09
At iterate 500 f= 5.84276E+09
At iterate 550 f= 5.81210E+09
At iterate 600 f= 5.78642E+09
At iterate 650 f= 5.77015E+09
At iterate 700 f= 5.75785E+09
At iterate 750 f= 5.77885E+09
At iterate 800 f= 5.77774E+09
At iterate 850 f= 5.73198E+09
At iterate 900 f= 5.70126E+09
At iterate 950 f= 5.69114E+09
Without the --style_mask and --style_mask_imgs options, the loss falls steadily, as expected:
At iterate 0 f= 4.22601E+11
At iterate 50 f= 3.43817E+10
At iterate 100 f= 2.53430E+10
At iterate 150 f= 2.26230E+10
At iterate 200 f= 2.12251E+10
At iterate 250 f= 2.03405E+10
At iterate 300 f= 1.97152E+10
At iterate 350 f= 1.92445E+10
At iterate 400 f= 1.88709E+10
At iterate 450 f= 1.85676E+10
At iterate 500 f= 1.83171E+10
At iterate 550 f= 1.81063E+10
At iterate 600 f= 1.79274E+10
At iterate 650 f= 1.77726E+10
At iterate 700 f= 1.76376E+10
At iterate 750 f= 1.75175E+10
At iterate 800 f= 1.74115E+10
At iterate 850 f= 1.73161E+10
At iterate 900 f= 1.72311E+10
At iterate 950 f= 1.71539E+10
Single image elapsed time: 325.5970788
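To put numbers on the comparison, here is a small sketch that takes the loss values from the two logs above and reports how much each run reduced its loss and at which logged iterations the loss went up. The function names are made up for illustration. The two runs optimize different objectives (with and without the mask), so only the relative behaviour is compared, not the absolute values.

```python
# Loss values copied from the two logs above, logged every 50 iterations (0..950).
masked = [
    5.04182e10, 8.23411e09, 6.94835e09, 6.53249e09, 6.31361e09,
    6.16693e09, 6.06346e09, 5.98807e09, 5.93253e09, 5.88094e09,
    5.84276e09, 5.81210e09, 5.78642e09, 5.77015e09, 5.75785e09,
    5.77885e09, 5.77774e09, 5.73198e09, 5.70126e09, 5.69114e09,
]
unmasked = [
    4.22601e11, 3.43817e10, 2.53430e10, 2.26230e10, 2.12251e10,
    2.03405e10, 1.97152e10, 1.92445e10, 1.88709e10, 1.85676e10,
    1.83171e10, 1.81063e10, 1.79274e10, 1.77726e10, 1.76376e10,
    1.75175e10, 1.74115e10, 1.73161e10, 1.72311e10, 1.71539e10,
]


def reduction_factor(losses):
    """Ratio of the first logged loss to the last logged loss."""
    return losses[0] / losses[-1]


def rising_iterations(losses, step=50):
    """Logged iterations whose loss is higher than at the previous logged iteration."""
    return [i * step for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


if __name__ == "__main__":
    for name, losses in (("masked", masked), ("unmasked", unmasked)):
        print(name, "reduction: %.1fx" % reduction_factor(losses),
              "rises at:", rising_iterations(losses))
```

With these values the masked run reduces its loss by a factor of roughly 9 and rises once, at iterate 750. The unmasked run reduces its loss by a factor of roughly 25 and never rises between logged iterations.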
I believe I fixed the issue in the pull request here: https://github.com/cysmith/neural-style-tf/pull/19