Error when trying to use GPU

ProGamerGov commented 7 years ago


---- RENDERING SINGLE IMAGE ----

Traceback (most recent call last):
  File "neural_style.py", line 876, in <module>
    main()
  File "neural_style.py", line 873, in main
    else: render_single_image()
  File "neural_style.py", line 842, in render_single_image
    stylize(content_img, style_imgs, init_img)
  File "neural_style.py", line 574, in stylize
    L_style = sum_masked_style_losses(sess, net, style_imgs)
  File "neural_style.py", line 395, in sum_masked_style_losses
    sess.run(net['input'].assign(img))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 717, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 915, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 965, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 985, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'Variable': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
Colocation Debug Info:
Colocation group had the following types and devices:
Identity: CPU
Assign: CPU
Variable: CPU
         [[Node: Variable = Variable[container="", dtype=DT_FLOAT, shape=[1,400,400,3], shared_name="", _device="/device:GPU:0"]()]]

Caused by op u'Variable', defined at:
  File "neural_style.py", line 876, in <module>
    main()
  File "neural_style.py", line 873, in main
    else: render_single_image()
  File "neural_style.py", line 842, in render_single_image
    stylize(content_img, style_imgs, init_img)
  File "neural_style.py", line 570, in stylize
    net = build_vgg19(content_img)
  File "neural_style.py", line 245, in build_vgg19
    net['input']   = tf.Variable(np.zeros((1, h, w, d), dtype=np.float32))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 215, in __init__
    dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 300, in _init_from_args
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 146, in variable_op
    container=container, shared_name=shared_name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 490, in _variable
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'Variable': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
Colocation Debug Info:
Colocation group had the following types and devices:
Identity: CPU
Assign: CPU
Variable: CPU
         [[Node: Variable = Variable[container="", dtype=DT_FLOAT, shape=[1,400,400,3], shared_name="", _device="/device:GPU:0"]()]]

ubuntu@ip-Address:~/neural-style-tf$

ProGamerGov commented 7 years ago

I solved the issue:

I accidentally ran pip install tensorflow instead of pip install tensorflow-gpu. Uninstalling the default CPU-only TensorFlow and installing the GPU version with pip install tensorflow-gpu solved the issue.
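For anyone else hitting the "no devices matching that specification" error, here is a rough sketch of how to check which of the two pip packages is actually installed. It assumes Python 3.8+ (for importlib.metadata), which is newer than the Python 2.7 setup in this thread, and it assumes the separate tensorflow / tensorflow-gpu packaging used at the time. The helper name installed_tensorflow_packages is made up for illustration.

```python
from importlib import metadata


def installed_tensorflow_packages(candidates=("tensorflow", "tensorflow-gpu")):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_tensorflow_packages().items():
        print("%s: %s" % (name, version or "not installed"))
```

If only the plain tensorflow entry shows a version, that would be consistent with the CPU-only install described above.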

ProGamerGov commented 7 years ago

New error:

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally

---- RENDERING SINGLE IMAGE ----

I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5105 (compatibility version 5100).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)

ProGamerGov commented 7 years ago

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally

---- RENDERING SINGLE IMAGE ----

I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5105 (compatibility version 5100).  If using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
ubuntu@ip-Address:~/neural-style-tf$

Neural-Style and plenty of other programs, including ones that require TensorFlow, work with my installed CUDA and cuDNN.

ProGamerGov commented 7 years ago

@cysmith I tried reinstalling everything, including CUDA and cuDNN, but I get the same error.

I am using an Amazon AMI. The exact same error occurs every time.

Edit: I think cuDNN v5 is the problem.
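To make the numbers in the error line easier to read: cuDNN releases of this generation pack their version as major * 1000 + minor * 100 + patch, so 5005 is v5.0.5 and 5105 is v5.1.5. The sketch below decodes them. The function names are hypothetical, and the compatibility check is only a simplified stand-in that mirrors the "compatibility version" values printed in the log (patch digits zeroed), not TensorFlow's actual check.

```python
def decode_cudnn_version(code):
    """Split a packed cuDNN version integer into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compat_version(code):
    """Zero the patch digits, matching the 'compatibility version' in the log."""
    return code - code % 100


def looks_compatible(loaded, compiled):
    # Assumption: major.minor must match, as the log message suggests.
    return compat_version(loaded) == compat_version(compiled)


if __name__ == "__main__":
    loaded, compiled = 5005, 5105  # values taken from the error message
    print("loaded:   v%d.%d.%d" % decode_cudnn_version(loaded))
    print("compiled: v%d.%d.%d" % decode_cudnn_version(compiled))
    print("compatible:", looks_compatible(loaded, compiled))
```

Read this way, the installed runtime is cuDNN v5.0.x while the TensorFlow binary expects v5.1.x.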

cysmith commented 7 years ago

Hi ProGamerGov,

Hmm. I have a TensorFlow that I built from source a few months ago. Kind of busy this week but I'll look into it as soon as I can.

Cameron

ProGamerGov commented 7 years ago

Ok, so following this guide here:

https://alliseesolutions.wordpress.com/2016/09/08/install-gpu-tensorflow-from-sources-w-ubuntu-16-04-and-cuda-8-0-rc/

seems to install correctly on Ubuntu 16.04 with cuDNN v5.1 and CUDA 8.0. But TensorFlow still cannot detect the GPU when running a simple Python script like:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

The output of the script:

ubuntu@ip-Address:~/neural-style-tf$ python tensor_test.py
Device mapping: no known devices.
I tensorflow/core/common_runtime/direct_session.cc:252] Device mapping:
MatMul: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] MatMul: /job:localhost/replica:0/task:0/cpu:0
b: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:819] a: /job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]
ubuntu@ip-Address:~/neural-style-tf$

This issue: https://github.com/tensorflow/tensorflow/issues/1066 seems to describe a compiler problem that might play a role, or maybe it's something like a bad install?

Checking my g++ version:

ubuntu@ip-Address:~$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
ubuntu@ip-Address:~$

And gcc version:

ubuntu@ip-Address:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
ubuntu@ip-Address:~$

Not sure if the issue I linked was related.

Output of nvidia-smi:

ubuntu@ip-Address:~$ nvidia-smi
Thu Jan  5 05:34:41 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   23C    P0    72W / 149W |      0MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
ubuntu@ip-Address:~$

The output of nvcc --version:

ubuntu@ip-Address:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
ubuntu@ip-Address:~$

Exact Ubuntu Version:

Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-38-generic x86_64)

cysmith commented 7 years ago

My output of nvcc --version is more recent.

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
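A small sketch for comparing the two nvcc outputs above programmatically. It only looks at the "release X.Y" part of the text. The function names are hypothetical, and installed_cuda_release assumes nvcc is on the PATH.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Return (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run nvcc and parse its release; assumes nvcc is on the PATH."""
    text = subprocess.check_output(["nvcc", "--version"]).decode("utf-8")
    return parse_nvcc_release(text)


if __name__ == "__main__":
    mine = "Cuda compilation tools, release 7.5, V7.5.17"
    yours = "Cuda compilation tools, release 8.0, V8.0.26"
    print(parse_nvcc_release(mine), parse_nvcc_release(yours))
```

Tuples compare in order, so parse_nvcc_release(mine) < parse_nvcc_release(yours) is enough to spot the older toolkit.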

ProGamerGov commented 7 years ago

I don't recall ever installing Cuda 7.5 on my current AMI, and even if I somehow accidentally installed it, shouldn't it have been purged when I purged everything Nvidia to start over again?

I guess I need to start from scratch to try again? Is installing everything on Ubuntu 16.04 still as painful as it was when CUDA 8.0RC was the latest version back in July-October of 2016? Because I had to use various g++ and gcc compiler tricks so that stuff would even compile.

cysmith commented 7 years ago

Yeah, I would purge and get CUDA 8.0. I don't know whether it's still as painful, but I understand your pain. Let me know if that is the issue.

ProGamerGov commented 7 years ago

So I started with a fresh install of Ubuntu 16.04 LTS on AWS and followed the guide here: https://alliseesolutions.wordpress.com/2016/09/08/install-gpu-tensorflow-from-sources-w-ubuntu-16-04-and-cuda-8-0-rc/

I used the latest cuDNN v5.1 and CUDA 8.0, and TensorFlow seems to work properly now.

sudo apt-get remove --purge nvidia-*

Installation Instructions:
`sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb`
`sudo apt-get update`
`sudo apt-get install cuda`

sudo tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz

sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

source ~/.bashrc

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install bazel
sudo apt-get upgrade bazel

sudo reboot

cd ~
git clone https://github.com/tensorflow/tensorflow

cd ~/tensorflow
./configure

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

sudo pip install /tmp/tensorflow_pkg/tensorflow

sudo pip install /tmp/tensorflow_pkg/tensorflow-0.12.1-cp27-cp27mu-linux_x86_64.whl

Then I ran the script: https://gist.github.com/ProGamerGov/b1550e5f6b7bf6e032378597262b88d8

ubuntu@ip-Address:~$ python tensor_test.py
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0
I tensorflow/core/common_runtime/direct_session.cc:256] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0

MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] a: (Const)/job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
ubuntu@ip-Address:~$

And now Tensorflow sees the GPU instead of just the CPU!
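For comparing the two test runs without reading the logs by eye, here is a rough sketch that pulls the op placements out of log_device_placement output. The regex is an assumption based only on the two log formats pasted in this thread (with and without the "(OpType)" part), and the function names are hypothetical.

```python
import re

# Matches placement lines such as
#   "MatMul: /job:localhost/replica:0/task:0/cpu:0"
#   "MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0"
# Lines prefixed with "I tensorflow/..." are skipped because they do not
# start with "<name>:".
PLACEMENT = re.compile(r"^(\w+): (?:\(\w+\): )?/job:\S+/(cpu|gpu):(\d+)$")


def parse_placements(log_text):
    """Return {op name: 'cpu:N' or 'gpu:N'} for each placement line found."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT.match(line.strip())
        if match:
            name, kind, index = match.groups()
            placements[name] = "%s:%s" % (kind, index)
    return placements


def ops_on_cpu(log_text):
    """Sorted names of ops that ended up on a CPU device."""
    placements = parse_placements(log_text)
    return sorted(n for n, dev in placements.items() if dev.startswith("cpu"))
```

Feeding it the earlier CPU-only output should list MatMul, a and b, while the output above should give an empty list.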

ProGamerGov commented 7 years ago

I think it should be working now?

I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally

---- RENDERING SINGLE IMAGE ----

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:628: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.

Though nvidia-smi shows the same usage values regardless of settings:

ubuntu@ip-Address:~$ nvidia-smi
Thu Jan  5 23:59:42 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:00:1E.0     Off |                    0 |
| N/A   46C    P0   145W / 149W |  10941MiB / 11439MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2440    C   python                                       10937MiB |
+-----------------------------------------------------------------------------+
ubuntu@ip-Address:~$

Edit: never mind, it seems to work:

Single image elapsed time: 91.7761621475

But the --print_iterations 50 parameter does nothing. And it would be nice to have an equivalent of Neural-Style's -save_iter command. Also, what is the equivalent of -normalize_gradients?

cysmith commented 7 years ago

Hi,

You need to include the --verbose flag to print iterations. I agree that --save_iter would be nice, but unfortunately I cannot support it with L-BFGS, so I chose to remove it.

Cam
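To illustrate why --print_iterations appears to do nothing on its own, here is a minimal argparse sketch of one flag gating another. This is not the actual neural_style.py code. The flag names come from this thread, while the default of 50 and the helper names are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumed shape of the two options discussed above.
    parser.add_argument("--verbose", action="store_true",
                        help="enable progress output")
    parser.add_argument("--print_iterations", type=int, default=50,
                        help="print the loss every N iterations (needs --verbose)")
    return parser


def should_print(args, iteration):
    """Print only when --verbose is set and the iteration is a multiple of N."""
    return args.verbose and iteration % args.print_iterations == 0


if __name__ == "__main__":
    args = build_parser().parse_args(["--verbose", "--print_iterations", "50"])
    for iteration in range(0, 151, 25):
        if should_print(args, iteration):
            print("At iterate %d" % iteration)
```

Without --verbose, should_print is always False regardless of the interval.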

ProGamerGov commented 7 years ago

I was just about to say that --max_size wasn't working either, but --verbose seems to have fixed that.

ProGamerGov commented 7 years ago

I don't think any of the command-line options are working. Even manually editing neural_style.py seemed to have no effect.

I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally

---- RENDERING SINGLE IMAGE ----

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)

BUILDING VGG-19 NETWORK
loading model weights...
constructing layers...
LAYER GROUP 1
--conv1_1 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 3, 64)
--relu1_1 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--conv1_2 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 64, 64)
--relu1_2 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--pool1   | shape=(1, 200, 200, 64)
LAYER GROUP 2
--conv2_1 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 64, 128)
--relu2_1 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--conv2_2 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 128, 128)
--relu2_2 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--pool2   | shape=(1, 100, 100, 128)
LAYER GROUP 3
--conv3_1 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 128, 256)
--relu3_1 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_2 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_2 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_3 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_3 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_4 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_4 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--pool3   | shape=(1, 50, 50, 256)
LAYER GROUP 4
--conv4_1 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 256, 512)
--relu4_1 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_2 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_2 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_3 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_3 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_4 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_4 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--pool4   | shape=(1, 25, 25, 512)
LAYER GROUP 5
--conv5_1 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_1 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_2 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_2 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_3 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_3 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_4 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_4 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--pool5   | shape=(1, 13, 13, 512)
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().

MINIMIZING LOSS USING: ADAM OPTIMIZER
WARNING:tensorflow:From neural_style.py:628: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
At iterate 0    f=  5.04182E+10
At iterate 50   f=  8.23411E+09
At iterate 100  f=  6.94835E+09
At iterate 150  f=  6.53249E+09
At iterate 200  f=  6.31361E+09
At iterate 250  f=  6.16693E+09
At iterate 300  f=  6.06346E+09
At iterate 350  f=  5.98807E+09
At iterate 400  f=  5.93253E+09
At iterate 450  f=  5.88094E+09
At iterate 500  f=  5.84276E+09
At iterate 550  f=  5.81210E+09
At iterate 600  f=  5.78642E+09
At iterate 650  f=  5.77015E+09
At iterate 700  f=  5.75785E+09
At iterate 750  f=  5.77885E+09
At iterate 800  f=  5.77774E+09
At iterate 850  f=  5.73198E+09
At iterate 900  f=  5.70126E+09
At iterate 950  f=  5.69114E+09
Single image elapsed time: 329.488039017

cysmith commented 7 years ago

Ok. You are using a more recent TensorFlow (0.12). How did you manually change the values? I don't understand what you mean, because Adam is not the default optimizer and masking is not enabled by default.

ProGamerGov commented 7 years ago

Here's the output when using L-BFGS:

ubuntu@ip-Address:~/neural-style-tf$ python neural_style.py --content_img content.jpg --style_imgs style.jpg --verbose
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally

---- RENDERING SINGLE IMAGE ----

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:00:1e.0
Total memory: 11.17GiB
Free memory: 11.11GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0)

BUILDING VGG-19 NETWORK
loading model weights...
constructing layers...
LAYER GROUP 1
--conv1_1 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 3, 64)
--relu1_1 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--conv1_2 | shape=(1, 400, 400, 64) | weights_shape=(3, 3, 64, 64)
--relu1_2 | shape=(1, 400, 400, 64) | bias_shape=(64,)
--pool1   | shape=(1, 200, 200, 64)
LAYER GROUP 2
--conv2_1 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 64, 128)
--relu2_1 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--conv2_2 | shape=(1, 200, 200, 128) | weights_shape=(3, 3, 128, 128)
--relu2_2 | shape=(1, 200, 200, 128) | bias_shape=(128,)
--pool2   | shape=(1, 100, 100, 128)
LAYER GROUP 3
--conv3_1 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 128, 256)
--relu3_1 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_2 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_2 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_3 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_3 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--conv3_4 | shape=(1, 100, 100, 256) | weights_shape=(3, 3, 256, 256)
--relu3_4 | shape=(1, 100, 100, 256) | bias_shape=(256,)
--pool3   | shape=(1, 50, 50, 256)
LAYER GROUP 4
--conv4_1 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 256, 512)
--relu4_1 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_2 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_2 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_3 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_3 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--conv4_4 | shape=(1, 50, 50, 512) | weights_shape=(3, 3, 512, 512)
--relu4_4 | shape=(1, 50, 50, 512) | bias_shape=(512,)
--pool4   | shape=(1, 25, 25, 512)
LAYER GROUP 5
--conv5_1 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_1 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_2 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_2 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_3 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_3 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--conv5_4 | shape=(1, 25, 25, 512) | weights_shape=(3, 3, 512, 512)
--relu5_4 | shape=(1, 25, 25, 512) | bias_shape=(512,)
--pool5   | shape=(1, 13, 13, 512)

MINIMIZING LOSS USING: L-BFGS OPTIMIZER
WARNING:tensorflow:From neural_style.py:620: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =       480000     M =           10
 This problem is unconstrained.

At X0         0 variables are exactly at the bounds

At iterate    0    f=  7.85852D+11    |proj g|=  3.89032D+06

This output is missing the warning messages that appeared when using ADAM:

WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:383: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().
WARNING:tensorflow:From neural_style.py:384: pack (from tensorflow.python.ops.array_ops) is deprecated and will be removed after 2016-12-14.
Instructions for updating:
This op will be removed after the deprecation date. Please switch to tf.stack().

These warnings seem to say that tf.pack will be, or already has been, removed in favor of tf.stack()?

Lines 383-384 in neural_style.py are:

  mask = tf.pack(tensors, axis=2)
  mask = tf.pack(mask, axis=0)

They are part of the following function:

def mask_style_layer(a, x, mask_img):
  _, h, w, d = a.get_shape()
  mask = get_mask_image(mask_img, w.value, h.value)
  mask = tf.convert_to_tensor(mask)
  tensors = []
  for _ in range(d.value): 
    tensors.append(mask)
  mask = tf.pack(tensors, axis=2)  # line 383
  mask = tf.pack(mask, axis=0)     # line 384
  mask = tf.expand_dims(mask, 0)
  a = tf.mul(a, mask)
  x = tf.mul(x, mask)
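
As a rough, dependency-free sketch of what the two pack calls and tf.expand_dims appear to be doing in this function, the snippet below uses nested Python lists as stand-ins for tensors. It only illustrates the shapes involved: a 2-D mask of shape (h, w) is copied d times along a new last axis and then given a leading batch dimension, so it can be multiplied elementwise with activations of shape (1, h, w, d). The helper names (shape_of, replicate_mask, apply_mask) and the sample values are made up for illustration and do not come from neural_style.py, and the second pack along axis 0 is assumed to leave the shape unchanged.

```python
def shape_of(nested):
    """Return the shape of a non-empty rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def replicate_mask(mask_2d, depth):
    """Copy each mask value `depth` times along a new last axis,
    then wrap the result in a batch dimension of size 1."""
    stacked = [[[value] * depth for value in row] for row in mask_2d]  # (h, w, d)
    return [stacked]  # (1, h, w, d)


def apply_mask(activations, mask):
    """Elementwise product of two equally shaped (1, h, w, d) nested lists."""
    return [[[[a * m for a, m in zip(a_px, m_px)]
              for a_px, m_px in zip(a_row, m_row)]
             for a_row, m_row in zip(a_img, m_img)]
            for a_img, m_img in zip(activations, mask)]


# Hypothetical 2x2 mask and 1x2x2x3 activations, chosen only for illustration.
mask_2d = [[1.0, 0.0],
           [0.0, 1.0]]
activations = [[[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
                [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]]]

mask = replicate_mask(mask_2d, depth=3)
masked = apply_mask(activations, mask)

print(shape_of(mask))      # shape of the replicated mask
print(masked[0][0][0])     # pixel where the mask is 1: activations kept
print(masked[0][0][1])     # pixel where the mask is 0: activations zeroed
```

The point of the replication step is only to make the mask the same shape as the layer output, so that every channel of a pixel is kept or zeroed together.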

Both ADAM and L-BFGS have the issue, so it looks related to the masked style transfer feature. The loss values from the run that used both ADAM and masked style transfer also show that something is clearly wrong:

At iterate 0    f=  5.04182E+10
At iterate 50   f=  8.23411E+09
At iterate 100  f=  6.94835E+09
At iterate 150  f=  6.53249E+09
At iterate 200  f=  6.31361E+09
At iterate 250  f=  6.16693E+09
At iterate 300  f=  6.06346E+09
At iterate 350  f=  5.98807E+09
At iterate 400  f=  5.93253E+09
At iterate 450  f=  5.88094E+09
At iterate 500  f=  5.84276E+09
At iterate 550  f=  5.81210E+09
At iterate 600  f=  5.78642E+09
At iterate 650  f=  5.77015E+09
At iterate 700  f=  5.75785E+09
At iterate 750  f=  5.77885E+09
At iterate 800  f=  5.77774E+09
At iterate 850  f=  5.73198E+09
At iterate 900  f=  5.70126E+09
At iterate 950  f=  5.69114E+09

Without the --style_mask and --style_mask_imgs options, the loss falls further relative to its starting value, as expected:

At iterate 0    f=  4.22601E+11
At iterate 50   f=  3.43817E+10
At iterate 100  f=  2.53430E+10
At iterate 150  f=  2.26230E+10
At iterate 200  f=  2.12251E+10
At iterate 250  f=  2.03405E+10
At iterate 300  f=  1.97152E+10
At iterate 350  f=  1.92445E+10
At iterate 400  f=  1.88709E+10
At iterate 450  f=  1.85676E+10
At iterate 500  f=  1.83171E+10
At iterate 550  f=  1.81063E+10
At iterate 600  f=  1.79274E+10
At iterate 650  f=  1.77726E+10
At iterate 700  f=  1.76376E+10
At iterate 750  f=  1.75175E+10
At iterate 800  f=  1.74115E+10
At iterate 850  f=  1.73161E+10
At iterate 900  f=  1.72311E+10
At iterate 950  f=  1.71539E+10
Single image elapsed time: 325.5970788
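
To put numbers on the comparison between the two runs, here is a small sketch that parses "At iterate ... f= ..." lines, computes how much the loss dropped from the first report to the last, and lists any iterations where the reported loss went up. The log strings are short excerpts copied from the two outputs above, not the full logs, and the function names (parse_losses, reduction_factor, increases) are made up for this example.

```python
import re

# Excerpts copied from the masked (ADAM + style mask) log above.
MASKED_LOG = """\
At iterate 0    f=  5.04182E+10
At iterate 700  f=  5.75785E+09
At iterate 750  f=  5.77885E+09
At iterate 950  f=  5.69114E+09
"""

# Excerpts copied from the log of the run without the mask options.
UNMASKED_LOG = """\
At iterate 0    f=  4.22601E+11
At iterate 700  f=  1.76376E+10
At iterate 750  f=  1.75175E+10
At iterate 950  f=  1.71539E+10
"""

LINE_RE = re.compile(r"At iterate\s+(\d+)\s+f=\s*([0-9.]+E[+-]\d+)")


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in the log text."""
    return [(int(i), float(f)) for i, f in LINE_RE.findall(log_text)]


def reduction_factor(losses):
    """Ratio of the first reported loss to the last reported loss."""
    return losses[0][1] / losses[-1][1]


def increases(losses):
    """Iterations whose reported loss is higher than the previous report."""
    return [cur[0] for prev, cur in zip(losses, losses[1:]) if cur[1] > prev[1]]


masked = parse_losses(MASKED_LOG)
unmasked = parse_losses(UNMASKED_LOG)

print("masked reduction factor:   %.2f" % reduction_factor(masked))
print("unmasked reduction factor: %.2f" % reduction_factor(unmasked))
print("masked increases at:", increases(masked))
print("unmasked increases at:", increases(unmasked))
```

On these excerpts the masked run drops by a factor of roughly 9 and shows an increase at iterate 750, while the run without masks drops by a factor of roughly 25 with no increases between reports.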

I believe I fixed the issue in the pull request here: https://github.com/cysmith/neural-style-tf/pull/19

cysmith / neural-style-tf

Error when trying to use GPU #18