tf 1.0 changes tf.pack -> tf.stack #8

Open ahundt opened 7 years ago

ahundt commented 7 years ago

tf.pack should become tf.stack. There are a number of other changes that need to be applied for tf 1.0.


warmspringwinds commented 7 years ago

@ahundt, I guess I will create a separate branch since tf 1.0 is not backwards compatible.

ahundt commented 7 years ago

the changes were quite minor

vijtad commented 7 years ago

I am getting the following error after changing to tf.stack when running fcn_32s_train.ipynb:

TypeError: __init__() got multiple values for keyword argument 'dtype'

FredHaa commented 7 years ago

I think that is a general problem with tf 1.0. It might be related to https://github.com/tensorflow/models/issues/672

vijtad commented 7 years ago

You are correct, it was due to tf.zeros_initializer. Because of the API change, I had to add parentheses, i.e. tf.zeros_initializer(), to make it work in fc_8s.py. Thanks.
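
A minimal sketch of why the missing parentheses can produce that kind of TypeError. It does not use TensorFlow: `ZerosInitializer` and `make_variable` are hypothetical stand-ins, written on the assumption that the initializer is a class whose constructor takes `dtype` and whose instances are called with a shape.

```python
# Stand-in class, NOT TensorFlow: it only mimics an initializer that is a class
# whose constructor takes dtype and whose instances are called with a shape.
class ZerosInitializer(object):
    def __init__(self, dtype="float32"):
        self.dtype = dtype

    def __call__(self, shape, dtype=None):
        rows, cols = shape
        return [[0.0] * cols for _ in range(rows)]


def make_variable(shape, initializer, dtype="float32"):
    # Hypothetical helper: calls the initializer the way a layer library might.
    return initializer(shape, dtype=dtype)


# Passing the class itself: the shape lands in the constructor's dtype slot,
# and dtype is then supplied a second time as a keyword argument.
try:
    make_variable([2, 3], ZerosInitializer)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing an instance (note the parentheses) works.
zeros = make_variable([2, 3], ZerosInitializer())

print(error_message)
print(zeros)
```

With the class passed directly, the call constructs the initializer instead of invoking it, which is why adding `()` at the call site is enough to fix it in this sketch.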

ghost commented 7 years ago

Is a TensorFlow 1.0 backend available?

I am using Windows, Python 3.5 and TensorFlow 1.0, and it always reports errors that I cannot fix myself.

vijtad commented 7 years ago

I was able to run the sticker example successfully on TF 1.0 / Python 3.5 / Windows 10. I used VS 2015 to debug the Python code.

A few changes had to be made to make it work.

The segmentation of the sticker cat is not producing as good a result as described in the article. I have attached the images. Please give me suggestions on how to improve the segmentation quality.

[attached image: small_cat]


I made the following changes:

  1. In the latest source code of tensorflow/models/slim/nets/vgg.py, the parameter fc_conv_padding='VALID' is missing from the vgg_16 function. I had to add it to make the code run.

  2. Changed xrange to range in upsampling.py, since xrange was replaced by range in Python 3.

  3. tf.pack should become tf.stack.

  4. tf.zeros_initializer should be replaced with tf.zeros_initializer() in vgg.py.
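
For the xrange change in item 2, one common way to keep a file working on both Python 2 and Python 3 is a small fallback. This is only a sketch; `range_compat` is a hypothetical name and is not taken from upsampling.py.

```python
# Pick a lazy range that exists on the running interpreter.
try:
    range_compat = xrange  # Python 2: xrange is the lazy variant
except NameError:
    range_compat = range   # Python 3: xrange is gone, range is already lazy

squares = [i * i for i in range_compat(4)]
print(squares)
```

If only Python 3 needs to be supported, replacing xrange with range directly, as described above, is simpler.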

The source code is as follows:

from __future__ import division

import os
import sys
import tensorflow as tf
import skimage.io as io
import numpy as np


fcn_16s_checkpoint_path = "C:/TensorFlow/checkpoints/vgg_16.ckpt"

#os.environ["CUDA_VISIBLE_DEVICES"] = '1'

slim = tf.contrib.slim

from tf_image_segmentation.models.fcn_8s import FCN_8s
from tf_image_segmentation.utils.inference import adapt_network_for_any_size_input
from tf_image_segmentation.utils.pascal_voc import pascal_segmentation_lut

number_of_classes = 21

#image_filename = 'C:/Tensorflow/sticker/me.jpg'

image_filename = 'C:/Tensorflow/sticker/small_cat.jpg'

image_filename_placeholder = tf.placeholder(tf.string)

feed_dict_to_use = {image_filename_placeholder: image_filename}

image_tensor = tf.read_file(image_filename_placeholder)

image_tensor = tf.image.decode_jpeg(image_tensor, channels=3)

# Fake batch for image and annotation by adding
# leading empty axis.
image_batch_tensor = tf.expand_dims(image_tensor, axis=0)

# Be careful: after adaptation, network returns final labels
# and not logits
FCN_8s = adapt_network_for_any_size_input(FCN_8s, 32)

pred, fcn_16s_variables_mapping = FCN_8s(image_batch_tensor=image_batch_tensor,

# The op for initializing the variables.
initializer = tf.local_variables_initializer()

#saver = tf.train.Saver()
#saver = tf.train.import_meta_graph('C:/TensorFlow/checkpoints/fcn_8s_checkpoint/model_fcn8s_final.ckpt.meta', clear_devices=True)

with tf.Session() as sess:

    saver = tf.train.Saver()
    saver.restore(sess, "C:/temp/model_fcn8s_final.ckpt")
    #path = 'C:\\temp\\model_fcn8s_final.ckpt'
    #saver = tf.train.import_meta_graph(path + '.meta')
    #saver.restore(sess, tf.train.latest_checkpoint("C:\\temp\\"))
    print("Model restored.") 
    image_np, pred_np = sess.run([image_tensor, pred], feed_dict=feed_dict_to_use)


# Eroding contour

import skimage.morphology

prediction_mask = (pred_np.squeeze() == 8)

# Let's apply some morphological operations to
# create the contour for our sticker

cropped_object = image_np * np.dstack((prediction_mask,) * 3)

square = skimage.morphology.square(5)

temp = skimage.morphology.binary_erosion(prediction_mask, square)

negative_mask = (temp != True)

eroding_countour = negative_mask * prediction_mask

eroding_countour_img = np.dstack((eroding_countour, ) * 3)

cropped_object[eroding_countour_img] = 248

png_transparancy_mask = np.uint8(prediction_mask * 255)

image_shape = cropped_object.shape

png_array = np.zeros(shape=[image_shape[0], image_shape[1], 4], dtype=np.uint8)

png_array[:, :, :3] = cropped_object

png_array[:, :, 3] = png_transparancy_mask


io.imsave('C:/Tensorflow/sticker/sticker_cat.png', png_array)
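
The script above keeps only the pixels labelled 8 in prediction_mask. Assuming the model uses the standard Pascal VOC ordering of 21 classes (the thread does not state this, though number_of_classes = 21 is consistent with it), index 8 is "cat" and index 15 is "person". The list and the `class_index` helper below are an illustrative sketch and are not taken from tf_image_segmentation.

```python
# Standard Pascal VOC segmentation classes, in index order (0 = background).
PASCAL_VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
    "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
    "tvmonitor",
]


def class_index(name):
    # Hypothetical helper: look up the label value for a class name.
    return PASCAL_VOC_CLASSES.index(name)


print(class_index("cat"), class_index("person"))
```

Under that assumption, a sticker of a person would use `pred_np.squeeze() == 15`, while the cat example uses 8.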

The vgg.py code is as follows:

# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Contains model definitions for versions of the Oxford VGG network.

These model definitions were introduced in the following technical report:

  Very Deep Convolutional Networks For Large-Scale Image Recognition
  Karen Simonyan and Andrew Zisserman
  arXiv technical report, 2015
  PDF: http://arxiv.org/pdf/1409.1556.pdf
  ILSVRC 2014 Slides: http://www.robots.ox.ac.uk/~karen/pdf/ILSVRC_2014.pdf

More information can be obtained from the VGG website:

  with slim.arg_scope(vgg.vgg_arg_scope()):
    outputs, end_points = vgg.vgg_a(inputs)

  with slim.arg_scope(vgg.vgg_arg_scope()):
    outputs, end_points = vgg.vgg_16(inputs)

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf

slim = tf.contrib.slim

def vgg_arg_scope(weight_decay=0.0005):
  """Defines the VGG arg scope.

    weight_decay: The l2 regularization coefficient.

    An arg_scope.
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
    with slim.arg_scope([slim.conv2d], padding='SAME') as arg_sc:
      return arg_sc

def vgg_a(inputs,
  """Oxford Net VGG 11-Layers version A Example.

  Note: All the fully_connected layers have been transformed to conv2d layers.
        To use in classification mode, resize input to 224x224.

    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether or not the model is being trained.
    dropout_keep_prob: the probability that activations are kept in the dropout
      layers during training.
    spatial_squeeze: whether or not should squeeze the spatial dimensions of the
      outputs. Useful to remove unnecessary dimensions for classification.
    scope: Optional scope for the variables.

    the last op containing the log predictions and end_points dict.
  with tf.variable_scope(scope, 'vgg_a', [inputs]) as sc:
    end_points_collection = sc.name + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.max_pool2d],
      net = slim.repeat(inputs, 1, slim.conv2d, 64, [3, 3], scope='conv1')
      net = slim.max_pool2d(net, [2, 2], scope='pool1')
      net = slim.repeat(net, 1, slim.conv2d, 128, [3, 3], scope='conv2')
      net = slim.max_pool2d(net, [2, 2], scope='pool2')
      net = slim.repeat(net, 2, slim.conv2d, 256, [3, 3], scope='conv3')
      net = slim.max_pool2d(net, [2, 2], scope='pool3')
      net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv4')
      net = slim.max_pool2d(net, [2, 2], scope='pool4')
      net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [2, 2], scope='pool5')
      # Use conv2d instead of fully_connected layers.
      net = slim.conv2d(net, 4096, [7, 7], padding='VALID', scope='fc6')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, num_classes, [1, 1],
      # Convert end_points_collection into a end_point dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      if spatial_squeeze:
        net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
        end_points[sc.name + '/fc8'] = net
      return net, end_points
vgg_a.default_image_size = 224

def vgg_16(inputs,
  """Oxford Net VGG 16-Layers version D Example.

  Note: All the fully_connected layers have been transformed to conv2d layers.
        To use in classification mode, resize input to 224x224.

    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether or not the model is being trained.
    dropout_keep_prob: the probability that activations are kept in the dropout
      layers during training.
    spatial_squeeze: whether or not should squeeze the spatial dimensions of the
      outputs. Useful to remove unnecessary dimensions for classification.
    scope: Optional scope for the variables.

    the last op containing the log predictions and end_points dict.
  with tf.variable_scope(scope, 'vgg_16', [inputs]) as sc:
    end_points_collection = sc.name + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
      net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
      net = slim.max_pool2d(net, [2, 2], scope='pool1')
      net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
      net = slim.max_pool2d(net, [2, 2], scope='pool2')
      net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
      net = slim.max_pool2d(net, [2, 2], scope='pool3')
      net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
      net = slim.max_pool2d(net, [2, 2], scope='pool4')
      net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [2, 2], scope='pool5')
      # Use conv2d instead of fully_connected layers.
      net = slim.conv2d(net, 4096, [7, 7], padding=fc_conv_padding, scope='fc6')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, num_classes, [1, 1],
      # Convert end_points_collection into a end_point dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      if spatial_squeeze:
        net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
        end_points[sc.name + '/fc8'] = net
      return net, end_points
vgg_16.default_image_size = 224

def vgg_19(inputs,
  """Oxford Net VGG 19-Layers version E Example.

  Note: All the fully_connected layers have been transformed to conv2d layers.
        To use in classification mode, resize input to 224x224.

    inputs: a tensor of size [batch_size, height, width, channels].
    num_classes: number of predicted classes.
    is_training: whether or not the model is being trained.
    dropout_keep_prob: the probability that activations are kept in the dropout
      layers during training.
    spatial_squeeze: whether or not should squeeze the spatial dimensions of the
      outputs. Useful to remove unnecessary dimensions for classification.
    scope: Optional scope for the variables.

    the last op containing the log predictions and end_points dict.
  with tf.variable_scope(scope, 'vgg_19', [inputs]) as sc:
    end_points_collection = sc.name + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
      net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
      net = slim.max_pool2d(net, [2, 2], scope='pool1')
      net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
      net = slim.max_pool2d(net, [2, 2], scope='pool2')
      net = slim.repeat(net, 4, slim.conv2d, 256, [3, 3], scope='conv3')
      net = slim.max_pool2d(net, [2, 2], scope='pool3')
      net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv4')
      net = slim.max_pool2d(net, [2, 2], scope='pool4')
      net = slim.repeat(net, 4, slim.conv2d, 512, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [2, 2], scope='pool5')
      # Use conv2d instead of fully_connected layers.
      net = slim.conv2d(net, 4096, [7, 7], padding='VALID', scope='fc6')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
      net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
      net = slim.conv2d(net, num_classes, [1, 1],
      # Convert end_points_collection into a end_point dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      if spatial_squeeze:
        net = tf.squeeze(net, [1, 2], name='fc8/squeezed')
        end_points[sc.name + '/fc8'] = net
      return net, end_points
vgg_19.default_image_size = 224

# Alias
vgg_d = vgg_16
vgg_e = vgg_19

ghost commented 7 years ago

Thanks for the reply. I just found a way to use the TensorFlow upgrade tool to upgrade the models and tf-image-segmentation folders automatically, and it works fine now.
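
To give a rough idea of what an automatic upgrade involves, here is a small sketch that rewrites a few of the names renamed in TF 1.0. It is not the actual upgrade tool: `RENAMES` is a deliberately short, illustrative table and `upgrade_source` is a hypothetical function.

```python
import re

# A few of the TF 1.0 renames; illustrative, not exhaustive.
RENAMES = {
    "tf.pack": "tf.stack",
    "tf.unpack": "tf.unstack",
    "tf.mul": "tf.multiply",
    "tf.sub": "tf.subtract",
}


def upgrade_source(source):
    """Replace old names with new ones, matching whole identifiers only."""
    for old, new in RENAMES.items():
        pattern = r"\b" + re.escape(old) + r"\b"
        source = re.sub(pattern, new, source)
    return source


before = "out = tf.pack([a, b])  # tf.packed stays untouched"
after = upgrade_source(before)
print(after)
```

A plain textual rewrite like this cannot handle changes in argument order or semantics, so the output of any such pass still needs to be reviewed.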

ghost commented 7 years ago

Also, I think that if you use fcn_8s_checkpoint_path = "C:/TensorFlow/checkpoints/vgg_8.ckpt" instead of fcn_16s_checkpoint_path = "C:/TensorFlow/checkpoints/vgg_16.ckpt", it may produce a better segmentation result. But I am not sure.

vijtad commented 7 years ago

The fcn_16s_checkpoint_path variable is not used in the code.

I am using the model_fcn8s_final.ckpt provided on Dropbox:

saver.restore(sess, "C:/temp/model_fcn8s_final.ckpt")

What is the image size of this cat?

My copy of the cat image has width = 144 and height = 85.

I am getting a good segmentation of me.jpg, whose size is 400x400.

I would like to know whether segmentation depends on image size.
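
The script passes 32 to adapt_network_for_any_size_input. If that argument is a size multiple that inputs are brought up to (an assumption; the thread does not say what the argument means), the two images mentioned here would be processed at the sizes computed below. The sketch only does the arithmetic, and `round_up_to_multiple` is a hypothetical helper.

```python
def round_up_to_multiple(size, multiple=32):
    """Smallest multiple of `multiple` that is >= size."""
    return ((size + multiple - 1) // multiple) * multiple


# (width, height) pairs quoted in the comment above.
images = {"small_cat.jpg": (144, 85), "me.jpg": (400, 400)}

adapted_sizes = {
    name: (round_up_to_multiple(w), round_up_to_multiple(h))
    for name, (w, h) in images.items()
}

for name, size in sorted(adapted_sizes.items()):
    print(name, images[name], "->", size)
```

Whether the smaller input explains the weaker cat segmentation is not settled in the thread; running the same image at a higher resolution would be one way to check.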

On Sat, Mar 11, 2017 at 9:36 PM, zhaozj89 notifications@github.com wrote:

Also I think if you use fcn_8s_checkpoint_path = 'C:/TensorFlow/checkpoints/vgg_8.ckpt' instead of fcn_16s_checkpoint_path = 'C:/TensorFlow/checkpoints/vgg_16.ckpt', it may produce better segmentation result. But I am not sure.


ahundt commented 7 years ago

@vijtad can you please submit a pull request with your changes rather than including them in comments? That allows us to review and merge the code.

Here are two documents that explain how: https://gist.github.com/Chaser324/ce0505fbed06b947d962 https://help.github.com/categories/collaborating-with-issues-and-pull-requests/

barryridge commented 7 years ago

Hi @warmspringwinds, @vijtad, @ahundt,

I took the liberty of submitting a PR that adds forward/backward compatibility for tf-1.0.x with version checks at all the fail points in the code based on the output of tf_upgrade_tool.py as suggested by @zhaozj89.

Please have a look and consider merging. As I say in the PR though, I'm unsure if this type of forward/backward compatibility version checking is the best approach for future maintainability. E.g. it would probably break future applications of tf_upgrade_tool.py. In order to retain that option, maintaining separate branches might be better. In any case, I'll leave it up to you.
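
As a rough illustration of the compatibility idea, the sketch below picks whichever of stack or pack the installed module provides. It uses attribute detection rather than the version comparison described for the PR, and it is not the PR's code: `resolve_stack`, `old_tf` and `new_tf` are hypothetical, with plain namespaces standing in for the TensorFlow module.

```python
from types import SimpleNamespace


def resolve_stack(tf_module):
    """Return tf_module.stack if present, else fall back to tf_module.pack."""
    stack = getattr(tf_module, "stack", None)
    if stack is not None:
        return stack
    return getattr(tf_module, "pack")


# Hypothetical stand-ins for a pre-1.0 and a 1.0 TensorFlow module.
old_tf = SimpleNamespace(pack=lambda values: ("pack", list(values)))
new_tf = SimpleNamespace(stack=lambda values: ("stack", list(values)))

print(resolve_stack(old_tf)([1, 2]))
print(resolve_stack(new_tf)([1, 2]))
```

Shims like this keep one branch working on both versions, at the cost noted above: an automatic upgrade pass may later rewrite or trip over the fallback names.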

shivam-kotwalia commented 7 years ago

Hi @barryridge, I just tested tf_upgrade_tool.py and it worked like a charm. :) :+1:

caxton commented 6 years ago

@vijtad I followed your code and got the error below. Any idea? I also noticed that you use prediction_mask = (pred_np.squeeze() == 8) instead of prediction_mask = (pred_np.squeeze() == 15). Why? Any suggestion is appreciated, thanks.

/usr/local/lib/python2.7/dist-packages/ipykernel_launcher.py:77: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 360

ValueError                                Traceback (most recent call last)
<ipython-input-1-16f3a3a662ba> in <module>()
     83 png_array = np.zeros(shape=[image_shape[0], image_shape[1], 4], dtype=np.uint8)
---> 85 png_array[:, :, :3] = cropped_object
     87 png_array[:, :, 3] = png_transparancy_mask

ValueError: could not broadcast input array from shape (360,480,3) into shape (1,360,3)
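
One reading consistent with the shapes in this traceback is that image_np still has a leading batch axis, i.e. shape (1, 360, 480, 3), for example because the batched tensor was fetched instead of image_tensor. The thread does not confirm this. The sketch below uses made-up arrays with those shapes (the prediction shape is also an assumption) and drops the batch axis before building the PNG array.

```python
import numpy as np

# Hypothetical stand-ins whose shapes mirror the traceback: a 360x480 image
# that still carries a leading batch axis, and a matching label map.
image_np = np.full((1, 360, 480, 3), 200, dtype=np.uint8)
pred_np = np.zeros((1, 360, 480, 1), dtype=np.int64)  # assumed shape
pred_np[0, 100:200, 150:300, 0] = 8  # pretend this region was labelled 8

# Drop the batch axis if present so the image is (height, width, 3).
if image_np.ndim == 4 and image_np.shape[0] == 1:
    image_np = image_np[0]

prediction_mask = (pred_np.squeeze() == 8)
cropped_object = image_np * np.dstack((prediction_mask,) * 3)

# Build the RGBA array from the 2-D image size, not from a 4-D shape.
height, width = cropped_object.shape[:2]
png_array = np.zeros((height, width, 4), dtype=np.uint8)
png_array[:, :, :3] = cropped_object
png_array[:, :, 3] = np.uint8(prediction_mask * 255)

print(png_array.shape)
```

Printing image_np.shape and pred_np.shape right after sess.run is a quick way to check whether this is what happened.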