mobilenet_v1_eval.py has a big bug?

GarryLau commented 6 years ago

Totally, I have two problems. Firstly, In the mobilenet_v1_eval.py, yeah, it's eval not train. When is_training=False, you will not get a right result. When is_training=True, you can get a right answer. Obviously, it's wrong! I guess it's a big bug, but I cannot figure it out. Someone knows that?

def build_model():
  """Build the mobilenet_v1 model for evaluation.

  Returns:
    g: graph with rewrites after insertion of quantization ops and batch norm
    folding.
    eval_ops: eval ops for inference.
    variables_to_restore: List of variables to restore from checkpoint.
  """
  g = tf.Graph()
  with g.as_default():
    inputs, labels = flower_input(is_training=False)
    scope = mobilenet_v1.mobilenet_v1_arg_scope(
        is_training=True, weight_decay=0.0)
    with slim.arg_scope(scope):
      logits, _ = mobilenet_v1.mobilenet_v1(
          inputs,
          is_training=True,
          depth_multiplier=FLAGS.depth_multiplier,
          num_classes=FLAGS.num_classes)

    if FLAGS.quantize:
      tf.contrib.quantize.create_eval_graph()

    eval_ops = metrics(logits, labels)

  return g, eval_ops

Secondly, I use export_eval_pbtxt() to get the "mobilenet_v1_eval.pbtxt".

def export_eval_pbtxt():
  """Export eval.pbtxt."""
  g = tf.Graph()
  with g.as_default():
    inputs = tf.placeholder(dtype=tf.float32,shape=[None,224,224,3])
    scope = mobilenet_v1.mobilenet_v1_arg_scope(
        is_training=False, weight_decay=0.0)
    with slim.arg_scope(scope):
      logits, _ = mobilenet_v1.mobilenet_v1(
          inputs,
          is_training=False,
          depth_multiplier=FLAGS.depth_multiplier,
          num_classes=FLAGS.num_classes)
    eval_graph_file = '/home/lg/projects/mobilenet_v1_eval.pbtxt'
    with tf.Session() as sess:
          with open(eval_graph_file, 'w') as f:
            f.write(str(g.as_graph_def()))

Then, frozen the graph:

bazel-bin/tensorflow/python/tools/freeze_graph  \
--input_graph=/home/lg/projects/mobilenet_v1_eval.pbtxt \
--input_checkpoint=/home/lg/projects/checkpoint/model.ckpt-20000 \
--input_binary=false \
--output_graph=/home/lg/projects/frozen_mobilenet_v1_224.pb  \
--output_node_names=MobilenetV1/Predictions/Reshape_1  \
--checkpoint_version=2

Then I use the frozen .pb file to classify an image, it won't get a right result, no matter you use is_training is True or False above to get the "mobilenet_v1_eval.pbtxt". And the classify file is:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os.path
import re
import sys
import tarfile

import numpy as np
from six.moves import urllib
import tensorflow as tf
FLAGS = None

def create_graph():
  """Creates a graph from saved GraphDef file and returns a saver."""
  # Creates graph from saved graph_def.pb.
  with tf.gfile.FastGFile(os.path.join(FLAGS.model_dir, r'/home/lg/projects/frozen_mobilenet_v1_224.pb'), 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def,return_elements=['MobilenetV1/Predictions/Reshape_1:0'], name='lg')

def run_inference_on_image(image):
  """Runs inference on an image.
  Args:
    image: Image file name.
  Returns:
    Nothing
  """
  if not tf.gfile.Exists(image):
    tf.logging.fatal('File does not exist %s', image)

  image_data = tf.gfile.FastGFile(image, 'rb').read()
  img_data_jpg = tf.image.decode_jpeg(image_data) #图像解码  
  img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32) #改变图像数据的类型
  img_data_jpg = tf.image.resize_image_with_crop_or_pad(img_data_jpg,224,224)

  # Creates graph from saved GraphDef.
  create_graph()

  with tf.Session() as sess:
    image_data = img_data_jpg.eval().reshape(-1,224,224,3)
    softmax_tensor = sess.graph.get_tensor_by_name('lg/MobilenetV1/Predictions/Reshape_1:0')
    predictions = sess.run(softmax_tensor, {'lg/Placeholder:0': image_data})
    predictions = np.squeeze(predictions)
    print('predictions: ',predictions)
    # Read the labels from label.txt.
    label_path = os.path.join(FLAGS.model_dir, '/home/lg/projects/labels.txt')
    label = np.loadtxt(fname=label_path,dtype=str)

    top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
    for node_id in top_k:
      label_string = label[node_id]
      score = predictions[node_id]
      print('%s (score = %.5f)' % (label_string, score))

def main(_):
  image = (FLAGS.image_file if FLAGS.image_file else os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
  run_inference_on_image(image)

if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  # graph_def.pb: Binary representation of the GraphDef protocol buffer.
  # label.txt: the labels according to data tfrecord
  parser.add_argument(
      '--model_dir',
      type=str,
      default='/tmp/imagenet',
      help='Path to graph_def.pb and label.txt'
  )
  parser.add_argument(
      '--image_file',
      type=str,
      default=r'/home/lg/projects/data/flower_photos/daisy/5673728_71b8cb57eb.jpg',
      help='Absolute path to image file.'
  )
  parser.add_argument(
      '--num_top_predictions',
      type=int,
      default=2,
      help='Display this many predictions.'
  )
  FLAGS, unparsed = parser.parse_known_args()
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)

The mobilenet v1 I used is in https://github.com/tensorflow/models/tree/master/research/slim/nets . I have tried inception v3, whether to eval or classify an image. So I think mobilenet_v1_eval.py must have a bug. Anyone can help me? Many thanks.

All the codes I used are in https://github.com/GarryLau/draft_notes/tree/master/TF .

@Zehaos @aselle @poxvoculi

ljd16 commented 6 years ago

Hello, was this ever resolved?

GarryLau commented 6 years ago

@ljd16 Solved by using transfer learning. But training directly still cannot be successful.

jiachen0212 commented 6 years ago

when i run : bash ./scripts/train_mobilenet_on_imagenet.sh i got: WARNING:tensorflow:From eval_image_classifier.py:94: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version. Instructions for updating: Please switch to tf.train.get_or_create_global_step can you give some help? did you get zhe same bug?

Zehaos / MobileNet

mobilenet_v1_eval.py has a big bug? #70