Error loading a frozen graph ( float incompatible with float_ref ) #161

Closed mhaghighat closed 7 years ago

mhaghighat commented 7 years ago

I froze the 20170131-234652 model using the freeze_graph.py, but I cannot load it in C++.

I first read the binaryproto successfully as:

tensorflow::GraphDef graph_def;
Status load_graph_status =  ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);

But, it gives an error while creating the graph to be used for the session:

std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(tensorflow::SessionOptions()));
tensorflow::Status sessionCreateStatus = session->Create(graph_def);

The error is:

Invalid argument: Input 0 of node InceptionResnetV1/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/cond/AssignMovingAvg_1/Switch was passed float from InceptionResnetV1/Block8/Branch_1/Conv2d_0c_3x1/BatchNorm/moving_variance:0 incompatible with expected float_ref.

Any ideas how to solve this problem?

Thanks in advance :)

mhaghighat commented 7 years ago

BTW, this error also happens when I load the .pb model in Python:

ValueError: graph_def is invalid at node 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/cond/AssignMovingAvg/Switch': Input tensor 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/moving_mean:0' Cannot convert a tensor of type float32 to an input of type float32_ref.

Lunrot commented 7 years ago

Refer to the following url. https://www.bountysource.com/issues/36614355-unable-to-import-frozen-graph-with-batchnorm

mhaghighat commented 7 years ago

Hi @Lunrot, I've already tried the solution mentioned in this link, but it does not work in our case.

Lunrot commented 7 years ago

@mhaghighat Does the same error occur in Python? I have not tested freezing and loading in C++, but there is no error in Python.

mhaghighat commented 7 years ago

Yes, it gives this error in Python:

ValueError: graph_def is invalid at node 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/cond/AssignMovingAvg/Switch': Input tensor 'InceptionResnetV1/Conv2d_1a_3x3/BatchNorm/moving_mean:0' Cannot convert a tensor of type float32 to an input of type float32_ref.

Lunrot commented 7 years ago

my code

        saver = tf.train.import_meta_graph(os.path.join(os.path.expanduser(args.model_dir), 
            'model-' + os.path.basename(os.path.normpath(args.model_dir)) + '.meta'), clear_devices=True)
        saver.restore(sess, tf.train.latest_checkpoint(os.path.expanduser(args.model_dir)))

        output_node_names = 'embeddings'

        # for fixing the bug of batch norm
        gd = sess.graph.as_graph_def()
        for node in gd.node:            
            if node.op == 'RefSwitch':
                node.op = 'Switch'
                for index in xrange(len(node.input)):
                    if 'moving_' in node.input[index]:
                        node.input[index] = node.input[index] + '/read'
            elif node.op == 'AssignSub':
                node.op = 'Sub'
                if 'use_locking' in node.attr: del node.attr['use_locking']
            elif node.op == 'AssignAdd':
                node.op = 'Add'
                if 'use_locking' in node.attr: del node.attr['use_locking']

        converted_graph_def = graph_util.convert_variables_to_constants(sess, gd, output_node_names.split(","))
        tf.train.write_graph(converted_graph_def, args.output_dir, args.output_filename, as_text=False)

mhaghighat commented 7 years ago

I had done the same, but the resulting protobuf would not load. The only difference between my code and yours was that I was passing sess.graph.as_graph_def() directly as the second argument of convert_variables_to_constants, as in:

output_graph_def = graph_util.convert_variables_to_constants(
                sess, sess.graph.as_graph_def(), output_node_names.split(","))

However, if I change it to the way that you've done, creating gd = sess.graph.as_graph_def(), and then:

output_graph_def = graph_util.convert_variables_to_constants(
                sess, gd, output_node_names.split(","))

there is no error anymore.

I know it sounds absurd, but this is the case!!! The problem is solved but I'm still confused, why???

Lunrot commented 7 years ago

Since gd (= sess.graph.as_graph_def()) was modified by the bug-fixing loop, gd should be passed instead of a fresh sess.graph.as_graph_def().

mhaghighat commented 7 years ago

I had the sess.graph.as_graph_def() in the bug fixing loop as:

for node in sess.graph.as_graph_def().node:  
    [perform all the fixing]

but the resulting protobuf still had the issue.

Maybe, it cannot alter the nodes in the original sess.graph.as_graph_def(); so a copy of it (i.e., gd) needs to be created on which we can perform the bug fixing. Can it be right?!
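To illustrate the guess above: each call to as_graph_def() serializes the graph into a new GraphDef, so edits made on a temporary result are discarded. The sketch below does not use TensorFlow; FakeGraph is a hypothetical stand-in that only mimics this copy-on-call behaviour.

```python
import copy
from types import SimpleNamespace


class FakeGraph:
    """Hypothetical stand-in for tf.Graph: as_graph_def() returns a fresh copy."""

    def __init__(self, nodes):
        self._nodes = nodes

    def as_graph_def(self):
        # A new, independent copy is built on every call.
        return SimpleNamespace(node=copy.deepcopy(self._nodes))


graph = FakeGraph([SimpleNamespace(name="bn/Switch", op="RefSwitch")])

# Edits made on a temporary copy are lost as soon as the loop ends.
for node in graph.as_graph_def().node:
    node.op = "Switch"
lost = graph.as_graph_def().node[0].op

# Edits made on a copy that is kept in a variable survive.
gd = graph.as_graph_def()
for node in gd.node:
    node.op = "Switch"
kept = gd.node[0].op

print(lost, kept)
```

This is why the fixed copy (gd) has to be the object passed on to convert_variables_to_constants.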

ugtony commented 7 years ago

@Lunrot, thanks for your code. I made the same mistake as mhaghighat did (thanks also to @mhaghighat for finding the difference).

I noted that tf.gfile.GFile is replaced by tf.train.write_graph. What's the difference between the two functions? Can they be used for save/load interchangeably?

I found that some of the tensors' shape information is removed from the frozen model. For example, the input shape was (?, 160, 160, 3) in the original model but became <unknown> in the frozen model. This means I can no longer use tensor.get_shape() to check the input shape.

Out of curiosity, I fed the network inputs with different shapes, (90, 160, 160, 3), (90, 220, 280, 3) and (90, 160, 160, 1), to see whether the network works with any input shape. The last one failed. So an <unknown> shape doesn't imply that all shapes are allowed. Now I have to hard-code a proper input size in my program, which is not so convenient. Do you have any idea why the shape information was lost in the freezing operation?
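One possible workaround for the hard-coding problem is to keep the expected shape next to the frozen model and validate inputs before feeding them. The sketch below is plain Python and only an illustration: shape_is_compatible and EXPECTED_INPUT are hypothetical names, and the assumption that only the channel count is fixed comes from the three experiments above, not from the model definition.

```python
def shape_is_compatible(shape, expected):
    """Return True if `shape` matches `expected`; None in `expected` means any size."""
    if len(shape) != len(expected):
        return False
    return all(e is None or s == e for s, e in zip(shape, expected))


# Assumption based on the experiments above: batch, height and width are free,
# but the network needs 3 input channels.
EXPECTED_INPUT = (None, None, None, 3)

candidates = [(90, 160, 160, 3), (90, 220, 280, 3), (90, 160, 160, 1)]
results = [shape_is_compatible(s, EXPECTED_INPUT) for s in candidates]
print(results)
```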

Lunrot commented 7 years ago

@mhaghighat I think so. @ugtony (90, 160, 160, 1) is probably a black and white image. If you change that image to (90,160,160,3), I think it will work correctly.

ugtony commented 7 years ago

Hi @Lunrot, I can understand why (90, 160, 160, 1) doesn't work and why (90, 220, 280, 3) does work for this network. But without shape information, it's not easy to use the frozen model for those who haven't seen inception_resnet_v1.py before.

Lunrot commented 7 years ago

@ugtony The CNN model is just a classifier. It is common to perform image preprocessing (image resizing, cropping, etc.) on CNN input values.

tengshaofeng commented 7 years ago

@mhaghighat Hi, can you show me the arguments you pass to python freeze_graph.py? And can you show me the code you use to load the frozen model?

tengshaofeng commented 7 years ago

@mhaghighat @Lunrot @ugtony @rtkaleta @scotthong Hi all, after converting model.ckpt to model.pb, how do I load the model from model.pb?

ugtony commented 7 years ago

@tengshaofeng A great tutorial is here.

mhaghighat commented 7 years ago

@tengshaofeng: I just submitted a pull request with the updated freeze_graph.py. This is how I call the function:

python freeze_graph.py ~/models/facenet/20170131-234652 ~/models/facenet/20170131-234652/facenet.pb

For loading the protobuf graph in Python, you can use:

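import tensorflow as tf  # TensorFlow 1.x API

def load_graph(frozen_graph_filename):
    # Read the serialized GraphDef from disk
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Import it into a fresh graph; name="" keeps the original node names
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")

    return graph

graph = load_graph('./facenet/facenet.pb')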

For loading it in C++, you can use:

tensorflow::GraphDef graphDef;
tensorflow::ReadBinaryProto(tensorflow::Env::Default(), "facenet.pb", &graphDef);

std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(tensorflow::SessionOptions()));
tensorflow::Status sessionCreateStatus = session->Create(graphDef);

davidsandberg commented 7 years ago

This has been fixed in #172.

tengshaofeng commented 7 years ago

@mhaghighat @ugtony thanks so much.

NicoCoallier commented 6 years ago

In my case, I had this error because I was converting all of my variables into constants. When I listed only the correct operations in output_node_names, the loading succeeded. Example: output_node_names = "Loss/predictions"
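The reason this helps is that freezing keeps only the nodes the requested outputs depend on, so training-only nodes such as global_step/Assign are dropped when they are not listed as outputs. The sketch below imitates that pruning on a toy graph in plain Python; it is an illustration under that assumption, and node_name, reachable_nodes and the toy node names are hypothetical.

```python
def node_name(input_ref):
    """Strip the control-dependency marker '^' and the ':N' output index."""
    return input_ref.lstrip("^").split(":")[0]


def reachable_nodes(inputs_by_node, output_names):
    """Return the set of node names the given outputs depend on."""
    keep, stack = set(), list(output_names)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(node_name(ref) for ref in inputs_by_node.get(name, []))
    return keep


# Toy graph: node name -> list of input references.
toy_graph = {
    "input": [],
    "weights": [],
    "embeddings": ["input:0", "weights:0"],
    "global_step": [],
    "global_step/Initializer": [],
    "global_step/Assign": ["global_step", "global_step/Initializer"],
}

kept = reachable_nodes(toy_graph, ["embeddings"])
print(sorted(kept))
```

With only "embeddings" requested, the global_step nodes are not part of the result.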

cvJie commented 6 years ago

I am trying to use Facenet through the TensorFlow C++ API (VS2015). It can load the graph, but running it fails with:

Not found: FeedInputs: unable to find feed output phase_train

The code looks like this:

tensorflow::Tensor input_tensor(DT_FLOAT, TensorShape({ 2, iHeight, iWidth, depth }));
auto input_tensor_mapped = input_tensor.tensor<float, 4>();
tensorflow::Tensor phase(DT_BOOL, TensorShape());
phase.scalar<bool>()() = FALSE;
input_tensor.shape();
std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
{ "input", input_tensor }, // it's OK
{" phase_train",phase } // it's bad

Could you give me a hand? Thanks.

mhaghighat commented 6 years ago

@cvJie: This works for me:

tensorflow::Tensor phaseTrain(tensorflow::DT_BOOL, tensorflow::TensorShape());
phaseTrain.scalar<bool>()() = false;
std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
    { "input", faceTensor },
    { "phase_train", phaseTrain }
};

arvidzt commented 5 years ago

with tf.Session() as sess:
    saved_model_dir = "saved_model_dir_signature"
    meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], '')
    for node in sess.graph_def.node:
      if node.op == 'RefEnter':
        node.op = 'Enter'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      if node.op == 'RefSwitch':
        node.op = 'Switch'
        for index in range(len(node.input)):
          if 'moving_' in node.input[index]:
            node.input[index] = node.input[index] + '/read'
      elif node.op == 'AssignSub':
        node.op = 'Sub'
        if 'use_locking' in node.attr: del node.attr['use_locking']
      elif node.op == 'AssignAdd':
        node.op = 'Add'
        if 'use_locking' in node.attr: del node.attr['use_locking']

How can I modify the graph_def in the session? If I do it this way, the model saved by sess still has RefSwitch instead of Switch. Can someone tell me how to modify the graph_def in sess? Thanks.
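A likely cause is that sess.graph_def builds a new GraphDef on every access, so the loop above edits a throwaway copy and the session's graph never changes. A sketch of one way around this: keep one copy in a variable, patch it with a helper, and then freeze or import that patched copy rather than saving from the session. The helper name patch_ref_ops is hypothetical; it is duck-typed so it can be exercised here on a small stand-in object without TensorFlow.

```python
from types import SimpleNamespace


def patch_ref_ops(graph_def):
    """Rewrite ref-typed ops in place; return the number of nodes changed."""
    renamed = {"RefSwitch": "Switch", "RefEnter": "Enter"}
    demoted = {"AssignSub": "Sub", "AssignAdd": "Add"}
    changed = 0
    for node in graph_def.node:
        if node.op in renamed:
            node.op = renamed[node.op]
            # Read the moving averages through their '/read' identity node.
            for i in range(len(node.input)):
                if "moving_" in node.input[i]:
                    node.input[i] = node.input[i] + "/read"
            changed += 1
        elif node.op in demoted:
            node.op = demoted[node.op]
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
            changed += 1
    return changed


# Stand-in for a GraphDef, used only to exercise the helper.
toy = SimpleNamespace(node=[
    SimpleNamespace(op="RefSwitch", input=["bn/moving_mean", "pred_id"], attr={}),
    SimpleNamespace(op="AssignSub", input=["bn/moving_mean", "delta"],
                    attr={"use_locking": True}),
])
n_changed = patch_ref_ops(toy)
print(n_changed, toy.node[0].op, toy.node[0].input[0])
```

With a real session this would be called as gd = sess.graph_def followed by patch_ref_ops(gd), and gd is then the object to pass on.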

Priyashbhugra commented 4 years ago


I am facing this issue

raise ValueError(str(e))
ValueError: Input 0 of node import/global_step/Assign was passed int32 from import/global_step:0 incompatible with expected int32_ref.

on the line below, while loading the frozen .pb file:

    with tf.Graph().as_default() as graph:

Here is my full code:

model_dir = '/home/priyash/avod/avod/data/outputs/pyramid_cars_with_aug_example/checkpoint_freeze/pyramid_cars_with_aug_example-00120000.pb'
log_dir = '/home/priyash/avod/avod/data/outputs/pyramid_cars_with_aug_example/checkpoint_freeze/logs/'

with tf.Session() as sess:
    model_filename = model_dir
    with gfile.FastGFile(model_filename, 'rb') as f:

    graph_def = tf.GraphDef()
    with tf.Graph().as_default() as graph:
    # g_in = tf.import_graph_def(graph_def)
    # print(g_in)

train_writer = tf.summary.FileWriter(log_dir)
train_writer.add_graph(sess.graph)

glennford49 commented 4 years ago

FusedBatchNorm/Switch:1 incompatible with expected half error. I'm trying to convert it to fp16 precision.

xiexie123 commented 4 years ago

FusedBatchNorm/Switch:1 incompatible with expected half error. I'm trying to convert it to fp16 precision.

Hi, I have the same error. Did you fix it?