google / model_search

Inputs to the model #43

Open tom-samsung opened 3 years ago

tom-samsung commented 3 years ago

I am having trouble understanding the inputs to the model. I ran my experiments on a dataset with 2496 columns stored in a CSV file. I provided label_index and record_defaults (as a list).

The parameters I set in single_trainer.SingleTrainer (following the README) were label_index, logits_dimension, record_defaults, filename, and spec.

After the experiments finished, I started looking at the graphs to understand how to use them in a pipeline built on Keras models (I wanted to wrap the selected graph in a Keras Lambda layer and use it for inference).

When I list the operations in the graph with the code below, I see the following:

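import tensorflow.compat.v1 as tf
from google.protobuf import text_format as pbtf
from tensorflow.core.framework import graph_pb2 as gpb

gdef = gpb.GraphDef()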
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/1/graph.pbtxt', 'r') as fh:
    graph_str = fh.read()
pbtf.Parse(graph_str, gdef)
tf.import_graph_def(gdef)
for op in tf.get_default_graph().get_operations():
    print(str(op.name))

import/record_defaults_0
import/record_defaults_1
import/record_defaults_2
import/record_defaults_3
import/record_defaults_4
...

up to 2496, as it should be. I also see:

import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims/dim
import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims
import/Phoenix/search_generator_0/Input/input_layer/1_1/Shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_1
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_2
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape/1
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape

for all 2496 inputs.

But I only see:

input_1 input_2 input_3 input_4 ... input_21

That is 21 inputs instead of 2496. Could you please help me understand this situation?
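
One way to check which nodes a graph expects to be fed is to group its nodes by op type and look at the `Placeholder` entries. The sketch below scans the text of a graph.pbtxt file with a regular expression instead of importing it into TensorFlow. It assumes the usual text-format layout, in which each `node { ... }` block starts with its `name` and `op` fields. The helper `nodes_by_op` and the sample text are hypothetical and only for illustration; whether the training graph.pbtxt in this issue contains any `Placeholder` nodes is not confirmed in this thread.

```python
import re
from collections import defaultdict

# Matches the start of each node block: node { name: "..." op: "..."
NODE_RE = re.compile(r'node\s*\{\s*name:\s*"([^"]+)"\s*op:\s*"([^"]+)"')


def nodes_by_op(pbtxt_text):
    """Group node names by op type (hypothetical helper, not part of model_search)."""
    groups = defaultdict(list)
    for name, op in NODE_RE.findall(pbtxt_text):
        groups[op].append(name)
    return dict(groups)


# Tiny made-up graph text standing in for the contents of graph.pbtxt.
sample = '''
node {
  name: "1"
  op: "Placeholder"
}
node {
  name: "record_defaults_0"
  op: "Const"
}
node {
  name: "2"
  op: "Placeholder"
}
'''

ops = nodes_by_op(sample)
print(ops.get("Placeholder", []))
```

To apply this to a real file, read its contents with `open(path).read()` and pass the resulting string to `nodes_by_op`. An empty `Placeholder` list would suggest that the graph reads its features from an input pipeline rather than from fed tensors.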

My final goal is to do something like this:

import tensorflow as tf
#import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
#tf.disable_eager_execution()
debug = False
if debug:
    tf.autograph.set_verbosity(3, True)
else:
    tf.autograph.set_verbosity(0, True)
from tensorflow.core.framework.graph_pb2 import GraphDef
from google.protobuf import text_format as pbtf
import numpy as np
@tf.autograph.experimental.do_not_convert
@tf.function
def my_model(x):
    #tf.get_default_graph()
    #tensor_input = ['inputs_'+str(i+1) for i in range(2496)]
    tensor_input = [str(i+1) for i in range(2496)]
    tensor_input_sample = [x[:,i] for i in range(2496)]
    dict_input = {tensor_input[i]: tensor_input_sample[i] for i in range(len(tensor_input))}
    input_map_ = dict_input
    y, z = tf.graph_util.import_graph_def(
        gd, name='', input_map=input_map_, return_elements=['Phoenix/Trainer/ArgMax:0', 'Phoenix/Trainer/Softmax:0'])
    return [y, z]
x = tf.keras.Input(shape=(2496,))  # shape given as a tuple
print(x)
gd = GraphDef()
print("open tf graph")
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/graph.pbtxt', 'rb') as f:
    print("read file")
    graph_str = f.read()
print("parse")
pbtf.Parse(graph_str, gd)
print("import graph")
tf.import_graph_def(gd)
y, z = tf.keras.layers.Lambda(my_model)(x)
model = tf.keras.Model(x, [y, z])
model.summary()
y_out, z_out = model.predict(np.ones((5, 2496), dtype=np.float32))
print(y_out.shape, z_out.shape)
print(y_out, z_out)

Unfortunately, I do not understand the inputs at this point. Thank you for any help!

Xiaoping777 commented 3 years ago

Hi Tom, I just downloaded and re-installed the latest version. It now generates a new folder with a .pb file for each model, which I think might make things easier.

Xiaoping777 commented 3 years ago

I finally worked it out. Here is the code for inference on a single example, @tom-samsung:


import numpy as np
import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
tf.disable_eager_execution()


def extract_tensors(signature_def, graph):
    output = dict()

    for key in signature_def:
        value = signature_def[key]

        if isinstance(value, tf.TensorInfo):
            output[key] = graph.get_tensor_by_name(value.name)

    return output

def extract_input_name(signature_def, graph):
    input_tensors = extract_tensors(signature_def['serving_default'].inputs, graph)
    # Collect the names of all input tensors in the serving signature.

    name_list = []
    for key in list(input_tensors.keys()):
        name_list.append(input_tensors.get(key).name)

    return name_list

def extract_output_name(signature_def, graph):
    output_tensors = extract_tensors(signature_def['serving_default'].outputs, graph)
    # Collect the names of all output tensors in the serving signature.

    name_list = []
    for key in list(output_tensors.keys()):
        name_list.append(output_tensors.get(key).name)

    return name_list

def ass_input_dict(tensor_input_sample): 
    dict_input = {str(i+1)+":0" : [tensor_input_sample[i]] for i in range(len(tensor_input_sample))}
    return dict_input

checkpoint_path = "/tmp/run/tuner-1/160/saved_model/assets/"

with tf.Session(graph=tf.Graph()) as sess:
    serve = tf.saved_model.load(sess, tags=["serve"], export_dir=checkpoint_path)
    # print(type(serve)) gives <class 'tensorflow.core.protobuf.meta_graph_pb2.MetaGraphDef'>

    #input_tensor_name = extract_input_name(serve.signature_def, sess.graph)
    output_tensor_name = extract_output_name(serve.signature_def, sess.graph)
    # sen_vec is a tensor holding one example's features; it is defined elsewhere (not shown here).
    input_dict = ass_input_dict(sen_vec.detach().numpy())

    prediction = sess.run(output_tensor_name, feed_dict=input_dict)

print(prediction)
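
The feed dictionary above covers a single example. A minimal sketch of the same idea for several rows at once follows. It assumes the input tensors are named `1:0`, `2:0`, and so on, as in the code above, and that each of them accepts a one-dimensional batch of values; neither assumption is verified here. `batch_feed_dict` is a hypothetical helper name.

```python
def batch_feed_dict(rows):
    """Map column i of `rows` to the tensor name "<i+1>:0".

    `rows` is a list of equal-length feature rows (batch x features).
    Hypothetical helper: the tensor naming follows the thread's code.
    """
    if not rows:
        raise ValueError("rows must contain at least one example")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of features")
    # zip(*rows) transposes batch x features into features x batch.
    columns = zip(*rows)
    return {f"{i + 1}:0": list(column) for i, column in enumerate(columns)}


feed = batch_feed_dict([[0.1, 0.2, 0.3], [1.1, 1.2, 1.3]])
print(feed)
```

The resulting dictionary could be passed as `feed_dict` to `sess.run` in place of `input_dict`, provided the assumptions about tensor names and shapes hold for the exported model.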

tom-samsung commented 3 years ago

Hey @Xiaoping777, thanks for the code. Yes, I noticed that with the new version of the repo and saved_models things are much easier now. Unfortunately, I need to re-run everything, but that's OK. I'll try to wrap this up in a Keras Lambda layer, so that people with Keras pipelines have this additional option, and post it somewhere. Maybe the authors of this repo will update the README with all this information to make people's lives easier before closing these issues: https://github.com/google/model_search/issues/43 https://github.com/google/model_search/issues/39 Thanks again and happy model searching!