google / model_search

Apache License 2.0

Predictions/Inference using the searched model #39

Open virajone opened 3 years ago

virajone commented 3 years ago

I tried to utilize this framework to find the best possible model architecture for a multi-class classification problem. As a result, the respective files were created, e.g. under /tmp/run_example/tuner-1/1/. The files include a checkpoint file, graph.pbtxt, and replay_config.pbtxt, along with .index, .meta and .data-000000-of-00001 files.

Given that the framework does not store the searched model in a .h5 or SavedModel format, I wanted to know how I could utilize the produced files to load/restore the searched model so that I can use it to make predictions/inference.

I would really appreciate any help/suggestions in the right direction! Cheers.

virajone commented 3 years ago

It would also be really helpful if there were some way to convert it to, or save and load it as, a TensorFlow 2 Keras model!

praneethkv commented 3 years ago

I faced the same issue. There is no prediction pipeline for new data we might have. I'm guessing there will be an update that uses the graph file to recreate the model with the weights stored in the checkpoint file.

jayCool commented 3 years ago

A prediction pipeline is badly needed.

hanna-maz commented 3 years ago

The new commit should enable "export_saved_model" by default (the flag is switched to True), and it calls export after model evaluation based on that flag. This will export every model in the search.

Note that replay_config.pbtxt is a config that, when supplied to our binary/code, shifts it from performing a search to creating a binary that trains the specific architecture/ensemble specified in the config. This way, models found in the search are re-trainable in case new data comes along.

Xiaoping777 commented 3 years ago

I tried to utilize this framework to find the best possible model architecture for a multi-class classification problem. As a result, the respective files were created, e.g. under /tmp/run_example/tuner-1/1/. The files include a checkpoint file, graph.pbtxt, and replay_config.pbtxt, along with .index, .meta and .data-000000-of-00001 files.

Given that the framework does not store the searched model in a .h5 or SavedModel format, I wanted to know how I could utilize the produced files to load/restore the searched model so that I can use it to make predictions/inference.

I would really appreciate any help/suggestions in the right direction! Cheers.

Similar issue here.

I did some searching; it looks like, whichever way you go, you have to know the model structure (or at least the output layer's name) based on the output files in /tmp/run_example/.

This is probably the closest to this situation, but you still have to know the output layer's name, etc.: https://leimao.github.io/blog/Save-Load-Inference-From-TF-Frozen-Graph/

jayCool commented 3 years ago

The new push enables us to find the graph architecture.

praneethkv commented 3 years ago

The new commit should enable "export_saved_model" by default (the flag is switched to True), and it calls export after model evaluation based on that flag. This will export every model in the search.

Note that replay_config.pbtxt is a config that, when supplied to our binary/code, shifts it from performing a search to creating a binary that trains the specific architecture/ensemble specified in the config. This way, models found in the search are re-trainable in case new data comes along.

Does the new commit also export the best model/ensemble for making predictions on new data later? This leads to another question about ensembling. As per the config and ensemble spec file, an ensemble is tried for every 5 models by default, and this can be configured. However, how do we get the ensemble of the best N trained models as the final ensemble to use for predictions?

virajone commented 3 years ago

The new commit should enable "export_saved_model" by default (the flag is switched to True), and it calls export after model evaluation based on that flag. This will export every model in the search.

Note that replay_config.pbtxt is a config that, when supplied to our binary/code, shifts it from performing a search to creating a binary that trains the specific architecture/ensemble specified in the config. This way, models found in the search are re-trainable in case new data comes along.

Thanks a lot for your response, Hanna. Using the replay_config.pbtxt to train that specific architecture on new data sounds interesting; however, I did not completely understand what you meant by supplying the config file to the binary/code. Could you elaborate a bit on how I can do that? Do you mean I should pass it as the spec to the SingleTrainer/Provider instead of the default DNN?

virajone commented 3 years ago

Further, I tried to load one of the saved models using tf.keras.models.load_model(). What this returns is an AutoTrackable object with no predict signature or summary method. Are there other means to restore these models and use them for predictions, etc.?

(screenshot of the loaded AutoTrackable object omitted)

hanna-maz commented 3 years ago

The exported models are not Keras models; therefore, you shouldn't load them as Keras saved models.

Yes, the replay_config can be supplied as "spec" instead of the default DNN, and it will make the platform retrain that specific model.

Finally, we export every model in the run - not just the best-performing one. The user is able to pick and choose which model they want. There are multiple reasons for this:

  1. The user knows best which metric is most important for their model - i.e., they can decide to take the model with the lowest loss, the highest accuracy/AUC, etc.
  2. Better metrics don't automatically imply a better model - some teams want a high-performing model with the smallest number of parameters. Exploring many model architectures provides a variety of models to choose from.

Exporting every model enables the user to choose which model is best for them. Each model can be deployed as is (via the exported model in its directory) and/or retrained with more data with the replay_config supplied as spec.
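
For illustration, here is a minimal sketch of supplying the replay config as the spec, assuming the SingleTrainer / csv_data.Provider API shown in the repository README; the data arguments and paths are placeholders from the README example, and whether number_models=1 is sufficient for a replayed ensemble is an assumption:

from model_search import single_trainer
from model_search.data import csv_data

# Pass the replay_config.pbtxt produced by an earlier search as the spec
# instead of the default DNN spec; the platform then retrains that specific
# architecture/ensemble rather than running a new search.
trainer = single_trainer.SingleTrainer(
    data=csv_data.Provider(
        label_index=0,
        logits_dimension=2,
        record_defaults=[0, 0, 0, 0],
        filename="model_search/data/testdata/csv_random_data.csv"),
    spec="/tmp/run_example/tuner-1/1/replay_config.pbtxt")

trainer.try_models(
    number_models=1,
    train_steps=1000,
    eval_steps=100,
    root_dir="/tmp/run_replay",
    batch_size=32,
    experiment_name="replay_example",
    experiment_owner="model_search_user")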

tom-samsung commented 3 years ago

A prediction pipeline is badly needed.

+1

tom-samsung commented 3 years ago

I'm having a hard time finding the right way to make predictions on a new dataset:

import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
tf.disable_eager_execution()

from tensorflow.python.client import session
from tensorflow.python.framework import importer
from tensorflow.python.framework import ops
from tensorflow.python.summary import summary
from tensorflow.python.tools import saved_model_utils
from tensorflow.core.framework import graph_pb2 as gpb
from google.protobuf import text_format as pbtf

sess = tf.Session()
gdef = gpb.GraphDef()
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/18/graph.pbtxt', 'r') as fh:
    graph_str = fh.read()
pbtf.Parse(graph_str, gdef)
tf.import_graph_def(gdef)

for op in tf.get_default_graph().get_operations():
    print(str(op.name))

#for n in tf.get_default_graph().as_graph_def().node:
#    print(n.name)

tensor_output = sess.graph.get_tensor_by_name('import/Phoenix/search_generator_0/last_dense_248422238325/logits:0')
tensor_input = sess.graph.get_tensor_by_name('import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims/dim:0')

import pickle
sample, test_label = pickle.load(open("/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/YS_s4_shap.p", "rb"))
print(sample.shape)

predictions = sess.run(tensor_output, {tensor_input:sample})

print(predictions)

Any suggestions would be highly appreciated, thanks! I guess Keras users could wrap the whole model in a Lambda layer. Anyway, an example of making predictions on a new dataset would be a good addition to this repo.

hanna-maz commented 3 years ago

The exported models can be queried with the standard saved_model_cli tool: https://www.tensorflow.org/guide/saved_model#run_command

In the dummy example in the readme file, when checking an exported model:

$ saved_model_cli show --dir /tmp/check_export/phoenix-tuner-114861/1/saved_model/1616093281/ --tag_set=serve --signature_def serving_default

you get the signature:

The given SavedModel SignatureDef contains the following input(s):
  inputs['1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: 1:0
  inputs['2'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: 2:0
  inputs['3'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: 3:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['log_probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: Phoenix/Trainer/LogSoftmax:0
  outputs['logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: Phoenix/search_generator_0/last_dense_80/logits:0
  outputs['predictions'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: Phoenix/Trainer/ArgMax:0
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: Phoenix/Trainer/Softmax:0
Method name is: tensorflow/serving/predict

Notice that the model expects three input keys: "1", "2" and "3". This is because in the readme.md example we have 4 columns and column index "0" is the label. The model has 4 outputs: logits, probabilities, log_probabilities and predictions.

Next, you can ask for predictions:

saved_model_cli run --dir /tmp/check_export/phoenix-tuner-114861/1/saved_model/1616093281/ --input_exprs='1=[1.0];2=[0.0];3=[0.0]' --tag_set 'serve' --signature_def 'serving_default'

The above command's output is:

Result for output key log_probabilities:
[[-0.29187196 -1.3738289 ]]
Result for output key logits:
[[ 0.7339718 -0.34798515]]
Result for output key predictions:
[0]
Result for output key probabilities:
[[0.74686414 0.25313586]]

Hope this helps. I will add some of the above code snippets to the readme.md.
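
For reference, a rough Python equivalent of the saved_model_cli run call above; a sketch assuming TF2 eager mode and the same export directory, with the input keys "1", "2" and "3" taken from the signature shown above:

import tensorflow as tf

export_dir = "/tmp/check_export/phoenix-tuner-114861/1/saved_model/1616093281/"
loaded = tf.saved_model.load(export_dir)
infer = loaded.signatures["serving_default"]

# Feed the three float features under the input keys from the signature.
outputs = infer(**{"1": tf.constant([1.0]),
                   "2": tf.constant([0.0]),
                   "3": tf.constant([0.0])})

print(outputs["predictions"].numpy())    # class index, e.g. [0]
print(outputs["probabilities"].numpy())  # per-class probabilities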

tom-samsung commented 3 years ago

Thanks @hanna-maz. Unfortunately I get a strange error when I try to use the saved_model_cli show or run command:

OSError: Cannot parse file b'/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/saved_model.pbtxt': 1:1 : Message type "tensorflow.SavedModel" has no field named "node"..

Have you encountered something like that, and can you point me to the fix?

Xiaoping777 commented 3 years ago

I'm having a hard time finding the right way to make predictions on a new dataset:

import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
tf.disable_eager_execution()

from tensorflow.python.client import session
from tensorflow.python.framework import importer
from tensorflow.python.framework import ops
from tensorflow.python.summary import summary
from tensorflow.python.tools import saved_model_utils
from tensorflow.core.framework import graph_pb2 as gpb
from google.protobuf import text_format as pbtf

sess = tf.Session()
gdef = gpb.GraphDef()
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/18/graph.pbtxt', 'r') as fh:
    graph_str = fh.read()
pbtf.Parse(graph_str, gdef)
tf.import_graph_def(gdef)

for op in tf.get_default_graph().get_operations():
    print(str(op.name))

#for n in tf.get_default_graph().as_graph_def().node:
#    print(n.name)

tensor_output = sess.graph.get_tensor_by_name('import/Phoenix/search_generator_0/last_dense_248422238325/logits:0')
tensor_input = sess.graph.get_tensor_by_name('import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims/dim:0')

import pickle
sample, test_label = pickle.load(open("/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/YS_s4_shap.p", "rb"))
print(sample.shape)

predictions = sess.run(tensor_output, {tensor_input:sample})

print(predictions)

Any suggestions would be highly appreciated, thanks! I guess Keras users could wrap the whole model in a Lambda layer. Anyway, an example of making predictions on a new dataset would be a good addition to this repo.

Hi Tom,

I followed your instructions; it looks fine for me, but I got an issue:

ValueError: Cannot feed value of shape (10, 367) for Tensor 'import/Phoenix/prior_generator_0/Input/input_layer/1_1/ExpandDims/dim:0', which has shape '()'

Then I realized that 1_1 is just one dimension of the input dataset, so I changed the input to tensor_input = sess.graph.get_tensor_by_name('import/filenames:0') and ran the model with a CSV file,

but I got another issue: "FailedPreconditionError: Error while reading resource variable Phoenix/prior_generator_0/last_dense_8115222322/dense/bias from Container: localhost. This could mean that the variable was uninitialized."

It looks like your code only loads the graph but does not load the weights from the checkpoint. Any ideas or suggestions?

tom-samsung commented 3 years ago

Hi @Xiaoping777, that's an interesting comment. So I tried something like this:

import numpy as np
import pandas as pd
import os
#import tensorflow as tf
import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
tf.disable_eager_execution()
from tensorflow.python.client import session
from tensorflow.python.framework import importer
from tensorflow.python.framework import ops
from tensorflow.python.summary import summary
from tensorflow.python.tools import saved_model_utils
from tensorflow.core.framework import graph_pb2 as gpb
from google.protobuf import text_format as pbtf

sess = tf.Session('', tf.Graph())
with sess.graph.as_default():
    # Read meta graph and checkpoint to restore tf session
    saver = tf.train.import_meta_graph("/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/model.ckpt-7000000.meta")    
    saver.restore(sess,tf.train.latest_checkpoint('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/'))
    tensor_output = sess.graph.get_tensor_by_name('Phoenix/Trainer/ArgMax:0')
    tensor_input = sess.graph.get_tensor_by_name('filenames:0')
    sample = "/Users/tomasz.p/test.csv"
    sess.run(tf.global_variables_initializer())
    predictions = sess.run(tensor_output, {tensor_input:sample})
    print(predictions)

and it crashed with:

FailedPreconditionError: GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
     [[{{node IteratorGetNext}}]]
During handling of the above exception, another exception occurred:
...

It looks like the iterator for the CSV dataset is not initialized, but I could not figure out where it is in the graph and how to initialize it. Please let me know if you have some ideas on how to move this forward. Thanks.
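
Two notes on the snippet above, hedged since neither has been verified against this particular graph. First, running tf.global_variables_initializer() after saver.restore() overwrites the restored weights with fresh initial values, so that line is better removed once restoring works. Second, TF1 graphs built with tf.data usually contain explicit iterator/table initializer ops; below is a sketch of finding and running them just before the predictions = sess.run(...) line (the op types are real, but whether this graph contains them and needs the filename fed in is an assumption):

    # Find iterator/table initializer ops by type and run them before
    # requesting predictions; op names vary from graph to graph.
    init_ops = [op for op in sess.graph.get_operations()
                if op.type in ("MakeIterator", "InitializeTableV2",
                               "InitializeTableFromTextFileV2")]
    for op in init_ops:
        sess.run(op, {tensor_input: sample})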

tom-samsung commented 3 years ago

Btw @Xiaoping777, I also started this issue: https://github.com/google/model_search/issues/43. My final goal is to wrap the model_search output in a Keras Lambda layer to have a very nice option for predictions using Keras. I think a lot of people would like to see that type of code and be able to directly use their Keras pipelines with models from this repo.
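
A rough sketch of that Lambda-layer idea, assuming the exported SavedModel and serving signature from the readme example earlier in this thread; untested, and the export path, feature keys, and the choice of the "logits" head are placeholders to adjust:

import tensorflow as tf

loaded = tf.saved_model.load(
    "/tmp/check_export/phoenix-tuner-114861/1/saved_model/1616093281/")
serve = loaded.signatures["serving_default"]

feature_keys = ["1", "2", "3"]
inputs = [tf.keras.Input(shape=(), dtype=tf.float32, name=k) for k in feature_keys]

# Call the exported serving signature inside a Lambda layer and expose the
# "logits" head so the model plugs into a regular Keras pipeline.
logits = tf.keras.layers.Lambda(
    lambda ts: serve(**dict(zip(feature_keys, ts)))["logits"])(inputs)
keras_model = tf.keras.Model(inputs=inputs, outputs=logits)

print(keras_model.predict([tf.constant([1.0]), tf.constant([0.0]), tf.constant([0.0])]))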

Xiaoping777 commented 3 years ago

Btw @Xiaoping777, I also started this issue: #43. My final goal is to wrap the model_search output in a Keras Lambda layer to have a very nice option for predictions using Keras. I think a lot of people would like to see that type of code and be able to directly use their Keras pipelines with models from this repo.

Thanks for your code. I may try to use the individual dimensions as inputs tomorrow, which was demonstrated by @hanna-maz in the reply about command-line inference. Actually, I checked the trained model's performance in TensorBoard, and it is pretty good.

tom-samsung commented 3 years ago

Btw @Xiaoping777, I also started this issue: #43. My final goal is to wrap the model_search output in a Keras Lambda layer to have a very nice option for predictions using Keras. I think a lot of people would like to see that type of code and be able to directly use their Keras pipelines with models from this repo.

Thanks for your code. I may try to use the individual dimensions as inputs tomorrow, which was demonstrated by @hanna-maz in the reply about command-line inference. Actually, I checked the trained model's performance in TensorBoard, and it is pretty good.

I have a problem with that approach: model_search saves only a pbtxt, and there is no saved_model directory with a saved_model.pb (as is typical for other saved models), so renaming graph.pbtxt to saved_model.pbtxt and checking it via saved_model_cli show --dir doesn't work. Maybe I don't understand something and some additional conversion is needed. Anyway, this approach doesn't work for me, and in addition I don't see the right number of inputs. Somehow, in my case I see only 21 inputs instead of 2496.

Xiaoping777 commented 3 years ago

In my case I can see all the inputs, from import/Phoenix/prior_generator_0/Input/input_layer/1_1 through import/Phoenix/prior_generator_2/Input/input_layer/367_1. I had to dump the graph elements into a CSV file to find them.

However, it is not an elegant method.
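
A small sketch of a less manual way to list them, assuming the session/graph from the quoted snippet above; the name filter is a guess based on the op names posted in this thread:

# List only the per-column input ops instead of dumping every op to a CSV.
input_ops = sorted(op.name for op in sess.graph.get_operations()
                   if "Input/input_layer" in op.name and op.name.endswith("ExpandDims"))
print(len(input_ops))
for name in input_ops:
    print(name)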

tom-samsung commented 3 years ago

In my case I can see all the inputs, from import/Phoenix/prior_generator_0/Input/input_layer/1_1 through import/Phoenix/prior_generator_2/Input/input_layer/367_1. I had to dump the graph elements into a CSV file to find them.

However, it is not an elegant method.

So I see:

import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims/dim
import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims
import/Phoenix/search_generator_0/Input/input_layer/1_1/Shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_1
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_2
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape/1
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape

for each of the inputs, but I am not sure how to feed them. There are also input_1, input_2, ... input_21; it stops at 21 in my case. So it's extremely confusing.

wooooyo commented 3 years ago

The exported models are not Keras models; therefore, you shouldn't load them as Keras saved models.

Yes, the replay_config can be supplied as "spec" instead of the default DNN, and it will make the platform retrain that specific model.

Finally, we export every model in the run - not just the best-performing one. The user is able to pick and choose which model they want. There are multiple reasons for this:

  1. The user knows best which metric is most important for their model - i.e., they can decide to take the model with the lowest loss, the highest accuracy/AUC, etc.
  2. Better metrics don't automatically imply a better model - some teams want a high-performing model with the smallest number of parameters. Exploring many model architectures provides a variety of models to choose from.

Exporting every model enables the user to choose which model is best for them. Each model can be deployed as is (via the exported model in its directory) and/or retrained with more data with the replay_config supplied as spec.

@hanna-maz Could you explain how to retrain a model with more data with the replay_config supplied as spec? I am puzzled by this.

maartenjv commented 2 years ago

Hi,

I was running into the same problem and found this way to get inference/prediction from the saved model in Python for the case of a CSV file (tabular data). In this case the CSV is a simple n-by-5 table where the last column gives the labels. The model search was performed as in the provided example/tutorial with the training CSV data; the code below loads the test data. The names and dtypes of the data in the input dict can be obtained from the "saved_model_cli show" call as shown above. Hope this helps. I will start looking into how to do this for image and time-series data.

import numpy as np
import tensorflow as tf

loaded = tf.saved_model.load(r'C:/tmp/run_lin_xor_varname_train/tuner-1/18/saved_model/1632837857')
m = loaded.signatures['serving_default']

data = np.genfromtxt("./csv_lin_xor_varname_test.csv", delimiter=',', skip_header=1, dtype=np.float32)
input = {'0':tf.constant(data[:,0]), '1':tf.constant(data[:,1]), '2':tf.constant(data[:,2]), '3':tf.constant(data[:,3])}

output = m(**(input))
predictions = output['predictions'].numpy()

label = data[:,-1]
acc = np.sum(label==predictions,axis=0)/label.shape[0]
print(acc)

Riyank7 commented 2 years ago

@maartenjv Did you find a way to run inference/prediction for the saved model in Python for the case of an image file?