export saved model -- this is my answer on SO. Hope that helps. You also need to define your signature.
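For reference, a minimal sketch of what defining a signature looks like in TF 1.x (the tensor names here are illustrative stand-ins, not the real NMT graph tensors):

import tensorflow as tf

# hypothetical input/output tensors; in practice these must be tensors of the restored model graph
input_tensor = tf.placeholder(tf.string, shape=[None], name="seq_input")
output_tensor = tf.identity(input_tensor, name="seq_output")  # stand-in for the model's decode output

signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={"seq_input": input_tensor},
    outputs={"seq_output": output_tensor})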
@aforwardz thanks for your reply. Here is my code:
export_path = os.path.join(
    tf.compat.as_bytes(self.export_base_path),
    tf.compat.as_bytes(str(self.version_number))
)
# with tf.device('/gpu:0'):
sess = tf.Session()
saver = tf.train.import_meta_graph(os.path.join(self.model_dir, "translate.ckpt-21000.meta"))
latest_ckpt = tf.train.latest_checkpoint(self.model_dir)
saver.restore(sess, latest_ckpt)
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
# I am not sure whether this way to create PREDICT_INPUTS and PREDICT_OUTPUTS is right or not.
feature_configs = {
    'x': tf.VarLenFeature(shape=[], dtype=tf.string),
    'y': tf.VarLenFeature(shape=[], dtype=tf.string)
}
serialized_example = tf.placeholder(tf.string, name="tf_example")
tf_example = tf.parse_example(serialized_example, feature_configs)
x = tf.identity(tf_example['x'], name='x')
y = tf.identity(tf_example['y'], name='y')
predict_input = tf.saved_model.utils.build_tensor_info(x)
predict_output = tf.saved_model.utils.build_tensor_info(y)
predict_signature_def_map = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={
        tf.saved_model.signature_constants.PREDICT_INPUTS: predict_input
    },
    outputs={
        tf.saved_model.signature_constants.PREDICT_OUTPUTS: predict_output
    }
)
legacy_init_op = tf.group(tf.tables_initializer(), name="legacy_init_op")
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        "predict_signature_map": predict_signature_def_map
    },
    legacy_init_op=legacy_init_op,
    assets_collection=None
)
builder.save()
But an error occurs:
2018-01-05 17:20:31.485773: I C:\tf_jenkins\home\workspace\tf-nightly-windows\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1154] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 950, pci bus id: 0000:01:00.0, compute capability: 5.2)
Traceback (most recent call last):
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
    return fn(*args)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1293, in _run_fn
    self._extend_graph()
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1354, in _extend_graph
    self._session, graph_def.SerializeToString(), status)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/ToInt32': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
Assign: GPU CPU
Const: GPU CPU
StridedSlice: GPU CPU
TensorArrayScatterV3: GPU CPU
Cast: GPU CPU
Identity: GPU CPU
StackV2: GPU CPU
Sub: GPU CPU
Enter: GPU CPU
VariableV2: GPU CPU
RandomUniform: GPU CPU
ScatterSub: GPU CPU
Neg: GPU CPU
Mul: GPU CPU
Add: GPU CPU
L2Loss: CPU
Size: GPU CPU
TensorArrayV3: GPU CPU
ExpandDims: GPU CPU
Reshape: GPU CPU
ConcatV2: GPU CPU
TensorArrayReadV3: GPU CPU
Gather: GPU CPU
StackPopV2: GPU CPU
RealDiv: GPU CPU
BroadcastGradientArgs: GPU CPU
FloorMod: GPU CPU
ShapeN: GPU CPU
ConcatOffset: GPU CPU
StackPushV2: GPU CPU
TensorArrayGradV3: GPU CPU
TensorArrayGatherV3: GPU CPU
Shape: GPU CPU
Floor: GPU CPU
MatMul: GPU CPU
Slice: GPU CPU
Sum: GPU CPU
TensorArrayWriteV3: GPU CPU
[[Node: gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/ToInt32 = Cast[DstT=DT_INT32, SrcT=DT_INT64, _class=["loc:@embeddings/encoder/embedding_encoder"]](gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/Shape)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "E:/PyCharmProjects/GNMT/nmt-demo/export/exporter.py", line 106, in <module>
    exporter.export()
  File "E:/PyCharmProjects/GNMT/nmt-demo/export/exporter.py", line 26, in export
    saver.restore(sess, latest_ckpt)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1683, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
    run_metadata_ptr)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
    options, run_metadata)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/ToInt32': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/device:GPU:0'
[Colocation Debug Info and node listing identical to the first occurrence above; omitted.]
Caused by op 'gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/ToInt32', defined at:
  File "E:/PyCharmProjects/GNMT/nmt-demo/export/exporter.py", line 106, in <module>
    exporter.export()
  File "E:/PyCharmProjects/GNMT/nmt-demo/export/exporter.py", line 24, in export
    saver = tf.train.import_meta_graph(os.path.join(self.model_dir, "translate.ckpt-21000.meta"))
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\training\saver.py", line 1835, in import_meta_graph
    **kwargs)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\framework\meta_graph.py", line 660, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\util\deprecation.py", line 316, in new_func
    return func(*args, **kwargs)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\framework\importer.py", line 349, in import_graph_def
    op_def=op_def)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 3076, in create_op
    op_def=op_def)
  File "D:\ProgramFiles\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1561, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'gradients/dynamic_seq2seq/encoder/embedding_lookup_grad/ToInt32': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/device:GPU:0'
[Colocation Debug Info and node listing identical to the first occurrence above; omitted.]
Process finished with exit code 1
My trained model is an NMT model used to correct addresses. For example: "土海市 浦东新区 张东路 1387 号" --(model)--> "上海市 浦东新区 张东路 1387 号"
I think the way I create the signature_def_map is wrong but I have no idea how to correct it. Do you have any ideas?
I exported the model finally!
Here is my code:
if not self.model_dir:
    raise ValueError("Please specify a model dir.")
export_path = os.path.join(
    tf.compat.as_bytes(self.export_base_path),
    tf.compat.as_bytes(str(self.version_number))
)
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(config=config)
saver = tf.train.import_meta_graph(os.path.join(self.model_dir, "translate.ckpt-21000.meta"))
latest_ckpt = tf.train.latest_checkpoint(self.model_dir)
saver.restore(sess, latest_ckpt)
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
feature_configs = {
    'x': tf.FixedLenFeature(shape=[], dtype=tf.string),
    'y': tf.FixedLenFeature(shape=[], dtype=tf.string)
}
serialized_example = tf.placeholder(tf.string, name="tf_example")
tf_example = tf.parse_example(serialized_example, feature_configs)
x = tf.identity(tf_example['x'], name='x')
y = tf.identity(tf_example['y'], name='y')
predict_input = x
predict_output = y
predict_signature_def_map = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={
        tf.saved_model.signature_constants.PREDICT_INPUTS: predict_input
    },
    outputs={
        tf.saved_model.signature_constants.PREDICT_OUTPUTS: predict_output
    }
)
legacy_init_op = tf.group(tf.tables_initializer(), name="legacy_init_op")
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        "predict_signature_map": predict_signature_def_map
    },
    legacy_init_op=legacy_init_op,
    assets_collection=None
)
builder.save()
And here is my export directory (without assets_collection):
|----1
    |----saved_model.pb
    |----variables
        |----variables.data-00000-of-00002
        |----variables.data-00001-of-00002
        |----variables.index
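With this layout, the model can be served by pointing the model server at the export base path; the paths and model name below are assumed for illustration: tensorflow_model_server --port=9000 --model_name=address --model_base_path=/path/to/export_base. The server loads the highest version number it finds, here 1.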
Hi @luozhouyang, I tried your code for exporting and it works fine. Thanks for posting it. Do you also have some code for the client side? (Sorry not sure if I should put this in a new issue, however I think it would be very helpful for people reading this issue to see the example code for both the server and client side for NMT in TensorFlow Serving.)
I am trying to make a call to the gRPC server (adapting the TF Serving example code to TensorFlow NMT). This is my code:
x = ["this is the text to translate"]
request = predict_pb2.PredictRequest()
request.model_spec.name = 'myModelName'
tp = tf.contrib.util.make_tensor_proto(x)
request.inputs['inputs'].CopyFrom(tp)
# 30 secs timeout because it takes a long time to initialise
result = stub.Predict(request, 30.0)
I also tried
x = [["this is the text to translate"]]
and adding
request.inputs['tf_example'].CopyFrom(tp)
I get errors telling me the input is in the wrong format:
AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input size does not match signature")
or I get this error:
AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="You must feed a value for placeholder tensor 'tf_example' with dtype string
[[Node: tf_example = Placeholder[_output_shapes=[<unknown>], dtype=DT_STRING, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]")
Do you have any ideas? Thanks!
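A useful sanity check for mismatches like these (assuming a TensorFlow 1.4+ installation, which ships the tool) is to inspect what the export actually contains:

saved_model_cli show --dir /path/to/export/1 --all

This prints the signature names and the dtype/shape of every input and output, which can then be compared against the keys and tensor shapes the client sends.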
@woodthom2 I am facing the same problem as you. I exported the model, but it fails when calling serving from the client. I think the predict_input and predict_output tensors are not correct. From mnist_saved_model.py we can see that the predict_input tensor should be the input of the neural network and the predict_output tensor should be the output of the network, so my code is obviously wrong. That's the problem. I haven't solved it yet. If you work it out, please let me know.
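In other words, the signature should point at tensors that are actually wired into the restored NMT graph, not at freshly created tf.parse_example tensors. A rough sketch of the idea (the tensor names below are hypothetical; the real ones have to be found by inspecting the graph):

# after saver.restore(sess, latest_ckpt)
graph = sess.graph
# hypothetical names -- list graph.get_operations() to find the real ones
real_input = graph.get_tensor_by_name("src_placeholder:0")
real_output = graph.get_tensor_by_name("decoder_output:0")
signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={"seq_input": real_input},
    outputs={"seq_output": real_output})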
@samithaj Thanks for your reply. I read the source code, and I think it makes things more complex: it involves new concepts like registry and problem. I believe things can be done much more easily than that. Do you have any other ideas? Thanks anyway.
@luozhouyang Can you provide the tensorflow version?
I get this error:
TypeError: Expected binary or unicode string, got <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7f282d830b00>
and it looks like that is on the VarLenFeature for the x variable:
TypeError: Failed to convert object of type <class 'tensorflow.python.framework.sparse_tensor.SparseTensor'> to Tensor. Contents: SparseTensor(indices=Tensor("ParseExample_4/ParseExample:0", shape=(?, 2), dtype=int64), values=Tensor("ParseExample_4/ParseExample:2", shape=(?,), dtype=string), dense_shape=Tensor("ParseExample_4/ParseExample:4", shape=(2,), dtype=int64)). Consider casting elements to a supported type.
Also, when you exported, did you use GPU or CPU for the model file?
In case anybody is having the same problem: I didn't get NMT to work together with Serving directly. The Serving examples all use simpler models such as Inception, which have a clearly defined input and output placeholder, whereas NMT uses the newer Datasets API and it's not clear what the equivalents would be.
However, I did use a workaround to get NMT to work on a server as a REST API. I took the example code for NMT, which reads a text file and rewrites it to another text file, and refactored it to receive input from a REST API and return the response via REST. So there is no use of gRPC, but you could adapt the same approach for gRPC.
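The shape of that workaround, roughly (a minimal sketch; translate_fn stands in for the refactored NMT inference call and is not code from the repo):

from flask import Flask, request, jsonify

app = Flask(__name__)

def translate_fn(text):
    # placeholder for the refactored NMT inference: build the infer model
    # once at startup, then feed `text` through the session instead of
    # reading from and writing to text files
    raise NotImplementedError

@app.route("/translate", methods=["POST"])
def translate():
    text = request.get_json()["text"]
    return jsonify({"translation": translate_fn(text)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)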
@luozhouyang Until the problems with TensorFlow Serving in this thread are fixed, I would suggest trying my approach, and you will get NMT working on a server.
@woodthom2 Your solution sounds good. Currently I start multiple Docker containers that provide the inference service and use haproxy for load balancing. It works fine, but it is not efficient. I am interested in your solution; can you share your code with me?
@bugra The TensorFlow version in my code is 1.4.1 with GPU. You can export the model using CPU if you set another argument to True in the builder.add_meta_graph_and_variables() method. I cannot remember the arg name exactly; you can check the docs.
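(For what it's worth, the argument is most likely clear_devices=True; in TF 1.x both tf.train.import_meta_graph and add_meta_graph_and_variables accept it, and clearing device placements is also the usual fix for the GPU colocation error quoted earlier in this thread.)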
@luozhouyang I wrote up my exact problem here: https://github.com/tensorflow/serving/issues/777 Do you mind commenting there? I still have a hard time understanding how to interpret feed_dict to tensorflow_serving.
@aforwardz @woodthom2 @samithaj @bugra @ewilderj GOOD NEWS! I exported the model and it works well with tf serving these days! I have made a pull request to tensorflow/nmt; you can have a look at pull request #344. Or you can visit my fork of tensorflow/nmt.
@luozhouyang Can you share your client?
@mdasadul Here is my client:
# imports needed by this client
import argparse

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2


class GNMTClient(Client):  # Client is an abstract base class, shown further down this thread

    def __init__(self, model_name="address", host="localhost", port=9000, timeout=10):
        self.model_name = model_name
        self.host = host
        self.port = port
        self.timeout = timeout
        channel = implementations.insecure_channel(self.host, self.port)
        self.stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

    def request(self, input_seq):
        future = self._translate(input_seq)
        result = self._parse_result(future)
        return (input_seq, result)

    def request_many(self, input_seqs):
        futures = []
        for s in input_seqs:
            future = self._translate(s)
            futures.append(future)
        pairs = []
        for seq, future in zip(input_seqs, futures):
            result = self._parse_result(future)
            pairs.append((seq, result))
        return pairs

    def _parse_result(self, future):
        result = self._parse_translation(future.result())
        words = ""
        for w in list(result):
            words += str(w, encoding="utf8") + " "
        return words

    def _translate(self, seq):
        request = predict_pb2.PredictRequest()
        # model_name should be the same as the tf serving start arg `--model_name`
        request.model_spec.name = self.model_name
        # signature_name should be the same as your `signature_def_map`'s `key` in the `exporter`
        request.model_spec.signature_name = "serving_default"
        # `seq_input` should be the same as in the `inference_signature` in the `exporter`
        request.inputs["seq_input"].CopyFrom(tf.make_tensor_proto(seq, dtype=tf.string, shape=[1, ]))
        return self.stub.Predict.future(request, self.timeout)

    @staticmethod
    def _parse_translation(result):
        # `seq_output` should be the same as in the `inference_signature` in the `exporter`
        inference_output = tf.make_ndarray(result.outputs["seq_output"])
        return inference_output


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", required=True, help="model name")
    parser.add_argument("--host", default="localhost", help="model server host")
    parser.add_argument("--port", type=int, default=9000, help="model server port")
    parser.add_argument("--timeout", type=float, default=10.0, help="request timeout")
    args = parser.parse_args()
    test_seqs = [
        "上海 浦东新区 张东路",
        "浙江 杭州 下沙区",
        "北京市 海淀区 北京西路"
    ]
    client = GNMTClient(model_name=args.model_name, host=args.host, port=args.port, timeout=args.timeout)
    input_seq, output_seq = client.request(test_seqs[0])
    print("Input : %s" % input_seq)
    print("Output: %s" % output_seq)
    results = client.request_many(test_seqs)
    for r in results:
        print("Input : %s" % r[0])
        print("Output: %s" % r[1])
To run the client, you need to install the dependencies:
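(Judging from the imports, the dependencies are presumably grpcio, tensorflow, and tensorflow-serving-api, e.g. pip install grpcio tensorflow tensorflow-serving-api.)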
@luozhouyang here is my client, and it runs, but I get the same result every time no matter what the inputs are. Can you help me? Also, I tried your code but I got an error in the first line,
class GNMTClient(Client)
because Client is not found.
# -*- coding: utf-8 -*-
from __future__ import print_function

import argparse
from nltk import word_tokenize
import time
import json
import os

import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

PYTHONIOENCODING = "UTF-8"


def parse_translation_result(args, result):
    hypotheses = tf.make_ndarray(result.outputs["seq_output"])
    str1 = ' '.join(str(e) for e in hypotheses if e != "</s>")
    return str1


def translate(stub, model_name, tokens, timeout=5.0):
    request = predict_pb2.PredictRequest()
    request.model_spec.name = model_name
    request.model_spec.signature_name = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
    request.inputs["seq_input"].CopyFrom(tf.contrib.util.make_tensor_proto(tokens))
    xy = stub.Predict.future(request, timeout)
    return xy


def main():
    json_data = {}
    start = time.time()
    json_data['start'] = str(time.ctime(int(start)))
    parser = argparse.ArgumentParser(description="Translation client")
    parser.add_argument("--model_name", required=True,
                        help="model name (name of the file?)")
    parser.add_argument("--host", default="localhost",
                        help="model server host")
    parser.add_argument("--port", type=int, default=9000,
                        help="model server port")
    parser.add_argument("--timeout", type=float, default=10.0,
                        help="request timeout")
    parser.add_argument("--text", default="",
                        help="Untokenized input text")
    args = parser.parse_args()
    channel = implementations.insecure_channel(args.host, args.port)
    stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
    tokens_list = []
    if args.text != "":
        json_data['input'] = args.text
        text = args.text
        tokens = text.split()
        tokens_list.append(tokens)
    token_count = 0
    for tokens in tokens_list:
        trans = translate(stub, args.model_name, tokens, timeout=args.timeout)
        result = trans.result()
        best_result = parse_translation_result(args, result)
        json_data['result_' + str(token_count)] = best_result
    end = time.time()
    json_data['duration'] = str(round(end - start, 3)) + " sec"
    json_data['end'] = str(time.ctime(int(end)))
    json_result = json.dumps(json_data, sort_keys=True)
    print(json_result)


if __name__ == "__main__":
    main()
@ptamas88 I think that getting the same inference output for different inputs is not related to the exporting but to your pre-trained model.
The Client is just an abstract class:
class Client:

    def request(self, input_seq):
        raise NotImplementedError()

    def request_many(self, input_seqs):
        raise NotImplementedError()
@luozhouyang I have tried the model with the nmt inference command and it works well. The strange thing with serving is that I always get the first line's translation in the output. It seems that the input is not loaded into the prediction protobuf, and this way the translation is always of the first line.
With the Client abstract class I successfully ran your client as well, but I get the same result: this is an English-Hungarian model, and the output sentence is always the first line in the Hungarian corpus. The strange appearance is because I use a pretokenized corpus, but it doesn't affect the results. (I expect not perfect but different results.)
Input : Hello world
Output: ▁ # ▁Még ▁soha ▁nem ▁álmodtam . </s> . </s> . </s> . </s> , ▁kérlek .
Input : Hello world
Output: ▁ # ▁Még ▁soha ▁nem ▁álmodtam . </s> . </s> . </s> . </s> , ▁kérlek .
Input : What's up?
Output: ▁ # ▁Még ▁soha ▁nem ▁álmodtam . </s> . </s> . </s> . </s> , ▁kérlek .
Input : My name is Bond
Output: ▁ # ▁Még ▁soha ▁nem ▁álmodtam . </s> . </s> . </s> . </s> , ▁kérlek .
When you run your client, does it give back different (and acceptable) results?
FYI: I use TensorFlow 1.6 for training and tensorflow-serving-api 1.5 for serving on a different machine.
@ptamas88 I do get the same results sometimes, but I also get different results in my tests. The model used in my tests is just a very simple model trained for only a few steps. I'll do more tests and check the code again. Thanks for pointing out the problem!
@luozhouyang I think something is missing around the input placeholder during the exporting, but I don't have deep knowledge in this area :( Please let me know if you make some progress :) Thank you very much
@ptamas88 I think I found the reason. It's all due to the --infer_file argument: Serving will take this file as the input of the inference and return results for that input. I am working on it.
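(The underlying reason: the NMT inference graph is built on top of a Dataset pipeline that reads --infer_file, so the exported graph keeps that file as its input source no matter what the request contains; the placeholder-based rewrite further down in this thread removes that dependency.)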
@luozhouyang I'm using your code (https://github.com/luozhouyang/nmt/commit/dfab5a285165e5e297b96605f04a41512e0daf3b) to export the model; the model can be served, but it always returns the same result. I guess it's related to the --infer_file argument. Any update on it? Thanks.
@msk86 I have the same problem too! Any update, @luozhouyang?
Closing as we are moving help and support to Stack Overflow:
https://stackoverflow.com/questions/tagged/tensorflow-serving
If you open a GitHub issue, it must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
Thanks!
Hope this can help you, @luozhouyang @nguyenvulebinh @msk86. First: change exporter.py:
def export(self):
    infer_model = self._create_infer_model()
    with tf.Session(graph=infer_model.graph,
                    config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        feature_config = {
            'input': tf.FixedLenSequenceFeature(dtype=tf.string,
                                                shape=[], allow_missing=True),
        }
        serialized_example = tf.placeholder(dtype=tf.string, name="serialized_example")
        tf_example = tf.parse_example(serialized_example, feature_config)
        inference_input = tf.identity(tf_example['input'], name="infer_input")
        saver = infer_model.model.saver
        saver.restore(sess, self._ckpt_path)
        sess.run(tf.tables_initializer())
        # note here: do not use the decode func of the model
        inference_outputs = infer_model.model.sample_words
        inference_signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={
                'seq_input': inference_input
            },
            outputs={
                'seq_output': tf.convert_to_tensor(inference_outputs)
            }
        )
        legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
        builder = tf.saved_model.builder.SavedModelBuilder(self._export_dir)
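The snippet stops at creating the builder; presumably the export finishes the same way as the earlier examples in this thread. A sketch of the likely remainder (the signature key is assumed to be the default one, matching the "serving_default" used by the clients above):

        builder.add_meta_graph_and_variables(
            sess=sess,
            tags=[tf.saved_model.tag_constants.SERVING],
            signature_def_map={
                tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: inference_signature
            },
            legacy_init_op=legacy_init_op
        )
        builder.save()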
Second: change model_helper.py. Do not use the Dataset API:
def pre_process(src_string, src_vocab_table, eos, src_max_len=35):
    src_eos_id = tf.cast(src_vocab_table.lookup(tf.constant(eos)), tf.int32)
    src_string = tf.string_split([src_string]).values
    if src_max_len:
        src_string = src_string[:src_max_len]  # truncate to the first src_max_len tokens
    # Convert the word strings to ids
    src = tf.cast(src_vocab_table.lookup(src_string), tf.int32)
    # Add in the word counts.
    src = tf.expand_dims(src, axis=0)
    src_len = tf.size(src)
    src_len = tf.expand_dims(src_len, axis=0)
    return BatchedInput(
        initializer=None,
        source=src,
        target_input=None,
        target_output=None,
        source_sequence_length=src_len,
        target_sequence_length=None)
def create_infer_model(model_creator, hparams, scope=None, extra_args=None):
    """Create inference model."""
    graph = tf.Graph()
    src_vocab_file = hparams.src_vocab_file
    tgt_vocab_file = hparams.tgt_vocab_file
    with graph.as_default(), tf.container(scope or "infer"):
        src_vocab_table, tgt_vocab_table = vocab_utils.create_vocab_tables(
            src_vocab_file, tgt_vocab_file, hparams.share_vocab)
        reverse_tgt_vocab_table = lookup_ops.index_to_string_table_from_file(
            tgt_vocab_file, default_value=vocab_utils.UNK)
        src_placeholder = tf.placeholder(shape=[None], dtype=tf.string)
        batch_size_placeholder = tf.constant(1, tf.int64)
        iterator = pre_process(
            src_placeholder,
            src_vocab_table,
            eos=hparams.eos,
            src_max_len=hparams.src_max_len_infer)
        model = model_creator(
            hparams,
            iterator=iterator,
            mode=tf.contrib.learn.ModeKeys.INFER,
            source_vocab_table=src_vocab_table,
            target_vocab_table=tgt_vocab_table,
            reverse_target_vocab_table=reverse_tgt_vocab_table,
            scope=scope,
            extra_args=extra_args)
    return InferModel(
        graph=graph,
        model=model,
        src_placeholder=src_placeholder,
        batch_size_placeholder=batch_size_placeholder,
        iterator=iterator)
This solution works in my case (I'm not using tensorflow-serving). The inference_signature would also need to change when using the serving module, and it's not hard.
@bidai541 How did you deploy your model? Can you please share your code?
@harshS26 The main modification is model_helper.py. I do not use tensorflow-serving; in my case, I exported the model to do inference in a Spark cluster.
@bidai541 @luozhouyang I did modify model_helper.py and exported the model using https://github.com/tensorflow/nmt/pull/344, but while making the gRPC request I am getting the following error:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="You must feed a value for placeholder tensor 'src_placeholder' with dtype string
[[{{node src_placeholder}} = Placeholder[_output_shapes=[
@harshS26 I think you should move the definition of the placeholder,
src_placeholder = tf.placeholder(shape=[None], dtype=tf.string)
out to export.py, then create the graph using this placeholder. Add it to the params of model_helper.create_infer_model().
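In practice this amounts to giving the placeholder a stable name in model_helper.py and fetching it back out of the infer graph in export.py, which is what the later comments in this thread do (a sketch):

# in model_helper.create_infer_model: give the placeholder a stable name
src_placeholder = tf.placeholder(shape=[None], dtype=tf.string, name="src_placeholder")

# in export.py: fetch it from the infer graph instead of defining a new one
inference_input = infer_model.graph.get_tensor_by_name("src_placeholder:0")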
@bidai541 I am also getting the same issue as @harshS26
@bidai541 I made changes in exporter.py and the error got resolved. But now I get only one word as output for every input sequence, which I think is because of this line:
inference_outputs = infer_model.model.sample_words
@bharat-robotics try this. exporter.py:

def export(self):
    infer_model = self._create_infer_model()
    with tf.Session(graph=infer_model.graph,
                    config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        # feature_config is unused in this variant; the placeholder is read straight from the graph
        feature_config = {
            'input': tf.FixedLenSequenceFeature(dtype=tf.string,
                                                shape=[], allow_missing=True),
        }
        inference_input = infer_model.graph.get_tensor_by_name('src_placeholder:0')
        saver = infer_model.model.saver
        saver.restore(sess, self._ckpt_path)
        sess.run(tf.tables_initializer())
        inference_outputs = infer_model.model.sample_words
        inference_output = inference_outputs[0]
        inference_signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={
                'seq_input': inference_input
            },
            outputs={
                'seq_output': tf.convert_to_tensor(inference_output)
            }
        )
model_helper.py:

src_placeholder = tf.placeholder(dtype=tf.string, name="src_placeholder")
batch_size_placeholder = tf.constant(1, tf.int64)
iterator = pre_process(
    src_placeholder,
    src_vocab_table,
    eos=hparams.eos,
    src_max_len=hparams.src_max_len_infer)
@harshS26 You are right. This is my inference function in the Spark cluster. The output is "decoder_output_bm".
def predict(iterator):
    """
    For each partition, load the pb model and feed each item of the RDD to the placeholder.
    """
    from tensorflow.contrib.seq2seq.python.ops import beam_search_ops  # note: do not remove this line
    result = []
    sess = tf.Session()
    graph = tf.get_default_graph()
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_path)
    input_feature = graph.get_tensor_by_name("src_placeholder:0")
    output = graph.get_tensor_by_name("decoder_output_bm:0")
    for item in iterator:
        one_query_result = []
        predict_result = sess.run([output], feed_dict={input_feature: [item[1]]})
        predict_result = predict_result[0]
        predict_result = np.transpose(predict_result, [2, 1, 0])
        predict_result = np.squeeze(predict_result, axis=1)
        for idx in range(args.beam_size):
            one_query_result.append(" ".join(list(predict_result[idx, :])))
        result.append((item[0] + "\t" + item[1] + "\t" + item[2], one_query_result))
    return iter(result)
I can try using your approach, but I have a few questions on this:
1. In predict_result = sess.run([output], feed_dict={input_feature: [item[1]]}), what is item[1]?
2. Which variable do I assign the name decoder_output_bm?
Sorry, my mistake. Another modification, in model.py at line 121:
elif self.mode == tf.contrib.learn.ModeKeys.INFER:
    self.infer_logits, _, self.final_context_state, self.sample_id = res
    self.sample_words = reverse_target_vocab_table.lookup(
        tf.to_int64(self.sample_id), name="decoder_output_bm")
This is the same node as "inference_outputs = infer_model.model.sample_words". Maybe something else went wrong; a suggestion: print the source string before feeding it into the graph.
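For context, the predict function above is meant to run once per RDD partition; the driver code would be roughly (a sketch, assuming an RDD of (id, source_sentence, extra) string tuples, with model_path and args captured from the surrounding job):

translated = rdd.mapPartitions(predict)

Spark's mapPartitions hands each partition's iterator to predict, so item[1] is the raw source sentence that gets fed to the placeholder.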
@bidai541 This is the error I get while exporting:
KeyError: "The name 'decoder_output_bm:0' refers to a Tensor which does not exist
@bidai541 Thanks, I get proper predictions using your code.
@harshS26 Can you share your detailed code?
Using your code (the exporter.py and model_helper.py changes above), I get this error:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "input must be a vector, got shape: [1,1]
        [[{{node StringSplit}}]]"
    debug_error_string = "{"created":"@1550828332.451979981","description":"Error received from peer","file":"src/core/lib/surface/call.cc","file_line":1095,"grpc_message":"input must be a vector, got shape: [1,1]\n\t [[{{node StringSplit}}]]","grpc_status":3}"
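(A hedged guess at that error: pre_process calls tf.string_split([src_string]), which adds a dimension, so if the request tensor already has shape [1] the op sees shape [1,1] instead of the 1-D vector it requires. Checking the shape passed to make_tensor_proto on the client, or dropping the extra list wrapping, would be the first things to try.)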
I get the same issue as @bharat-robotics above: just one word of output for every input.
What is the iterator argument of the predict function above?
@bidai541 Hi, with the modifications in model_helper.py I get the following error:
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/baishali/MLPerf/inference/cloud/translation/gnmt/tensorflow/nmt/nmt.py", line 735, in
So the exact place that throws the error is in inference.py:

with infer_model.graph.as_default():
    sess.run(
        infer_model.iterator.initializer,
        feed_dict={
            infer_model.src_placeholder: infer_data,
            infer_model.batch_size_placeholder: hparams.infer_batch_size
        })

I am guessing this is because of this line:

iterator = pre_process(src_placeholder, src_vocab_table, eos=hparams.eos, src_max_len=hparams.src_max_len_infer)

which makes infer_model.iterator.initializer None.
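One way around that (a sketch, not code from the attached zip) is to guard the initializer in inference.py, since the placeholder-based BatchedInput deliberately carries initializer=None:

# hypothetical guard in inference.py
if infer_model.iterator.initializer is not None:
    sess.run(infer_model.iterator.initializer,
             feed_dict={infer_model.src_placeholder: infer_data,
                        infer_model.batch_size_placeholder: hparams.infer_batch_size})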
@harshS26 How did you get the proper prediction? Can you please paste the entire code?
Hi @baishalichaudhury @B1gMinnow, I have attached my code here. You guys can check nmt.zip.
Hi,
Thanks for your reply. Do you have a readme to briefly explain some of the changes you had to make in the code? Also, out of curiosity, did you try restoring the saved .pb file and making a test translation or inference without a client-based approach? For example, I want to make a simple test inference script which will load the saved .pb model, restore the placeholders, read an input file with a few sentences (preprocess the strings, maybe?) and translate. My confusion in this approach is how to restore the iterator and initialize it before inference. Do you have any experience with this?
Thanks for all your help.
Regards
Hi, another thing: when I try to export the model with your code I get this error:
File "/home/baishali/MLPerf/inference/cloud/translation/gnmt/tensorflow/nmt/inference.py", line 130, in inference hparams) File "/home/baishali/MLPerf/inference/cloud/translation/gnmt/tensorflow/nmt/inference.py", line 161, in single_worker_inference infer_model.batch_size_placeholder: hparams.infer_batch_size File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 900, in run run_metadata_ptr) File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1120, in _run self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles) File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 427, in init self._fetch_mapper = _FetchMapper.for_fetch(fetches) File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 242, in for_fetch type(fetch))) TypeError: Fetch argument None has invalid type <class 'NoneType'>
This is because the iterator is None, so how did you solve this?
Thanks
@baishalichaudhury I have mailed you the steps taken to export the model.
@harshS26 I used your code for exporting the model and exported it successfully, but when using the client I am getting only one word of output. For example: input "when will you pay", output "when". Could you guide me on what changes I should make?
Hey @dharm033075, sorry for the late reply. Did you check my export.py file, is it the same? Also, did you change model_helper.py, model.py, and nmt.py?
I have trained NMT models, but I cannot understand how to export them to TensorFlow Serving. I read the documentation for MNIST and Inception, but I think those models are different from NMT models. Can you add a demo showing how to export NMT models? It would be a great help to beginners like me, thanks!