tensorflow / serving

A flexible, high-performance serving system for machine learning models
https://www.tensorflow.org/serving
Apache License 2.0

'error': 'transpose expects a vector of size 6. But input(1) is a vector of size 3'. Issue with bidirectional compatibility with tensorflow serving with batch/sample number dimension. #1269

Closed bigbizze closed 5 years ago

bigbizze commented 5 years ago

Issue with batch dimension when using tensorflow serving...

metadata: {'model_spec': {'name': 'prod_mod', 'signature_name': '', 'version': '1'}, 'metadata': {'signature_def': {'signature_def': {'serving_default': {'inputs': {'input': {'dtype': 'DT_FLOAT', 'tensor_shape': {'dim': [{'size': '-1', 'name': ''}, {'size': '50', 'name': ''}], 'unknown_rank': False}, 'name': 'embedding_1_input:0'}}, 'outputs': {'output': {'dtype': 'DT_FLOAT', 'tensor_shape': {'dim': [{'size': '-1', 'name': ''}, {'size': '1', 'name': ''}], 'unknown_rank': False}, 'name': 'output/Sigmoid:0'}}, 'method_name': 'tensorflow/serving/predict'}}}}}

error response:

<Response [400]>
{'error': 'transpose expects a vector of size 6. But input(1) is a vector of size 3\n\t [[{{node bidirectional_1/transpose}} = Transpose[T=DT_FLOAT, Tperm=DT_INT32, _class=["loc:@bidirectional_1/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3"], _output_shapes=[[50,?,300]], _device="/job:localhost/replica:0/task:0/device:CPU:0"](embedding_1/embedding_lookup, Attention/transpose/perm)]]'}

I am using Keras to build a model and am exporting it to a .pb file using "tf.saved_model.simple_save".

The input to my model is (num_samples, 50), which becomes (None, 50). When making a POST request to the TensorFlow Serving instance in Docker, if I reshape my input so it is only one sample with shape (50,), I get a response and prediction as expected, but if I attempt to batch more than one sample together, it gives me the above error.

I am using a custom layer in this model, which is referenced at the end of the error ("Attention/transpose/perm"), but what is really confusing me is the earlier "_output_shapes=[[50,?,300]]", which makes me think it is moving num_samples from dimension 0 to dimension 1 for some reason.
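For reference, a minimal sketch of the two requests being compared, against the REST predict endpoint with dummy zero-valued data (the port, model name, and "input" key are taken from the metadata above and the docker command below; this is an illustration, not the exact client code):

import requests

URL = "http://localhost:9000/v1/models/prod_mod:predict"

# One sample of shape (50,): this returns a prediction as expected.
single = {"instances": [{"input": [0.0] * 50}]}
print(requests.post(URL, json=single).json())

# Two samples sent as one (2, 50) array inside a single instance: this triggers the transpose error.
batched = {"instances": [{"input": [[0.0] * 50, [0.0] * 50]}]}
print(requests.post(URL, json=batched).json())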

This is the command I am using to create the Docker tensorflow serving instance:

docker run -p 9000:9000 --name tfserve --mount type=bind,source=//c/Users/Charles/PycharmProjects/spamreader/serving/prod_mod,target=/models/prod_mod -t --entrypoint=tensorflow_model_server tensorflow/serving --enable_batching --rest_api_port=9000 --model_name=prod_mod --model_base_path=/models/prod_mod

Is this a compatibility issue between Bidirectional layers and TensorFlow Serving?

I had opened a similar issue thread a few weeks ago but closed it after I thought I had solved the problem; in reality, I had just discovered that the prediction works as intended only when there is no sample dimension.

It looks like a similar issue to this one in terms of how it manifests, even if the surrounding context is quite different: https://github.com/migueldeicaza/TensorFlowSharp/issues/288

Harshini-Gadige commented 5 years ago

Identify the size of the input tensor and change it accordingly, as the error suggests.

Please refer to this for more information.

bigbizze commented 5 years ago

Thanks for getting back to me.

As I said in my post, when the input is altered to remove the batch dimension entirely, the model serves as intended. When I try to batch the input, it throws this error.

The input data's shape works exactly as intended with batching when I use the same data and test with Keras' model.predict() function.

I have tested further, trying the model without the bidirectional layers and without the attention layers, and I get a related but slightly different issue.

With a simple Conv1D model, when I try to keep the batch dimension in the data structure, it gives me this error:

{'error': 'input must be 4-dimensional[1,1,1,50,300]\n\t [[{{node conv1d_1/convolution/Conv2D}} = Conv2D[T=DT_FLOAT, _output_shapes=[[?,1,50,50]], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](conv1d_1/convolution/ExpandDims, conv1d_1/convolution/ExpandDims_1)]]'}

Is this potentially an error, then, with using a Keras Embedding layer together with TensorFlow Serving?

Edit: This is getting bizarre. When I reformat the data so that it is one sample (shape (50,)), the model accepts it correctly, but it returns a prediction for every position along the sample instead of a single prediction for it:

{'predictions': [[[0.997504], [0.997504], [0.997091], [0.996899], [0.997333], [0.997135], [0.997504], [0.997504], [0.997504], [0.997504], [0.997035], [0.996876], [0.997295], [0.997137], [0.997504], [0.997192], [0.997164], [0.997504], [0.997504], [0.997292], [0.997504], [0.997504], [0.997504], [0.997504], [0.997504], [0.997106], [0.997495], [0.997504], [0.997023], [0.997422], [0.997504], [0.997504], [0.997504], [0.997267], [0.997504], [0.997203], [0.997067], [0.997142], [0.996888], [0.997058], [0.997504], [0.997504], [0.997138], [0.99681], [0.996906], [0.997504], [0.996903], [0.996978], [0.997504], [0.997794]]]}

I am now extremely confused; it seems to be using the length of the sample itself as the batch size. How was it able to even take in this data if that's the case?

bigbizze commented 5 years ago

Okay, I think I found an important clue to the problem. My model, when compiled, has an output tensor shape of (None, 1), but when I use Keras' model.save() to actually save the model and then load it with load_model(), the output shape changes for some reason to (None, 50, 1). So when the model is served with TensorFlow Serving, it treats the batch dimension as if it were the input-length dimension.
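A hedged sketch of how to check this, using a small stand-in model (hypothetical; the real architecture is in the model.summary() linked in a later comment):

from tensorflow import keras

# Stand-in model with input shape (None, 50) and output shape (None, 1).
model = keras.Sequential([
    keras.layers.Embedding(1000, 300, input_length=50),
    keras.layers.Flatten(),
    keras.layers.Dense(1, activation="sigmoid"),
])
print(model.output_shape)     # (None, 1) on the freshly compiled model

model.save("prod_mod.h5")
reloaded = keras.models.load_model("prod_mod.h5")
print(reloaded.output_shape)  # reportedly (None, 50, 1) for the affected model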

Edit: I'm not sure how this relates, actually. When I skip the model.save()/load_model() step, compile the model directly, and just use load_weights(), retaining the output shape of (None, 1), and then attempt to serve it, I get a similar but different error to the one before:

{'error': 'input must be 4-dimensional[1,1,10,50,300]\n\t [[{{node conv1d_1/convolution/Conv2D}} = Conv2D[T=DT_FLOAT, _output_shapes=[[?,1,50,50]], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](conv1d_1/convolution/ExpandDims, conv1d_1/convolution/ExpandDims_1)]]'}

But at the very least, when I remove the batch dimension the prediction actually works for one sample, so maybe this was just orthogonal to the first issue:

{'predictions': [[0.427316]]}

It's almost like it is just assuming the batch size is one, so when it isn't one it throws an error. Does each request need to be a single sample, and does --enable_batching then batch the requests together, or something?

bigbizze commented 5 years ago

Here is the model.summary() for reference:

https://ybin.me/p/f7d76069dd53dca7#l1MrPeSg6HEENkopJRCnP2twIGeSYSl3jJXcm44bK+k=

bigbizze commented 5 years ago

Okay, I'm fairly confident the issue is that I assumed you could send multiple samples as one array and the model would run predictions on them as a batch, when in reality I needed to include each sample of the data separately in the request payload; the model then infers on them as a batch.

I.e. my original attempt was:

payload = {
    "instances": [{"input": out}]
}

where "out" was an array with the shape (number of samples, input length)

When I actually needed to use:

payload = {
    "instances": [{"input": out},
                  {"input": out},
                  {"input": out}]
}

where each "out" is a single sample, and the request as a whole forms the batch.
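Putting that together, a minimal sketch of the corrected request with dummy data (endpoint and input name as in the metadata earlier in this thread; the random array is purely illustrative):

import numpy as np
import requests

URL = "http://localhost:9000/v1/models/prod_mod:predict"

out = np.random.rand(3, 50)  # dummy batch: 3 samples, 50 values each

# One object per sample under "instances"; TF Serving reassembles the batch dimension.
payload = {"instances": [{"input": sample.tolist()} for sample in out]}

response = requests.post(URL, json=payload)
print(response.json())  # expected form: {'predictions': [[...], [...], [...]]}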

sathyarr commented 5 years ago

When I try to run batch mode with this structure,

{ "inputs":
       [
         {"source_ids": [ _some value_ ], "source_len": [ _some value_ ]},
         {"source_ids": [ _some value_ ], "source_len": [ _some value_ ]}
       ]
}

I get,

{ "error": "inputs is a plain value/list, but expecting an object as multiple input tensors required as per tensorinfo_map" }

lilao commented 5 years ago

@bigbizze: thanks for following up with your findings!

@sathyamoorthyrr: your question seems different from the original issue. Can you open a new issue?

DL-POWER commented 4 years ago
  • Windows 10 - Version 1809 OS Build 17763.292
  • Docker - Version 18.03.0-ce (Client) - Version 18.09.1 (Engine)
  • Tensorflow Serving was installed using the command "docker pull tensorflow/serving"
  • TensorFlow Serving - Version 1.12.0

Hey, I was reading this page and saw that you wrote about custom layers. I also want to try some custom layers, but after a lot of research I haven't found anything relevant. Can you help me with it?

anyongjin commented 4 years ago

Hi, I'm facing the same problem as you. I am using Keras to build a model which has an output tensor shape of (None, 1), and I exported it to a .pb file using tf.saved_model.simple_save. I got the error "transpose expects a vector of size 4. But input(1) is a vector of size 3" when calling the model served on TF Serving via gRPC, and it worked as intended only when there was no sample dimension. Finally, I found the problem:

from tensorflow.core.framework import tensor_pb2, tensor_shape_pb2, types_pb2

# Note the prepended 1: it forces an extra leading dimension in front of the data's own shape.
tensor_shape = [1] + list(data.shape)
dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=dim) for dim in tensor_shape]
tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=dims)
tensor = tensor_pb2.TensorProto(
    dtype=types_pb2.DT_FLOAT,
    tensor_shape=tensor_shape,
    float_val=list(data.reshape(-1)))

I had replaced tf.make_tensor_proto with the code above, but there the tensor_shape is limited to one sample. Everything is OK once the first line is replaced with "tensor_shape = list(data.shape)".
Hope this is helpful for someone...
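A minimal sketch of the fix in context, assuming a gRPC PredictionService on localhost:8500 and an input tensor named "input" (both hypothetical here, as is the random dummy batch):

import grpc
import numpy as np
from tensorflow.core.framework import tensor_pb2, tensor_shape_pb2, types_pb2
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

data = np.random.rand(3, 50).astype(np.float32)  # dummy batch of 3 samples

# Use the data's own shape, batch dimension included; do not prepend an extra 1.
dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=d) for d in data.shape]
tensor = tensor_pb2.TensorProto(
    dtype=types_pb2.DT_FLOAT,
    tensor_shape=tensor_shape_pb2.TensorShapeProto(dim=dims),
    float_val=list(data.reshape(-1)),
)

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()
request.model_spec.name = "prod_mod"
request.model_spec.signature_name = "serving_default"
request.inputs["input"].CopyFrom(tensor)
print(stub.Predict(request, 10.0))  # 10-second timeout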