I'm curious how these assets are loaded when serving. Do additional ops need to be defined to load them?
assets/
    int2word
    word2int
Hi there,
TF-Serving supports loading assets out of the box, without needing to drop down to writing a custom C++ servable. In particular, the SavedModel loader finds and loads the assets. See https://www.tensorflow.org/programmers_guide/saved_model; and if needed you can look at the implementation of asset loading here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/loader.cc.
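For reference, a minimal Python sketch of loading such a SavedModel outside of serving ("export/1" is a placeholder export directory):

import tensorflow as tf

# Loading a SavedModel also restores lookup tables backed by assets: the
# loader rewrites the asset paths and runs the graph's main_op after
# restoring variables.
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], "export/1")
    # Any asset-backed tables are now initialized and ready to use.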
Regarding your transformation question, you might want to post to a tf.transform forum.
Chris
Hi @chrisolston, I have the same problem as @iyukuni. As you mentioned above, TF-Serving will load the assets for us, but my question is: how and when can I build the word2int and int2word tables from the loaded files, so that I can do some pre-processing on the input strings and post-processing on the output ids?
@xiaoxiaoyuwen did you fix this? I'm facing the same issue.
I also have the same doubt. How does TensorFlow Serving access the assets? Is there any working example? The doc (https://www.tensorflow.org/programmers_guide/saved_model) mentions assets_collection, but how it is used in the graph is not clear.
@jacks808, @xiaoxiaoyuwen, were you able to find a solution?
I have the same question! There is plenty of documentation on how to include assets in the SavedModel, but how do I access and use these assets in the graph? If my asset file is a vocabulary (a list of words in order of frequency), how do I get a tensor that I can then use? There do not seem to be any examples of this online, or at least none that I can understand.
I'm wondering if there is a working example of how to use the files in the assets folder.
Regarding using the assets, I'm interested in seeing additional examples as well. @a-a-e I've been looking to do this too; to my understanding, you would use the main_op to initialize the tables and then look up the values in your model. After that, it becomes a balance between doing the processing in the clients versus in the TensorFlow model.
The main_op is used here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/saved_model/saved_model_half_plus_two.py
Mentions of assets are sprinkled around, like https://github.com/tensorflow/tensorflow/tree/r1.11/tensorflow/python/saved_model, which says:
Support for Assets. For cases where ops depend on external files for initialization, such as vocabularies, SavedModel supports this via assets. Assets are copied to the SavedModel location and can be read when loading a specific meta graph def.
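Putting those pieces together, my understanding of the export pattern is roughly this (untested sketch; the directory name is a placeholder):

import tensorflow as tf

# Sketch: export a SavedModel whose main_op initializes lookup tables at
# load time, so asset-backed tables work when TF-Serving loads the model.
builder = tf.saved_model.builder.SavedModelBuilder("export/1")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
        main_op=tf.tables_initializer())
builder.save()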
More examples using assets would be great to show good use cases
@gautamvasudevan
For vocabulary assets, I recommend using the tf.contrib.lookup module in your code. It manages assets for you, from registering them to loading them at serving time. See in particular tf.contrib.lookup.index_table_from_file and tf.contrib.lookup.index_to_string_table_from_file.
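For example, a minimal sketch, assuming a vocabulary file vocab.txt with one token per line:

import tensorflow as tf

# string tokens -> int64 ids. The function adds the vocabulary file to the
# ASSET_FILEPATHS collection itself, so the file travels with the
# SavedModel and the table is rebuilt at serving time.
vocab_table = tf.contrib.lookup.index_table_from_file(
    vocabulary_file="vocab.txt", num_oov_buckets=1)

tokens = tf.placeholder(tf.string, shape=[None])
ids = vocab_table.lookup(tokens)

with tf.Session() as sess:
    sess.run(tf.tables_initializer())
    print(sess.run(ids, feed_dict={tokens: ["hello", "world"]}))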
Speaking of real examples, we use this in addition to Estimators in OpenNMT-tf to easily support model export and serving.
Preprocessing is still an open question, though there is work to make a SentencePiece TensorFlow op, which would solve it for simple use cases in machine translation.
@a-a-e Did you solve the issue?
I think @guillaumekln has the right idea here. Closing due to lack of activity.
Should we manually copy the required files into the assets folder for the export to succeed? Ref: export_half_plus_two.py
I had been thinking that the required files are automatically copied to the assets folder based on what the tensors require during training. Is that a wrong assumption?
Please have a look here
Also, should we manually specify the required initializations through init_op?
If you are referring to the functions tf.contrib.lookup.index_table_from_file and tf.contrib.lookup.index_to_string_table_from_file, they should do that for you, from saving the vocabulary in the assets folder to loading it automatically at serving time.
It gives TypeError: index_table_from_file() got an unexpected keyword argument 'key_column_index', as referred here, even though key_column_index is acceptable as per the official doc. I could not figure that out.
However, I managed to do it with tf.contrib.lookup.TextFileStringTableInitializer and tf.contrib.lookup.HashTable. The files are copied automatically when everything is referenced as a Tensor and assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS) is passed as a parameter to the Builder. Thank you for your help. Also, this was helpful too.
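For anyone hitting the same thing, a rough sketch of what worked (file and directory names are placeholders, untested as written):

import tensorflow as tf

# Wrapping the path in a constant Tensor and adding it to ASSET_FILEPATHS
# makes the builder copy the file into the export's assets/ folder.
vocab_file = tf.constant("vocab.txt", name="vocab_file")
tf.add_to_collection(tf.GraphKeys.ASSET_FILEPATHS, vocab_file)

# int64 line number -> token string (the int2word direction).
int2word = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.TextFileStringTableInitializer(vocab_file),
    default_value="UNK")

ids = tf.placeholder(tf.int64, shape=[None])
words = int2word.lookup(ids)

builder = tf.saved_model.builder.SavedModelBuilder("export/1")
with tf.Session() as sess:
    sess.run(tf.tables_initializer())
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS),
        main_op=tf.tables_initializer())
builder.save()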
I'm now a bit stuck on how to handle the I/O and pre-processing for seq2seq. The default SignatureDef shows the inputs are:
inputs['source_ids'] tensor_info:
    dtype: DT_INT64
    shape: (-1, -1)
    name: model/att_seq2seq/hash_table_1_Lookup:0
inputs['source_len'] tensor_info:
    dtype: DT_INT32
    shape: (-1)
    name: model/att_seq2seq/Minimum:0
inputs['source_tokens'] tensor_info:
    dtype: DT_STRING
    shape: (-1, -1)
    name: model/att_seq2seq/strided_slice:0
But as of now I'm giving only inputs['source_tokens']; the inputs['source_len'] and inputs['source_ids'] are generated by preprocessing in Python.
How do I go about this? Should the request send only inputs['source_tokens'] as input, or inputs['source_tokens'], inputs['source_len'], and inputs['source_ids'] together?
The solution should work for both training and inference.
Also, how did the default SignatureDef come out like that (as shown above)? Is it based on the inputs to the encoder?
Thanks in advance.
Using the existing model in TF Serving without any lookup table and with the default SignatureDef, I sent a POST request with the following body (assuming no lookup tables are needed if the process starts directly at encode(), and hence the respective SignatureDef):
"inputs": {
"source_tokens": "['AND','COME','SO']",
"source_ids": [3, 7, 5],
"source_len": [3]
}
source_ids are manually found and added. source_len is calculated.
However, it results in,
{
"error": "len(seq_lens) != input.dims(0), (1 vs. 3)\n\t [[{{node model/att_seq2seq/encode/bidi_rnn_encoder/bidirectional_rnn/bw/ReverseSequence}}]]"
}
In (1 vs. 3) above, the term in place of 3 changes according to the length of source_ids.
Corresponding TF Serving console,
2019-03-05 13:27:41.699119: W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at transpose_op.cc:157 : Invalid argument: transpose expects a vector of size 2. But input(1) is a vector of size 3
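Could it be that the request needs an explicit batch dimension? Given the (-1, -1) shapes in the SignatureDef, maybe something like the following is what the signature expects (just a guess):

{
  "inputs": {
    "source_tokens": [["AND", "COME", "SO"]],
    "source_ids": [[3, 7, 5]],
    "source_len": [3]
  }
}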
Any help is appreciated.
Good morning everyone,
I created a seq2seq model with attention for a chatbot and need to deploy it to a production server using tensorflow serving. I followed the basic tutorials available and read the documentation / open issues here, but it seems to me that there is no straightforward way to deploy the model.
At the moment I can make predictions in the following way. First, I define the graph:
Then I load the model checkpoints and pass the test data through a preprocessing pipeline:
I think the main problems in making this model work with TensorFlow Serving can be summarised as follows:
Based on my current research I came to the following conclusions:
1. Data Processing
The "ideal" way to handle this should be by using tf.Transform (https://github.com/tensorflow/serving/issues/663), but as indicated in the post it doesn't support many operations such as lowercase or regex and is therefore not suitable for such text based models.
As tf.Transform doesn't work, I can think of two alternatives to solve the issue:
Does this approach make sense or am I completely missing something?
2. Assets
The second issue that needs to be solved is how to use assets, in this case the two dictionaries int2word and word2int, which are required by the model. I managed to store them by modifying the following code, which was only working for strings (see the sketch below): https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/saved_model/saved_model_half_plus_two.py
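Roughly, the asset-registration part of my modification looks like this (file names are placeholders):

import tensorflow as tf

# Register both dictionary files so the SavedModelBuilder copies them into
# the assets/ folder of the export.
word2int_file = tf.constant("word2int.txt", name="word2int_file")
int2word_file = tf.constant("int2word.txt", name="int2word_file")
for f in (word2int_file, int2word_file):
    tf.add_to_collection(tf.GraphKeys.ASSET_FILEPATHS, f)

# Then pass the collection to the builder when exporting:
# builder.add_meta_graph_and_variables(
#     sess, [tf.saved_model.tag_constants.SERVING],
#     assets_collection=tf.get_collection(tf.GraphKeys.ASSET_FILEPATHS))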
Then I transformed the model into protobuf format to make it ready for serving.
This leads to the following folder structure:
What I don't understand is how these assets can or should be used by the TensorFlow Serving model. They are needed by several functions within the graph but are not fed to it; they are basically like global variables. Do I just need to load them into the client and magic will happen, or can I somehow load them into the graph, avoiding exporting them as assets? This leads to problem nr. 3.
3. C++ custom servable
It seems to me that there is no way around creating a C++ custom servable if someone wants to use assets (https://www.tensorflow.org/serving/custom_servable). If I understand it correctly, I can still use my Python client and just need to change the serving part to use C++.
This is the part that I don't understand at all, and having no C++ experience doesn't make it better. I don't really know where to start and if I don't get it working I'm in big trouble.
In order to use TensorFlow Serving, I created a Docker image using the Dockerfile.devel in the repository, then downloaded the TensorFlow Serving repo and ran:
bazel build -c opt tensorflow_serving/...
I can then run the following command to serve a model:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=rnn --model_base_path=rnn-export &> rnn_log &
According to the documentation it looks like a new Loader and Source Adapter should be created. However, I couldn't find any clear examples about how to do this. It is also not clear to me which files I need to change and how to run the code.
Do I have to change the core/loader.h, core/simple_loader.h and main.cc files? Are there any examples? I have no idea where to start.
Are there any other ways to deploy the model to production, or how else can I solve this? I find the whole process very complex for such a simple model architecture. Maybe my approach is wrong and I'm missing something, as I couldn't find much information online. This should be a common problem that people face on a daily basis.
Would be great if you have some ideas how to solve this.