Closed mohammedayub44 closed 3 years ago
Issue #193 contains an explanation of the outputs you can get from this implementation. Unlike the TF Hub implementation, the bilm-tf implementation can't directly give you a weighted sum of the three output layers. You can, however, weight the three output layers yourself, for example by including a Keras WeightedAverage layer in a model that's consuming the ELMo embeddings. Note that the first of the three output layers from lm_embeddings contains the character-based representations you wanted.
Here's a code snippet for getting the three output layers. Also note that per my comment on issue #107, this code requires the model saved in Step 1, not the final model with TF serving tags. When you're ready to deploy the model in TF serving, use the model saved in Step 2.
import tensorflow as tf
from bilm import Batcher, BidirectionalLanguageModel
# load the saved model
frozen_graph = '/path/to/my_saved_model.pb'
with tf.gfile.GFile(frozen_graph, "rb") as f:
    restored_graph_def = tf.GraphDef()
    restored_graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
    tf.import_graph_def(
        restored_graph_def,
        input_map=None,
        return_elements=None,
        name="")
output_node = graph.get_tensor_by_name("concat_3:0")
input_node = graph.get_tensor_by_name("Placeholder:0")
# generate character ids for your input documents
vocab_file = '/path/to/my_vocab.txt'
batcher = Batcher(vocab_file, 50)
char_ids = batcher.batch_sentences([["Hello", "world"]])
# get embeddings
sess = tf.Session(graph=graph)
my_feed_dict = {input_node: char_ids}
embs = sess.run(output_node, feed_dict=my_feed_dict)
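Once you have embs, the do-it-yourself weighting mentioned above could look roughly like the sketch below. It is only an illustration, not part of bilm-tf: the function name weighted_layer_average is made up, the random array is a stand-in for the real output, and the (batch, 3, tokens, 1024) layout is an assumption you should check against embs.shape. In a real model the three layer weights would normally be learned by the downstream network rather than fixed by hand.

```python
import numpy as np

def weighted_layer_average(embs, layer_logits, gamma=1.0):
    """Collapse the layer axis into one vector per token (hypothetical helper).

    embs:         array with assumed shape (batch, n_layers, tokens, dim)
    layer_logits: one unnormalized weight per layer; softmax-normalized here
    gamma:        overall scale applied to the mixed vectors
    """
    embs = np.asarray(embs, dtype=np.float32)
    logits = np.asarray(layer_logits, dtype=np.float32)
    if embs.ndim != 4 or logits.ndim != 1 or embs.shape[1] != logits.shape[0]:
        raise ValueError("expected embs of shape (batch, n_layers, tokens, dim)")
    # numerically stable softmax over the layer weights
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # contract the layer axis: (batch, layers, tokens, dim) -> (batch, tokens, dim)
    return gamma * np.einsum("bltd,l->btd", embs, weights)

# stand-in for the array returned by sess.run(output_node, ...)
fake_embs = np.random.rand(1, 3, 2, 1024).astype(np.float32)
mixed = weighted_layer_average(fake_embs, [0.0, 0.0, 0.0])
print(mixed.shape)
```

With equal logits this reduces to a plain mean over the three layers; pushing one logit much higher than the others effectively selects that single layer.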
Also, one more note: this model produces very large outputs. When you deploy the model in TF serving, the embeddings have to be serialized to be returned to you. If you're then feeding them to another model, they will have to be de-serialized. The serialization/deserialization steps are time-consuming, and it would be faster to deploy the models in native tensorflow, rather than via TF serving, so that the embeddings can be passed directly to the downstream model as numpy arrays, skipping the serialization/deserialization steps.
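To get a feel for the serialization overhead, here is a small standard-library sketch comparing the raw float32 size of the embeddings for one two-token sentence with the size of the same values written out as JSON text. It assumes a JSON-style REST response and uses random stand-in values; the exact ratio depends on how the server formats its floats, so treat the numbers as indicative only.

```python
import json
import random
from array import array

TOKENS, LAYERS, DIM = 2, 3, 1024  # e.g. the sentence ["Hello", "world"]

# Stand-in values; real embeddings would come from the model.
random.seed(0)
values = [random.uniform(-1.0, 1.0) for _ in range(TOKENS * LAYERS * DIM)]

raw_bytes = array("f", values).tobytes()         # packed float32, 4 bytes per value
json_bytes = json.dumps(values).encode("utf-8")  # text form, as in a JSON reply

print(len(raw_bytes), len(json_bytes))
```

The text form is several times larger than the packed array, and it still has to be parsed back into numbers on the receiving side, which is the cost that loading the model natively avoids.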
@carolmanderson Great. Thanks for the detailed code snippet.
I was using Streamlit to build my prototype app. All my other word embedding models use Tensorflow 2 and are loaded natively from checkpoints. Since this repo doesn't support TF 2.0, I had to go down this route of including them as REST endpoints.
Good point about output size. I'm passing independent sentences. Does the output size depend on the batch size or the number of sentences? My guess is that simple Python pickling should work?
@mohammedayub44 ah, ok. In that case, you can export the model as described in #107 and reload it in Tensorflow 2 within your Streamlit app. Here's sample code (caveat: I haven't run this in a Streamlit app. But I have confirmed it works in Tensorflow 2):
import tensorflow as tf
from bilm import Batcher
# reload the model
loaded = tf.saved_model.load("/path/to/saved/model") # this is a directory. Don't include the file itself in the path.
infer = loaded.signatures["serving_default"]
# get the char ids for your documents
vocab_file = '/path/to/my_vocab.txt'
batcher = Batcher(vocab_file, 50)
char_ids = batcher.batch_sentences([["Hello", "world"]])
char_ids = char_ids.astype('int32') # must be cast to int32 before feeding to model
# get embeddings
embs = infer(tf.constant(char_ids))['import/concat_3:0']
Don't be alarmed if you see this message: INFO:tensorflow:Saver not created because there are no variables in the graph to restore.
This is expected.
Regarding the output size, you'll get a 3 x 1024 tensor for every token in your input. So long documents or large batches can both cause large outputs.
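As a back-of-the-envelope check, the size of one batch of outputs can be estimated as below. This is a rough sketch: elmo_output_bytes is a made-up helper, and it assumes float32 values and that every sentence in a batch is padded to the length of the longest one.

```python
def elmo_output_bytes(n_sentences, max_tokens, n_layers=3, dim=1024, bytes_per_value=4):
    """Rough size in bytes of one batch of embeddings (assumes float32 and padding)."""
    return n_sentences * max_tokens * n_layers * dim * bytes_per_value

print(elmo_output_bytes(1, 2))    # one two-token sentence -> 24576
print(elmo_output_bytes(32, 50))  # 32 sentences padded to 50 tokens -> 19660800
```

So a single short sentence is about 24 KB, while a batch of 32 sentences padded to 50 tokens is already close to 20 MB.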
@carolmanderson Thanks. Could you verify which lines need to be commented out in #107? Unfortunately, the link did not work.
Sorry about that. The lines are:
No problem. It works smoothly in Tensorflow 2. Guess I will skip the serving part for now as loading natively works better for me using Streamlit.
Hi,
Thanks for the excellent work on the repo. I was able to train and fine-tune a custom model using it, and I was also able to test the model from checkpoints successfully. However, I need to use the model in Tensorflow Serving, hence the requirement of a SavedModel to avoid any issues. So far, I have managed to test using a checkpoint with the code below:
Basic outline of how to load and save a builder object, using the code below:
I'm slightly confused about what my SignatureDef should look like and how to account for any other pre-processing operations, layers, etc.
I want it to be similar to what TF Hub ELMo 3 has, or at least support the following:
Any help appreciated. Thanks in advance!
@matt-peters