I have been following this tutorial and the Tensor2Tensor translation walkthrough to train a French-to-English translation system using the transformer network with the `transformer_base_single_gpu` hyperparameter set. Once I have a trained model, is there an easy way to obtain the word embeddings from the encoder before translation?
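For the raw (non-contextual) embedding table, I was thinking of reading it straight out of the checkpoint, something like the sketch below. The checkpoint path and the variable-name filters are guesses on my part, since I don't know T2T's exact naming scheme:

```python
import tensorflow as tf  # TF 1.x, as used by Tensor2Tensor

# Hypothetical checkpoint path -- substitute your own train_dir.
CKPT = "/path/to/train_dir/model.ckpt-250000"

reader = tf.train.NewCheckpointReader(CKPT)

# List the checkpoint variables so the embedding table can be spotted;
# in my run the likely candidates contain "symbol_modality" or
# "shared", but I'm not sure that naming is stable across T2T versions.
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    if "symbol_modality" in name or "embedding" in name:
        print(name, shape)

# Once the right name is known, the [vocab_size, 512] matrix is just:
#   emb_table = reader.get_tensor("<embedding-variable-name>")
```

That would only give me the static lookup table, though; for the context-aware representations I assume I'd have to actually run the encoder on a sentence and fetch its output tensor.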
And if so, to generate a sentence representation, can I do something similar to this:
> The context-aware word representations (obtained from the encoding sub-graph) are converted to a fixed-length sentence encoding vector by computing the element-wise sum of the representations at each word position and dividing by the square root of the length of the sentence. The encoder takes as input a lowercased PTB tokenized string and outputs a 512-dimensional vector as the sentence embedding.
as described in the "Universal Sentence Encoder" paper, to obtain an embedding?
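Concretely, my understanding of that pooling step would be the following (a minimal NumPy sketch; the `encoder_outputs` array stands in for whatever per-position representations the T2T encoder produces):

```python
import numpy as np

def sentence_embedding(encoder_outputs):
    """Pool per-position encoder vectors into one sentence vector.

    encoder_outputs: array of shape [seq_len, 512] holding the
    context-aware representation at each word position.
    """
    seq_len = encoder_outputs.shape[0]
    # Element-wise sum over positions, scaled by sqrt(sentence length),
    # following the Universal Sentence Encoder description above.
    return encoder_outputs.sum(axis=0) / np.sqrt(seq_len)

# Example: a 7-token sentence pools down to a single 512-dim vector.
emb = sentence_embedding(np.random.randn(7, 512).astype(np.float32))
print(emb.shape)  # (512,)
```

Is this pooling a reasonable thing to apply on top of the transformer encoder's outputs?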