tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Obtain context-aware word embeddings from an encoder sub-graph of a pre-trained machine translation model #1087

Open kgramm9026 opened 5 years ago

kgramm9026 commented 5 years ago

Description

I have been following this tutorial, the "Tensor-2-Tensor translation walkthrough", to train a French-to-English translation system using the Transformer network with the `transformer_base_single_gpu` hyperparameter set. Once I have a trained model, is there an easy way to obtain the context-aware word embeddings from the encoder before translation?

And if so, can I generate a sentence representation by doing something similar to what the "Universal Sentence Encoder" paper describes?

> The context-aware word representations (obtained from the encoding sub-graph) are converted to a fixed-length sentence encoding vector by computing the element-wise sum of the representations at each word position and dividing by the square root of the sentence length. The encoder takes as input a lowercased PTB-tokenized string and outputs a 512-dimensional vector as the sentence embedding.
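Concretely, if the encoder outputs can be fetched as a `[batch, length, 512]` array, I imagine the pooling step itself would look roughly like the sketch below. This is a minimal NumPy sketch of the sum-and-scale pooling only; the function name, the padding mask, and the toy data are my own assumptions, not anything from T2T:

```python
import numpy as np

def sentence_embedding(word_vectors, lengths):
    """USE-style pooling: element-wise sum over positions, scaled by sqrt(len).

    word_vectors: float array [batch, max_len, 512] of context-aware word
        representations from the encoder (padded past each true length).
    lengths: int array [batch] with the true token count of each sentence.
    Returns: float array [batch, 512] of sentence embeddings.
    """
    batch, max_len, _ = word_vectors.shape
    # Mask out padding positions so they do not contribute to the sum.
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(
        word_vectors.dtype)                                  # [batch, max_len]
    summed = (word_vectors * mask[:, :, None]).sum(axis=1)   # [batch, 512]
    # Divide by sqrt(sentence length), as in the Universal Sentence Encoder.
    return summed / np.sqrt(lengths)[:, None].astype(word_vectors.dtype)

# Toy usage with random stand-ins for encoder outputs.
vecs = np.random.randn(2, 7, 512).astype(np.float32)
lens = np.array([7, 4])
emb = sentence_embedding(vecs, lens)
print(emb.shape)  # (2, 512)
```

As I understand the paper, the `sqrt(length)` scaling is there so that differences between short sentences are not dominated by sentence-length effects.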

guotong1988 commented 5 years ago

Nice!