Reading: Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model

a1da4 commented 5 years ago

0. Paper

@inproceedings{chidambaram-etal-2019-learning,
    title = "Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model",
    author = "Chidambaram, Muthu and Yang, Yinfei and Cer, Daniel and Yuan, Steve and Sung, Yunhsuan and Strope, Brian and Kurzweil, Ray",
    booktitle = "Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W19-4330",
    doi = "10.18653/v1/W19-4330",
    pages = "250--259",
}

Article is here

1. What is it?

In this paper, the authors propose a novel approach to cross-lingual sentence representation learning based on the Universal Sentence Encoder (USE).

2. What is amazing compared to previous studies?

They construct a multi-task training scheme that combines:

native source-language and target-language tasks
a bridging translation task

3. Where is the key to technologies and techniques?

The key is the multi-task dual-encoder model.

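Given an input sentence s_i^I and a response sentence s_i^R, the model seeks to rank s_i^R over all other possible response sentences.
It is trained to maximize the log-likelihood of P(s_i^R | s_i^I) for each task.

However, P(s_i^R | s_i^I) is hard to calculate, because it normalizes over every possible response sentence, so they used an approximation P'(s_i^R | s_i^I) instead.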
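The approximation itself is not reproduced in this note. As a sketch, assuming the usual in-batch sampled softmax used for dual encoders, the two quantities would look like the following. The scoring function φ (assumed here to be a dot product of the two sentence embeddings), the response set S^R, and the batch size K are my notation, not taken from the text above.

```latex
% Exact likelihood: normalizes over every possible response in S^R (intractable)
P(s_i^R \mid s_i^I) =
  \frac{e^{\phi(s_i^I, s_i^R)}}{\sum_{s^R \in S^R} e^{\phi(s_i^I, s^R)}}

% Assumed approximation: normalize only over the K responses in the current batch
P'(s_i^R \mid s_i^I) =
  \frac{e^{\phi(s_i^I, s_i^R)}}{\sum_{j=1}^{K} e^{\phi(s_i^I, s_j^R)}}
```

Under this reading, the other K - 1 responses in the batch act as negatives for input i.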

They used two USE encoders (Transformer-based) to embed each sentence. The sentence representation is the average of the encoder's output vectors over all word positions in the sentence.
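A minimal NumPy sketch of these two steps, not the authors' code: average the per-position vectors into one sentence embedding, then score each input against every response in the batch. The function names, the padding mask, and the use of a dot product as the score are assumptions made for illustration.

```python
import numpy as np


def mean_pool(token_vectors, mask):
    """Average per-position vectors over the real (unpadded) positions.

    token_vectors: (batch, seq_len, dim) encoder outputs (hypothetical input).
    mask:          (batch, seq_len), 1 for real tokens and 0 for padding.
    """
    mask = mask[..., None].astype(token_vectors.dtype)  # (batch, seq_len, 1)
    summed = (token_vectors * mask).sum(axis=1)          # (batch, dim)
    counts = np.maximum(mask.sum(axis=1), 1.0)           # avoid division by zero
    return summed / counts


def in_batch_log_likelihood(input_emb, response_emb):
    """Mean log P'(s_i^R | s_i^I), with other batch responses as negatives.

    Assumes row i of response_emb is the true response for row i of input_emb,
    and that the score is a plain dot product.
    """
    scores = input_emb @ response_emb.T                  # (batch, batch)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return np.diag(log_probs).mean()


# Usage with random stand-ins for the encoder outputs.
rng = np.random.default_rng(0)
tokens_in = rng.normal(size=(4, 6, 8))
tokens_resp = rng.normal(size=(4, 5, 8))
emb_in = mean_pool(tokens_in, np.ones((4, 6)))
emb_resp = mean_pool(tokens_resp, np.ones((4, 5)))
objective = in_batch_log_likelihood(emb_in, emb_resp)   # quantity to maximize
```

Training would maximize this value (or minimize its negative) for each task; how the tasks are mixed is not covered by this sketch.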

4. How did they validate it?

They evaluated their learned representation using monolingual and cross-lingual tasks. Their model achieved near-state-of-the-art or state-of-the-art performance on a variety of English tasks.

5. Is there a discussion?

6. Which paper should I read next?

thak123 commented 4 years ago

Did you get any reference code for this paper?

a1da4 commented 4 years ago

> Did you get any reference code for this paper?

Sorry, I didn't. But you can use the pre-trained model on TensorFlow Hub.

thak123 commented 4 years ago

No, I wanted to train the model from scratch to reproduce the results.
