eg-nlp-community / nlp-reading-group


[22/03/2020] 4pm GMT+2 - Zero-Shot Cross-Lingual Transfer with Meta Learning #4

Closed Omarito2412 closed 4 years ago

Omarito2412 commented 4 years ago

Join us for our discussion. The paper's abstract:

In this paper, we consider the setting of training models on multiple different languages at the same time, when little or no data is available for languages other than English. We show that this challenging setup can be approached using meta-learning, where, in addition to training a source language model, another model learns to select which training instances are the most beneficial. We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks (natural language inference, question answering).
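To make the instance-selection idea concrete, here is a minimal toy sketch. This is not the paper's implementation: the data is synthetic, and the scorer is a fixed margin heuristic standing in for the learned selection model that the paper trains.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "source language" data: 2-D features, binary labels.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def instance_weights(X, w):
    # Hypothetical scorer: weight examples by confidence margin.
    # (The paper *learns* a selection model; this heuristic is a stand-in.)
    margin = np.abs(X @ w)
    return margin / margin.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weighted logistic-regression training loop (plain gradient descent):
# each instance's contribution to the loss is scaled by its weight.
w = np.zeros(2)
for _ in range(200):
    p = sigmoid(X @ w)
    weights = instance_weights(X, w + 1e-3)  # offset avoids all-zero margins at init
    grad = X.T @ (weights * (p - y))
    w -= 1.0 * grad

acc = ((sigmoid(X @ w) > 0.5) == y).mean()
print(f"weighted-training accuracy: {acc:.2f}")
```

The point is only the structure: a scoring function decides how much each training instance contributes, and the task model is trained under those weights.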

We'll be meeting on Google hangouts: https://hangouts.google.com/group/kUxBAunjGittAkBUA

URL: https://arxiv.org/abs/2003.02739

hadyelsahar commented 4 years ago

Interesting paper showing a simple technique for zero-/few-shot cross-lingual transfer. The proposed technique relies on fine-tuning on small amounts of data from other languages before evaluating on the target language.

Pros: interesting literature review and an experimental setup that is easy to follow.

Cons: the method they follow is quite straightforward; IMO its connections with meta-learning are not very well established, and gains over the multilingual BERT baseline are quite minor.

Some refs we mentioned today:

- Cross-lingual word embeddings for zero-shot transfer and unsupervised MT: https://www.samtalksml.net/aligning-vector-representations/ and https://github.com/facebookresearch/UnsupervisedMT
- ACL 2019 tutorial on the same topic: https://ruder.io/unsupervised-cross-lingual-learning/
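The alignment approach in the first link builds on a closed-form step commonly used for cross-lingual embedding mapping: orthogonal Procrustes. A minimal sketch on synthetic data (an illustration of the general technique, not the exact pipeline in the linked post):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embedding spaces: the "target language" space is the "source"
# space rotated by a hidden orthogonal matrix.
d, n = 4, 50
X = rng.normal(size=(n, d))                  # source embeddings (seed dictionary)
Q, _ = np.linalg.qr(rng.normal(size=(d, d))) # hidden rotation
Y = X @ Q                                    # target embeddings

# Orthogonal Procrustes: W* = argmin over orthogonal W of ||XW - Y||_F.
# Closed form: take the SVD of X^T Y = U S V^T, then W = U V^T.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

err = np.linalg.norm(X @ W - Y)
print(f"alignment residual: {err:.2e}")
```

With a shared seed dictionary of word pairs playing the role of the rows of X and Y, the same SVD step maps one language's embedding space onto another's.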

Changing the seed affects results quite a lot (in the BERT fine-tuning context): https://arxiv.org/abs/2002.06305
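Given that seed sensitivity, a sketch of the obvious mitigation: run the fine-tuning over several seeds and report mean ± std rather than a single number. The `finetune_score` function below is a hypothetical stand-in for an actual (expensive) fine-tuning run.

```python
import numpy as np

def finetune_score(seed):
    # Hypothetical stand-in for a real fine-tuning run whose dev score
    # varies with the random seed (toy: mean 0.84, sd 0.02).
    rng = np.random.default_rng(seed)
    return 0.84 + rng.normal(scale=0.02)

seeds = range(5)
scores = np.array([finetune_score(s) for s in seeds])
print(f"dev accuracy: {scores.mean():.3f} +/- {scores.std(ddof=1):.3f}")
```

Reporting the spread makes it clear whether a claimed gain over a baseline is larger than the seed-to-seed noise.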

Omarito2412 commented 4 years ago

The technique mentioned is shallow, but it looks nice and is effective in some cases. That said, there are a lot of open questions about whether this really counts as meta-learning.