Open a1da4 opened 7 months ago
0. Paper
1. What is it?
They address two kinds of misalignment in zero-shot cross-lingual transfer tasks.
2. What is amazing compared to previous work?
Their method improves performance on zero-shot cross-lingual transfer.
3. What are the key technologies and techniques?
isotropy enhancement: they use the Wasserstein distance as a loss function (the closer the representations are to a Gaussian distribution, the more isotropic they are)
constrained code-switching: they randomly select a word in a sentence and replace it with its translation from a bilingual dictionary.
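
The two techniques above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `code_switch` and `gaussian_w2_squared`, the `max_swaps` cap, and the toy dictionary are all hypothetical. This note does not say which constraint the code-switching uses, so the sketch only shows random dictionary substitution. It also does not give the exact form of the Wasserstein loss, so the sketch assumes a one-dimensional, quantile-matching approximation against a standard normal distribution.

```python
import random
from statistics import NormalDist


def code_switch(tokens, dictionary, rng, max_swaps=1):
    """Replace up to `max_swaps` randomly chosen tokens with a dictionary translation."""
    # Only tokens that have an entry in the bilingual dictionary can be switched.
    candidates = [i for i, tok in enumerate(tokens) if tok in dictionary]
    chosen = rng.sample(candidates, min(max_swaps, len(candidates)))
    switched = list(tokens)
    for i in chosen:
        # A source word may have several translations; pick one at random.
        switched[i] = rng.choice(dictionary[tokens[i]])
    return switched


def gaussian_w2_squared(values):
    """Squared 2-Wasserstein distance between a 1-D sample and N(0, 1),
    approximated by matching sorted values to standard normal quantiles."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    targets = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return sum((v - t) ** 2 for v, t in zip(sorted(values), targets)) / n


rng = random.Random(0)
bilingual = {"cats": ["Katzen"], "like": ["mag"]}  # toy English-German dictionary
print(code_switch(["I", "like", "cats"], bilingual, rng))
print(gaussian_w2_squared([rng.gauss(0.0, 1.0) for _ in range(1000)]))
```

In a training setup, a quantity like `gaussian_w2_squared` would presumably be computed on model representations with a differentiable framework and added to the task loss. The pure-Python version here is only meant to make the idea concrete: a sample that already looks Gaussian gets a distance near zero, while a shifted or skewed one gets a larger value.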
4. How did they evaluate it?
5. Is there a discussion?
6. Which paper should I read next?