microsoft / GLUECoS

A benchmark for code-switched NLP, ACL 2020
https://microsoft.github.io/GLUECoS
MIT License

Questions on Machine Translation Task #82

Closed goru001 closed 2 years ago

goru001 commented 2 years ago

Hi,

Thanks for setting up this repo and leaderboard. I have the following questions regarding the machine translation task:

  1. Is there already a leaderboard for the MT task as well? If yes, can you please share the link? I could not find it in the README.
  2. From what I understand, the original dataset from Prof. Alan Black's group at CMU was in English, for document-grounded conversations. Can you share the paper that has baseline results for the Hinglish machine translation dataset being used here?
  3. Is the Hinglish MT dataset used here the same as the one on the LINCE leaderboard, with a different split? (The LINCE leaderboard doesn't clearly specify where the data comes from, hence asking here in case you've already checked, since both leaderboards cater to code-switched languages and tasks.)

Genius1237 commented 2 years ago

  1. Submissions haven't been added to the leaderboard. The leaderboard should be visible at https://microsoft.github.io/GLUECoS
  2. https://arxiv.org/abs/2202.09625 Section 4.1 describes how the data was collected, with relevant citations
  3. The same dataset is used in LINCE.