SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Closes #425 | Add Dataloader MaXM #553

Closed akhdanfadh closed 3 months ago

akhdanfadh commented 3 months ago

Closes #425

The homepage does not specify any subsets, but there are two files for one language: (1) regular QA and (2) yes-no QA. I assumed each file should be its own subset (open to discussion). The configs therefore look like this: maxm_regular_source, maxm_yesno_seacrowd_imqa, etc. When testing, pass maxm_<subset> to the --subset_id parameter.
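For reviewers, here is a rough sketch of how the config names expand. It assumes the two subsets are called `regular` and `yesno` and the two schemas are `source` and `seacrowd_imqa`, as the examples above suggest. The helper names (`config_names`, `subset_id`) are hypothetical and are not taken from the dataloader itself.

```python
from itertools import product

# Assumed from the PR description: one dataset, two subsets, two schemas.
DATASET_NAME = "maxm"
SUBSETS = ["regular", "yesno"]
SCHEMAS = ["source", "seacrowd_imqa"]


def config_names():
    """Hypothetical helper: build every <dataset>_<subset>_<schema> config name."""
    return [
        f"{DATASET_NAME}_{subset}_{schema}"
        for subset, schema in product(SUBSETS, SCHEMAS)
    ]


def subset_id(subset):
    """Hypothetical helper: build the maxm_<subset> value passed to --subset_id."""
    if subset not in SUBSETS:
        raise ValueError(f"unknown subset: {subset!r}")
    return f"{DATASET_NAME}_{subset}"


if __name__ == "__main__":
    for name in config_names():
        print(name)
    print(subset_id("yesno"))
```

Under these assumptions, each subset yields one source config and one seacrowd_imqa config, so there are four configs in total and two possible --subset_id values.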
