SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Closes #275 | Create dataset loader for UIT-ViCoV19QA #275 #463

Closed Gyyz closed 6 months ago

Gyyz commented 7 months ago

Closes #275

Gyyz commented 7 months ago

Scripts Passed.

  1. subset = uit_vicov19qa_1/2/3/4_ans: python -m tests.test_seacrowd seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py --subset uit_vicov19qa_4_ans

  2. make check_file=seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py
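To run the first check once per subset, the command lines could be generated as in the rough sketch below. The loader path and the `--subset` flag are copied from the command above. The subset names are an assumption: only `uit_vicov19qa_4_ans` is spelled out in the thread, and the other three are inferred from the same pattern. `build_test_command` is a hypothetical helper, not part of the repository.

```python
import shlex

# Path and flag are copied from the command quoted in the thread.
LOADER = "seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py"

# Assumption: the four subsets are named uit_vicov19qa_<n>_ans for n = 1..4;
# only uit_vicov19qa_4_ans appears verbatim in the thread.
SUBSETS = [f"uit_vicov19qa_{n}_ans" for n in range(1, 5)]


def build_test_command(subset: str) -> list[str]:
    """Return the argv for one test run of the loader on a single subset."""
    return ["python", "-m", "tests.test_seacrowd", LOADER, "--subset", subset]


commands = [build_test_command(subset) for subset in SUBSETS]

for argv in commands:
    # Print only. To execute, pass argv to subprocess.run(argv, check=True)
    # from the repository root.
    print(shlex.join(argv))
```

Printing rather than executing keeps the sketch free of side effects; the commands only make sense inside a checkout of the repository.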

Gyyz commented 7 months ago

Checked, LGTM.

One question: Is there any reason why we have 4 subsets? Maybe we can just merge them into one; they look similar to me. What do you think @raileymontalan @Gyyz ?

I am OK with merging them into just one.

raileymontalan commented 7 months ago

Checked, LGTM.

One question: Is there any reason why we have 4 subsets? Maybe we can just merge them into one; they look similar to me. What do you think @raileymontalan @Gyyz ?

Agree. @Gyyz please implement only one subset here.
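The thread does not say how the four subsets were merged. One possible approach is sketched below, under the assumption that the subsets share the same questions and differ only in how many answers accompany each one. The field names (`id`, `question`, `answers`), the sample rows, and `merge_subsets` are all hypothetical and not taken from the loader.

```python
from typing import Dict, Iterable, List

# Hypothetical rows: each subset is assumed to hold the same questions and to
# differ only in the number of answers per question.
subset_1_ans = [{"id": "q1", "question": "placeholder question", "answers": ["a1"]}]
subset_4_ans = [{"id": "q1", "question": "placeholder question", "answers": ["a1", "a2", "a3", "a4"]}]


def merge_subsets(subsets: Iterable[List[dict]]) -> List[dict]:
    """Merge rows by question id, keeping each distinct answer once in first-seen order."""
    merged: Dict[str, dict] = {}
    for rows in subsets:
        for row in rows:
            entry = merged.setdefault(
                row["id"],
                {"id": row["id"], "question": row["question"], "answers": []},
            )
            for answer in row["answers"]:
                if answer not in entry["answers"]:
                    entry["answers"].append(answer)
    return list(merged.values())


merged_rows = merge_subsets([subset_1_ans, subset_4_ans])
print(merged_rows)
```

If one subset already contains every answer found in the others, loading only that subset would give the same result with less code.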

Gyyz commented 7 months ago

Checked, LGTM. One question: Is there any reason why we have 4 subsets? Maybe we can just merge them into one; they look similar to me. What do you think @raileymontalan @Gyyz ?

Agree. @Gyyz please implement only one subset here.

@MJonibek @raileymontalan, hello, updated, please check. Scripts passed:

  1. python -m tests.test_seacrowd seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py

  2. make check_file=seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py

Gyyz commented 7 months ago

Checked, LGTM! One small nit: can you rerun the make check_file=seacrowd/sea_datasets/uit_vicov19qa/uit_vicov19qa.py command? I see some changes after running it (image omitted).

@MJonibek Yes, after running the make script, the layout of the sample dictionary changes: all keys and values are placed on the same line, which hurts the code's readability a bit. So I will adjust the example dict again after running the make script.
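One way to keep a hand-formatted example dict stable across formatter runs is sketched below. It assumes the formatter behind `make check_file` is Black, which the thread does not state. Black leaves code between `# fmt: off` and `# fmt: on` untouched. The dictionary contents are placeholders, not the loader's actual sample.

```python
# Assumption: the formatter run by `make check_file` is Black; the thread does
# not name it. Black skips everything between `# fmt: off` and `# fmt: on`.

# fmt: off
example = {
    "id": "0",
    "question": "placeholder question",
    "answers": ["placeholder answer"],
}
# fmt: on

# The single-line layout a collapsing formatter would produce. Only the layout
# differs; the value is the same.
collapsed = {"id": "0", "question": "placeholder question", "answers": ["placeholder answer"]}

print(example == collapsed)
```

With this in place, rerunning the formatter would leave the example as written, so no manual adjustment is needed afterwards.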