SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Closes #612 | Add Dataloader AC-IQuAD #641

Closed muhammadravi251001 closed 1 month ago

muhammadravi251001 commented 2 months ago

Title: Add Dataloader AC-IQuAD

First line PR Message: Closes https://github.com/SEACrowd/seacrowd-datahub/issues/612


sabilmakbar commented 1 month ago

2 more comments:

  1. Did you happen to skip implementing the meta field for the SEACrowd QA schema? You may want to pull the latest QA schema first to fix it.
  2. In the complex subset data, I saw a tipe key (probably the Indonesian word for "type"), which has 4 distinct values in train (image 1) and 5 in test (image 2). [image 1] [image 2]

Would you like to add it to the meta field as well for SEACrowd and introduce it as a new feature for the complex source schema?

muhammadravi251001 commented 1 month ago

A quick comment: I saw some inconsistencies when addressing single/simple subset names. Would you mind amending this first, @muhammadravi251001? Thanks!

Done by renaming the single subset to simple. In the paper and the dataset itself, the authors use single and simple interchangeably: single for the dataset name and simple for the explanation in the paper. See commit https://github.com/SEACrowd/seacrowd-datahub/pull/641/commits/ef563f67f22fd667e440cfc785b4f1cb230ef94a.
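For readers following the rename, here is a minimal sketch of how the subset names could feed into config names after the change. The dataset identifier `ac_iquad`, the `{dataset}_{subset}_{schema}` pattern, and the helper `config_names` are assumptions for illustration, not taken from the merged dataloader.

```python
# Hypothetical sketch: names and the naming pattern are assumptions,
# not the actual identifiers used in the AC-IQuAD dataloader.
DATASET_NAME = "ac_iquad"
SUBSETS = ["simple", "complex"]      # "single" renamed to "simple"
SCHEMAS = ["source", "seacrowd_qa"]  # source schema and SEACrowd QA schema


def config_names(dataset_name=DATASET_NAME, subsets=SUBSETS, schemas=SCHEMAS):
    """Build one config name per (subset, schema) pair."""
    return [
        f"{dataset_name}_{subset}_{schema}"
        for subset in subsets
        for schema in schemas
    ]


if __name__ == "__main__":
    for name in config_names():
        print(name)
```

Keeping the subset list in one place means a rename like single to simple only has to be made once.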

  1. Did you happen to skip implementing the meta field for the SEACrowd QA schema? You may want to pull the latest QA schema first to fix it.

I've already added the meta field in commit https://github.com/SEACrowd/seacrowd-datahub/pull/641/commits/735cd2e5b77a51f29fe46cc5dd2740478bc0e99f.

  2. In the complex subset data, I saw a tipe key (probably the Indonesian word for "type"), which has 4 distinct values in train (image 1) and 5 in test (image 2). [image 1] [image 2]

Would you like to add it to the meta field as well for SEACrowd and introduce it as a new feature for the complex source schema?

Alright, done in commit https://github.com/SEACrowd/seacrowd-datahub/pull/641/commits/f2aec29b6fe595cab840751c3db8ecbe5e8bfe8c.
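As an illustration of what carrying tipe into the meta field might look like, here is a minimal sketch that maps one source row to a QA-style example. The source keys (`question`, `context`, `answer`), the output field set, and the helper `to_seacrowd_qa` are assumptions for illustration; the linked commit is the authoritative version.

```python
# Hypothetical sketch: the source keys and the output field set are
# assumptions, not copied from the actual dataloader or QA schema file.
def to_seacrowd_qa(idx, row):
    """Map one source row (a dict) to a QA-style example with a meta field."""
    return {
        "id": str(idx),
        "question_id": str(idx),
        "document_id": str(idx),
        "question": row["question"],
        "type": "extractive",  # assumed QA type, for illustration only
        "choices": [],
        "context": row["context"],
        "answer": [row["answer"]],
        # Rows without a tipe key (e.g. the simple subset) get None here.
        "meta": {"tipe": row.get("tipe")},
    }


if __name__ == "__main__":
    sample = {"question": "q", "context": "c", "answer": "a", "tipe": "example"}
    print(to_seacrowd_qa(0, sample)["meta"])
```

Using `row.get("tipe")` lets the same mapping serve subsets that lack the key, at the cost of a `None` value in meta for those rows.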

Thanks for the careful review, Sir!

muhammadravi251001 commented 1 month ago

lgtm

Thanks for the approval, Sir!

muhammadravi251001 commented 1 month ago

Hi! No comment from me👍 lgtm

Thanks for the approval, Sir!

muhammadravi251001 commented 1 month ago

Since both reviewers have approved, I will proceed to squash and merge this PR. Thanks for the review!