SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0
64 stars, 57 forks

Closes #12 | Add/Update Dataloader BalitaNLP #550

Closed: raileymontalan closed this 6 months ago

raileymontalan commented 6 months ago

Closes #12

Notes

Many articles in the dataset have no corresponding image file in the repository. Reporting the statistics here:
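As a rough illustration of how such counts could be gathered, here is a minimal sketch. It assumes each article is a dict and that a hypothetical `img_path` key holds the image path relative to a local image directory; neither the key name nor the helper `count_missing_images` comes from the dataloader itself.

```python
from pathlib import Path


def count_missing_images(articles, image_root):
    """Return (total, missing): how many articles there are, and how many
    have no image file on disk under image_root."""
    root = Path(image_root)
    total = 0
    missing = 0
    for article in articles:
        total += 1
        # "img_path" is a hypothetical key (assumption), relative to image_root.
        img_path = article.get("img_path")
        if not img_path or not (root / img_path).is_file():
            missing += 1
    return total, missing
```

Calling `count_missing_images(articles, "images/")` on the loaded articles would give the two numbers needed for a per-split summary.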


raileymontalan commented 6 months ago

> Hi @raileymontalan, looks good to me! I have one suggestion though.
>
> Could you please add the author, category, date, img_url, url, and website under the metadata in the seacrowd_imtext schema?

Details added to metadata. Ready for review, thanks!
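
For readers following along, the requested change can be pictured with the minimal sketch below. The six metadata keys are the ones named in the suggestion. Everything else is an assumption made for illustration and is not taken from the actual dataloader: the helper `to_imtext_example`, the `body` key on the raw article, and the top-level fields `id`, `image_paths`, and `texts`.

```python
# Metadata fields requested in the review comment.
METADATA_KEYS = ("author", "category", "date", "img_url", "url", "website")


def to_imtext_example(idx, article, image_path=None):
    """Build one example dict with the requested fields nested under "metadata".

    The top-level layout (id / image_paths / texts) is an assumed shape,
    not a confirmed description of the seacrowd_imtext schema.
    """
    return {
        "id": str(idx),
        # Articles whose image file is missing get an empty list.
        "image_paths": [image_path] if image_path else [],
        # "body" is a hypothetical key for the article text.
        "texts": article.get("body", ""),
        # Missing fields are kept as None rather than dropped.
        "metadata": {key: article.get(key) for key in METADATA_KEYS},
    }
```

Keeping absent fields as `None` means every example exposes the same metadata keys, which is convenient when the features are declared up front.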