bigscience-workshop / data_tooling

Tools for managing datasets for governance and training.
Apache License 2.0
77 stars 48 forks

Create dataset uit_vsfc_vietnamese_students_feedback_corpus #109

Closed: albertvillanova closed this issue 2 years ago

albertvillanova commented 2 years ago

Temporarily: https://huggingface.co/datasets/albertvillanova/vietnamese_students_feedback

I've contacted data custodians.

Reply from data custodians: it is OK to transfer the dataset to their organization namespace.

Repo transferred from:

albertvillanova commented 2 years ago

DONE: https://huggingface.co/datasets/bigscience-catalogue-lm-data/lm_vi_vietnamese_students_feedback

Sample:


{'text': 'slide giáo trình đầy đủ .', 'meta': "{'split': 'train'}"}
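Note that in the sample above, `meta` is a string holding a Python-style dict literal, not a nested dict. A minimal sketch of how one might decode it, assuming every record follows the same shape as this one sample (the helper name `parse_meta` is hypothetical and not part of the dataset or any library):

```python
import ast

# Sample record copied from the issue above.
record = {'text': 'slide giáo trình đầy đủ .', 'meta': "{'split': 'train'}"}


def parse_meta(example):
    """Return a copy of the record with `meta` decoded from a string into a dict.

    Assumption: `meta` is a Python literal (single-quoted dict), so
    ast.literal_eval is used rather than json.loads, which would reject
    the single quotes.
    """
    decoded = dict(example)  # shallow copy, leaves the original record untouched
    decoded['meta'] = ast.literal_eval(example['meta'])
    return decoded


parsed = parse_meta(record)
print(parsed['meta']['split'])  # train
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for metadata strings of unknown origin.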