SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for Thai-joke-corpus #525

Open SamuelCahyawijaya opened 5 months ago

SamuelCahyawijaya commented 5 months ago

Dataloader name: thai_joke_corpus/thai_joke_corpus.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?thai_joke_corpus

| Field | Value |
| --- | --- |
| Dataset | thai_joke_corpus |
| Description | Thai jokes scraped from four Thai joke Facebook pages, collected by iApp Technology Co., Ltd. |
| Subsets | - |
| Languages | tha |
| Tasks | Language Modeling |
| License | GNU General Public License v3.0 (gpl-3.0) |
| Homepage | https://github.com/iapp-technology/thai-joke-corpus |
| HF URL | - |
| Paper URL | https://github.com/iapp-technology/thai-joke-corpus |
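
As a starting point for whoever picks this up, here is a rough sketch of how the metadata above could map onto loader code. It uses only the standard library and is not the SEACrowd API: `DatasetCard` and `generate_examples` are hypothetical names, and the one-joke-per-line layout and the `id`/`text` record shape are assumptions that should be checked against the upstream repository and the datahub templates.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class DatasetCard:
    """Hypothetical container for the fields listed in this issue."""
    name: str
    languages: Tuple[str, ...]
    tasks: Tuple[str, ...]
    license: str
    homepage: str


# Values copied from the issue description.
THAI_JOKE_CORPUS = DatasetCard(
    name="thai_joke_corpus",
    languages=("tha",),
    tasks=("Language Modeling",),
    license="gpl-3.0",
    homepage="https://github.com/iapp-technology/thai-joke-corpus",
)


def generate_examples(lines: Iterable[str]) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (key, example) pairs from raw text lines, skipping blank lines.

    Assumption: the corpus stores one joke per line. The issue does not
    confirm this, so adjust the parsing once the real files are inspected.
    """
    key = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore empty lines between jokes
        yield key, {"id": str(key), "text": text}
        key += 1
```

In an actual loader, the same generator logic would presumably live in the builder's example-generation method, reading from the downloaded corpus file instead of an in-memory list.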

joanitolopo commented 5 months ago

self-assign