SEACrowd / seacrowd-datahub

A collaborative project to collect datasets in SEA languages, SEA regions, or SEA cultures.
Apache License 2.0

Create dataset loader for ThaiSpoof #573

Open SamuelCahyawijaya opened 3 months ago

SamuelCahyawijaya commented 3 months ago

Dataloader name: thai_spoof/thai_spoof.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?thai_spoof

Dataset thai_spoof
Description Thai-language dataset for spoof detection. The dataset consists of genuine speech signals and various types of spoofed speech signals. The spoofed speech is generated using Thai text-to-speech tools, synthesis tools, and speech-modification tools. Accessing the dataset requires creating a (free) account on the AI for Thai portal.
Subsets -
Languages tha
Tasks Hoax Detection, Spoken Language Understanding
License Creative Commons Attribution Non Commercial Share Alike 3.0 (cc-by-nc-sa-3.0)
Homepage https://gofile-3732576a73.sg3.quickconnect.to/sharing/qsx4L5HJW
HF URL -
Paper URL https://ieeexplore.ieee.org/document/10354956
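
Because the data sits behind a free account, a loader would probably have to read from a directory the user has already downloaded. The following is only a rough, standard-library sketch of that idea, not the seacrowd-datahub implementation: it does not use the SEACrowd schemas or the Hugging Face `datasets` builder classes that a real loader would need. The function name `generate_examples`, the `genuine/` and `spoof/` folder names, and the `.wav` extension are all assumptions for illustration. The actual archive layout is not described in this issue.

```python
from pathlib import Path
from typing import Dict, Iterator, Tuple

# Metadata copied from the dataset card in this issue.
_DATASETNAME = "thai_spoof"
_LANGUAGES = ["tha"]
_LICENSE = "cc-by-nc-sa-3.0"
_HOMEPAGE = "https://gofile-3732576a73.sg3.quickconnect.to/sharing/qsx4L5HJW"

# ASSUMPTION: hypothetical folder names; the real archive layout is unknown.
_LABEL_DIRS = {"genuine": "genuine", "spoof": "spoof"}


def generate_examples(data_dir: str) -> Iterator[Tuple[int, Dict[str, str]]]:
    """Yield (index, example) pairs from a manually downloaded copy of the data."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Download the dataset first; not a directory: {root}")

    idx = 0
    # Sort labels and files so example ids are stable across runs.
    for label, sub_dir_name in sorted(_LABEL_DIRS.items()):
        sub_dir = root / sub_dir_name
        if not sub_dir.is_dir():
            continue  # Skip labels that are missing from the local copy.
        # ASSUMPTION: audio is stored as .wav files.
        for wav_path in sorted(sub_dir.rglob("*.wav")):
            yield idx, {"id": str(idx), "path": str(wav_path), "label": label}
            idx += 1
```

Calling `list(generate_examples("/path/to/thai_spoof"))` on a local copy would return one record per audio file, each with a file path and a `genuine` or `spoof` label. The folder-to-label mapping would need to be adjusted once the real directory structure is known.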