IndoNLP / nusa-crowd

A collaborative project to collect datasets in Indonesian languages.
Apache License 2.0
261 stars, 61 forks

Create dataset loader for MultiLexNorm #152

Closed by SamuelCahyawijaya 2 years ago

SamuelCahyawijaya commented 2 years ago

https://indonlp.github.io/nusa-catalogue/card.html?multilexnorm

SamuelCahyawijaya commented 2 years ago

This one can be framed as a paraphrasing task using the nusantara_t2t schema, where text_1 denotes the original sentence and text_2 denotes the normalized sentence.
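A rough sketch of that mapping, in case it helps whoever picks this up. It is not the actual loader: the function name `to_t2t_examples`, the `id` field, and the assumed input format (parallel lists of raw and normalized tokens per sentence) are illustrative guesses, and the sample sentence is a made-up toy. Only `text_1` and `text_2` come from the description above.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical input format: one (raw_tokens, normalized_tokens) pair per sentence.
TokenPair = Tuple[List[str], List[str]]


def to_t2t_examples(pairs: Iterable[TokenPair]) -> Iterator[Dict[str, str]]:
    """Turn token-level normalization pairs into text-to-text records.

    text_1 holds the original sentence and text_2 the normalized sentence.
    The "id" key is an assumption added for illustration.
    """
    for idx, (raw_tokens, norm_tokens) in enumerate(pairs):
        yield {
            "id": str(idx),
            "text_1": " ".join(raw_tokens),
            # Skip empty strings so a token normalized to nothing leaves no double space.
            "text_2": " ".join(tok for tok in norm_tokens if tok),
        }


# Toy example (invented, not taken from the dataset).
sample = [(["gw", "lg", "mkn"], ["gue", "lagi", "makan"])]
for example in to_t2t_examples(sample):
    print(example)
```

Joining tokens with single spaces is a simplification; the real loader may need to follow the dataset's own tokenization and file layout.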

yana-xuyan commented 2 years ago

self-assign

Iamfinethanksu commented 2 years ago

self-assign