SamuelCahyawijaya commented 12 months ago

Dataset	mtop
Description	An almost-parallel multilingual task-oriented semantic parsing dataset covering 6 languages and 11 domains. This is the first multilingual dataset which contains compositional representations that allow complex nested queries.
Subsets	Domain, Intent
Languages	tha
Tasks	Constituency Parsing, Semantic Role Labeling, Intent Classification
License	Unknown (unknown)
Homepage	https://huggingface.co/datasets/mteb/mtop_domain, https://huggingface.co/datasets/mteb/mtop_intent
HF URL	https://huggingface.co/datasets/mteb/mtop_domain, https://huggingface.co/datasets/mteb/mtop_intent
Paper URL	https://arxiv.org/abs/2008.09335

elyanah-aco commented 11 months ago

self-assign

sabilmakbar commented 11 months ago

Hi @elyanah-aco, may I know the current status of this dataloader creation? Feel free to discuss here if you have any difficulties. Thanks!

elyanah-aco commented 11 months ago

@sabilmakbar Still working on my other 2 dataloaders; I'll get to this once those are done.

sabilmakbar commented 11 months ago

Cool, thanks for letting us know! Please take your time working w/ other dataloaders first and let us know if you find any roadblocks!

elyanah-aco commented 11 months ago

Hi @sabilmakbar, what should I do if there are two subsets with different labels for the same schema (domain and intent under text schema)?

sabilmakbar commented 11 months ago

You can separate them by implementing one config per subset; as a consequence, you'll have twice the usual number of configs (one set for domain and one set for intent).
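
To illustrate, here is a minimal sketch of what "one config per subset" could look like. It is not the actual SEACrowd config API: `LoaderConfig`, `build_configs`, the schema names `source` and `seacrowd_text`, and the `{dataset}_{subset}_{schema}` naming pattern are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import product


# Hypothetical stand-in for a dataloader config class (not the real SEACrowd class).
@dataclass(frozen=True)
class LoaderConfig:
    name: str
    subset: str
    schema: str


DATASET_NAME = "mtop"
SUBSETS = ["domain", "intent"]          # the two subsets listed in this issue
SCHEMAS = ["source", "seacrowd_text"]   # assumed schema names, for illustration only


def build_configs(dataset_name=DATASET_NAME, subsets=SUBSETS, schemas=SCHEMAS):
    """Create one config per (subset, schema) pair.

    With two subsets this yields twice as many configs as a
    single-subset dataloader using the same schemas.
    """
    return [
        LoaderConfig(
            name=f"{dataset_name}_{subset}_{schema}",
            subset=subset,
            schema=schema,
        )
        for subset, schema in product(subsets, schemas)
    ]


CONFIGS = build_configs()

if __name__ == "__main__":
    for config in CONFIGS:
        print(config.name)
```

Each config carries its own `subset` field, so the loading code can pick the matching label set (domain labels or intent labels) based on which config was requested.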

SEACrowd / seacrowd-datahub