CLIcK (Cultural and Linguistic Intelligence in Korean) is a benchmark dataset for evaluating the cultural and linguistic intelligence of Korean language models. As new language models continue to appear, robust evaluation datasets are needed, especially for non-English languages such as Korean. CLIcK addresses this need with a well-categorized dataset covering both cultural and linguistic aspects, enabling a fine-grained assessment of Korean language models.
The CLIcK benchmark comprises two broad categories: Culture and Language, which are further divided into 11 fine-grained subcategories.
Language
Culture
CLIcK was developed using two human-centric approaches:
The dataset is organized as follows, with each subcategory containing relevant JSON files:
CLIcK
└── Dataset
    ├── Culture
    │   └── [Each cultural subcategory with associated JSON files]
    └── Language
        └── [Each language subcategory with associated JSON files]
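The layout above can be traversed with a few lines of Python. This is a minimal sketch, not the project's own loader: the function name `load_click` is hypothetical, and it assumes that each JSON file holds a list of question records. The record fields are not described here, so the sketch passes them through unchanged.

```python
import json
from pathlib import Path


def load_click(root):
    """Collect records from every JSON file under <root>/Dataset/<category>/.

    Assumption: each JSON file contains a list of records. The fields of a
    record are not specified in this README, so they are kept as-is.
    """
    records = []
    dataset_dir = Path(root) / "Dataset"
    for category in ("Culture", "Language"):
        category_dir = dataset_dir / category
        if not category_dir.is_dir():
            continue  # skip a category folder that is not present
        # rglob finds JSON files at any depth, so it works whether files sit
        # directly in the category folder or inside subcategory folders.
        for path in sorted(category_dir.rglob("*.json")):
            with path.open(encoding="utf-8") as f:
                items = json.load(f)
            for item in items:
                records.append(
                    {"category": category, "source_file": path.name, "item": item}
                )
    return records
```

Calling `load_click("path/to/CLIcK")` on a local copy of the repository returns one flat list, with each record tagged by its top-level category and source file.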
| Models | Average Accuracy (Korean Culture) | Average Accuracy (Korean Language) |
|---|---|---|
| Polyglot-Ko 1.3B | 32.71% | 22.88% |
| Polyglot-Ko 3.8B | 32.90% | 22.38% |
| Polyglot-Ko 5.8B | 33.14% | 23.27% |
| Polyglot-Ko 12.8B | 33.40% | 22.24% |
| KULLM 5.8B | 33.79% | 23.50% |
| KULLM 12.8B | 33.51% | 23.78% |
| KoAlpaca 5.8B | 32.33% | 23.87% |
| KoAlpaca 12.8B | 33.80% | 22.42% |
| LLaMA-Ko 7B | 33.26% | 25.69% |
| LLaMA 7B | 35.44% | 27.17% |
| LLaMA 13B | 36.22% | 26.71% |
| GPT-3.5 | 49.30% | 42.32% |
| Claude2 | 51.72% | 45.39% |
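To report your own model in the same form as the table, you need one accuracy figure per top-level category. The sketch below shows one plausible way to compute it, as the percentage of correctly answered questions within each category. This README does not state how the averages were aggregated (per question or per subcategory), so treat that choice as an assumption. The function name `accuracy_by_category` and the toy data are hypothetical.

```python
from collections import defaultdict


def accuracy_by_category(results):
    """Return {category: accuracy in percent}.

    `results` is an iterable of (category, predicted, gold) tuples.
    Assumption: accuracy is the share of questions answered correctly
    within each category, not a mean over subcategories.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, predicted, gold in results:
        total[category] += 1
        correct[category] += int(predicted == gold)
    return {c: 100.0 * correct[c] / total[c] for c in total}


# Toy predictions for illustration only, not CLIcK data.
toy = [
    ("Culture", "A", "A"),
    ("Culture", "B", "C"),
    ("Language", "D", "D"),
]
print(accuracy_by_category(toy))  # {'Culture': 50.0, 'Language': 100.0}
```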
The CLIcK dataset is available on the Hugging Face Hub: CLIcK Dataset
If you use CLIcK in your research, please cite our paper:
@misc{kim2024click,
title={CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean},
author={Eunsu Kim and Juyoung Suk and Philhoon Oh and Haneul Yoo and James Thorne and Alice Oh},
year={2024},
eprint={2403.06412},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
For any questions or inquiries, please contact kes0317@kaist.ac.kr.