rladmstn1714 / CLIcK

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

CLIcK 🇰🇷🧠

Dataset Paper

Introduction 🎉

CLIcK (Cultural and Linguistic Intelligence in Korean) is a benchmark dataset for evaluating the cultural and linguistic intelligence of language models in Korean. As new language models continue to emerge, there is a pressing need for robust evaluation datasets, especially for non-English languages such as Korean. CLIcK addresses this gap with a well-categorized dataset covering both cultural and linguistic aspects, enabling a more nuanced assessment of Korean language models.

News 📰

Dataset Description 📊

The CLIcK benchmark comprises two broad categories: Culture and Language, which are further divided into 11 fine-grained subcategories.

Categories 📂

Construction 🏗️

CLIcK was developed using two human-centric approaches:

  1. Reclassification of official and well-designed exam data into our defined categories.
  2. Generation of questions using ChatGPT, based on official educational materials from the Korean Ministry of Justice, followed by our own validation process.

Structure 🏛️

The dataset is organized as follows, with each subcategory containing relevant JSON files:

📦CLIcK
 └─ Dataset
    ├─ Culture
    │  ├─ [Each cultural subcategory with associated JSON files]
    └─ Language
       ├─ [Each language subcategory with associated JSON files]
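
The layout above can be traversed with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `load_click` is hypothetical, and it assumes that each JSON file holds either an array of records or a single record, and that the directory directly under `Culture` or `Language` names the subcategory. No record fields are assumed.

```python
import json
from pathlib import Path


def load_click(root):
    """Collect records from every JSON file under <root>/Dataset.

    Returns a dict keyed by (category, subcategory); each value is the
    list of records found in that subcategory's JSON files.
    """
    records = {}
    dataset_dir = Path(root) / "Dataset"
    for path in sorted(dataset_dir.rglob("*.json")):
        parts = path.relative_to(dataset_dir).parts
        if len(parts) < 2:
            # A file directly under Dataset/ has no category; skip it.
            continue
        category = parts[0]  # "Culture" or "Language"
        # Assumption: the folder under the category is the subcategory.
        # If the file sits directly in the category folder, use its name.
        subcategory = parts[1] if len(parts) > 2 else path.stem
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: a file is a JSON array; wrap a lone object in a list.
        items = data if isinstance(data, list) else [data]
        records.setdefault((category, subcategory), []).extend(items)
    return records
```

Calling `load_click(".")` from the repository root would then give one list of records per subcategory, which is convenient for per-subcategory evaluation.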

Exam Code Descriptions 📜

Results

| Model | Average Accuracy (Korean Culture) | Average Accuracy (Korean Language) |
|---|---|---|
| Polyglot-Ko 1.3B | 32.71% | 22.88% |
| Polyglot-Ko 3.8B | 32.90% | 22.38% |
| Polyglot-Ko 5.8B | 33.14% | 23.27% |
| Polyglot-Ko 12.8B | 33.40% | 22.24% |
| KULLM 5.8B | 33.79% | 23.50% |
| KULLM 12.8B | 33.51% | 23.78% |
| KoAlpaca 5.8B | 32.33% | 23.87% |
| KoAlpaca 12.8B | 33.80% | 22.42% |
| LLaMA-Ko 7B | 33.26% | 25.69% |
| LLaMA 7B | 35.44% | 27.17% |
| LLaMA 13B | 36.22% | 26.71% |
| GPT-3.5 | 49.30% | 42.32% |
| Claude2 | 51.72% | 45.39% |
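
This README does not say how the averages in the table were computed. As a rough sketch of one plausible reading, the code below scores each answer as correct or incorrect, computes accuracy per group (for example per subcategory), and takes the unweighted mean across groups. The function names, the `(group, predicted, gold)` triple format, and the choice of an unweighted mean are all assumptions made for illustration, not the authors' evaluation code.

```python
from collections import defaultdict


def accuracy_by_group(results):
    """Compute per-group accuracy in percent.

    results: iterable of (group, predicted, gold) triples, where a
    prediction counts as correct only if it equals the gold answer.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, gold in results:
        total[group] += 1
        correct[group] += int(predicted == gold)
    return {g: 100.0 * correct[g] / total[g] for g in total}


def macro_average(per_group):
    """Unweighted mean of the per-group accuracies (assumed averaging)."""
    return sum(per_group.values()) / len(per_group)
```

With an unweighted mean, a small subcategory counts as much as a large one. Pooling all questions before dividing would weight subcategories by size instead and can give a different number.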

Dataset Link 🔗

The CLIcK dataset is available on the Hugging Face Hub: CLIcK Dataset

Citation 📝

If you use CLIcK in your research, please cite our paper:

@misc{kim2024click,
      title={CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean}, 
      author={Eunsu Kim and Juyoung Suk and Philhoon Oh and Haneul Yoo and James Thorne and Alice Oh},
      year={2024},
      eprint={2403.06412},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact 📧

For questions, please contact kes0317@kaist.ac.kr.