iKala / ievals

Official GitHub repo for TMMLU+, a large-scale Traditional Chinese massive multitask language understanding benchmark
MIT License

iEvals: iKala's Evaluator for Large Language Models

iEvals is a framework for evaluating Chinese large language models (LLMs), with a focus on performance in the Traditional Chinese domain. Our goal is to provide an easy-to-set-up, fast evaluation library that helps assess the performance and guide the use of existing Chinese LLMs.

Currently, we only support evaluation on TMMLU+. For future releases, we are exploring more domains, such as knowledge-intensive datasets (CMMLU, C-Eval) as well as context retrieval and multi-turn conversation datasets.

Installation

pip install git+https://github.com/ikala-corp/ievals.git

Usage

ieval <model name> <series: optional> --top_k <number of in-context examples>

For more details, please refer to the models section.
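To illustrate what the --top_k option controls conceptually, the sketch below builds a few-shot prompt by placing a number of solved multiple-choice examples before the question under test. This is not the ievals implementation: the function names (format_question, build_few_shot_prompt), the field names (question, A to D, answer), the prompt layout, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical answer labels for a four-option multiple-choice item.
CHOICES = ["A", "B", "C", "D"]


def format_question(item: Dict[str, str], include_answer: bool) -> str:
    """Render one multiple-choice item; show the answer only for solved examples."""
    lines = [item["question"]]
    for label in CHOICES:
        lines.append(f"{label}. {item[label]}")
    answer = f" {item['answer']}" if include_answer else ""
    lines.append(f"Answer:{answer}")
    return "\n".join(lines)


def build_few_shot_prompt(
    dev_examples: List[Dict[str, str]],
    test_item: Dict[str, str],
    top_k: int,
) -> str:
    """Prepend the first top_k solved examples to the unanswered test question."""
    blocks = [format_question(ex, include_answer=True) for ex in dev_examples[:top_k]]
    blocks.append(format_question(test_item, include_answer=False))
    return "\n\n".join(blocks)


# Made-up sample data, used only to show the shape of the prompt.
dev_examples = [
    {"question": "1 + 1 = ?", "A": "1", "B": "2", "C": "3", "D": "4", "answer": "B"},
    {"question": "2 * 3 = ?", "A": "5", "B": "6", "C": "7", "D": "8", "answer": "B"},
    {"question": "9 - 4 = ?", "A": "5", "B": "4", "C": "3", "D": "2", "answer": "A"},
]
test_item = {"question": "10 / 2 = ?", "A": "2", "B": "4", "C": "5", "D": "10", "answer": "C"}

prompt = build_few_shot_prompt(dev_examples, test_item, top_k=2)
print(prompt)
```

With top_k set to 0 the prompt contains only the test question (zero-shot). Larger values add more solved examples, which makes the prompt longer.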

Coming soon

Citation

@article{ikala2023eval,
  title={An Improved Traditional Chinese Evaluation Suite for Foundation Model},
  author={Tam, Zhi-Rui and Pai, Ya-Ting},
  journal={arXiv},
  year={2023}
}

Disclaimer

This is not an officially supported iKala product.

This research code is provided "as-is" to the broader research community. iKala does not promise to maintain or otherwise support this code in any way.