mtkresearch / TCEval

TCEval v2

Install

cd lm-evaluation-harness_mr-revised
pip3 install -e ".[vllm]"
pip3 install -U vllm
cd ..
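
After the install commands above, a quick import check can confirm that both packages landed in the active environment. The following is a minimal sketch, not part of TCEval itself: the helper name check_installed is hypothetical, and it assumes the revised harness keeps the upstream import name lm_eval, which the README does not state.

```python
from importlib.util import find_spec


def check_installed(module_names):
    """Return {name: bool} saying whether each top-level module can be located."""
    return {name: find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    # "lm_eval" is the import name of upstream lm-evaluation-harness; it is
    # assumed (not confirmed) to be unchanged in the mr-revised copy.
    for name, found in check_installed(["lm_eval", "vllm"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either module is reported as missing, re-running the pip3 commands inside the same Python environment is the first thing to try.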

Evaluate Local Models (MMLU, TMMLU+, and Penguin_Table)

Please refer to the examples.

Evaluate API Models (MMLU, TMMLU+, and Penguin_Table)

Please see scripts/cal_likelihood_by_api.py.

Evaluate MTBench-tw

Please refer to the linked MTBench-tw instructions.