CoderBak closed this 7 months ago
python inference.py -m qwen-turbo -d mt_bench --evaluation_set train\[:3\] --model_type base
Result: 8.67
Result: 8.67