thu-coai / AutoDetect

Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.
MIT License
37 stars, 1 fork

get_gpt4_score function in math domain #1

Closed by 18907305772 3 months ago

18907305772 commented 3 months ago

Hello, I noticed that in the instruction-following and coding domains, the get_gpt4_score function uses gpt4_turbo_generate for scoring, but in the math domain it uses gpt4_generate. What is the reason for this setup?

chengjl19 commented 3 months ago

We found that GPT-4 scored clearly better than GPT-4 Turbo on the math task, while there was little difference between the two on the IF (instruction-following) and coding tasks. So we kept GPT-4 for math and used GPT-4 Turbo for IF and coding to save cost.
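
For readers following along, the per-domain choice described above could be sketched roughly as below. This is only an illustration and not the repository's actual code: the two generate functions are stubbed stand-ins with an assumed single-string signature, and the domain keys ("math", "coding", "if") and the JUDGE_BY_DOMAIN table are hypothetical names introduced here.

```python
from typing import Callable, Dict


# Stubbed stand-ins (assumption): the real helpers in the repo call a model API.
# Here they only tag the prompt so the dispatch can be observed offline.
def gpt4_generate(prompt: str) -> str:
    return f"[gpt-4] {prompt}"


def gpt4_turbo_generate(prompt: str) -> str:
    return f"[gpt-4-turbo] {prompt}"


# Hypothetical lookup table: math keeps the stronger judge, while the
# other two domains use the cheaper one, as explained in the reply above.
JUDGE_BY_DOMAIN: Dict[str, Callable[[str], str]] = {
    "math": gpt4_generate,
    "coding": gpt4_turbo_generate,
    "if": gpt4_turbo_generate,
}


def get_gpt4_score(domain: str, prompt: str) -> str:
    """Route a scoring prompt to the judge configured for the given domain."""
    try:
        judge = JUDGE_BY_DOMAIN[domain]
    except KeyError:
        raise ValueError(f"unknown domain: {domain!r}") from None
    return judge(prompt)
```

Keeping the mapping in one table, rather than in three separate copies of the function, would make it easy to see at a glance which domain uses which judge and to swap judges if the cost or quality trade-off changes.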