kaistAI / FLASK

[ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
https://arxiv.org/abs/2307.10928

Evaluation Code with Prometheus #3

Open Haoxiang-Wang opened 12 months ago

Haoxiang-Wang commented 12 months ago

I am impressed by the FLASK project and your follow-up work, Prometheus. The Prometheus paper includes experiments conducted on FLASK. Could you release the code for running FLASK evaluation with Prometheus as the evaluator? Switching the evaluator from GPT-4 to Prometheus would help researchers reduce FLASK evaluation costs significantly.