aiverify-foundation / moonshot-data

Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)
Apache License 2.0

FlagEval team tries to add the flagjudge model to help judge the clcc dataset. #65

Closed eyuansu62 closed 1 month ago

eyuansu62 commented 1 month ago

Description

The FlagEval team is adding the flagjudge model to help judge the clcc dataset.

Motivation and Context

A judge model provided by the FlagEval team can be used to help judge the output of the clcc dataset automatically.

imda-lionelteo commented 1 month ago

Hi there, thanks for the PR. Will be looking through it.

imda-lionelteo commented 1 month ago

@eyuansu62 Hello, I am encountering an error when running this on the CLI. Can you help me by giving me the command to run it?

imda-lionelteo commented 1 month ago

It is working, and below are the results:

[image] [image]

imda-lionelteo commented 1 month ago

@imda-benedictlee for review

eyuansu62 commented 1 month ago

Thank you for all your hard work!