FudanDISC / DISC-LawLLM

[Chinese legal large language model] DISC-LawLLM: an intelligent legal system powered by large language models (LLMs) to provide a wide range of legal services.
Apache License 2.0

Eval datasets with answer #13

Closed ssbuild closed 1 year ago

ssbuild commented 1 year ago

Could you provide the eval datasets with answers?

Charlie-XIAO commented 1 year ago

We are planning to release the whole DISC-Law-Eval benchmark, including the evaluation framework and our evaluation dataset. However, it is not ready for release yet. Thanks for your patience.

xref #1

ssbuild commented 1 year ago

Thank you, good job. Can you release the dev dataset on HuggingFace, like C-Eval does, and your own test dataset?

Charlie-XIAO commented 1 year ago

Do you mean the training set? I believe the DISC-Law-SFT dataset is already released (please see the README), except for the QA part, which will not be open-sourced.

ssbuild commented 1 year ago

Not the training dataset, the eval datasets.
C-Eval publishes a public dev eval dataset and has its own test eval dataset for evaluating and ranking LLMs. I mean, could you do the same as them?

Charlie-XIAO commented 1 year ago

From what I know, C-Eval releases datasets on HuggingFace with questions but no answers, and allows you to upload your results to be evaluated. Did I miss anything?

Regardless, what we will release is the evaluation dataset with answers together with the evaluation scripts, so you can run your evaluation locally instead of uploading your results to us.
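
As a rough illustration of what local evaluation against a dataset with answers might look like, here is a minimal sketch. The record fields (`question`, `answer`), the `predict` callback, and the exact-match scoring are all hypothetical and chosen only for the example; they are not the DISC-Law-Eval format or its scripts.

```python
from typing import Callable, Dict, Iterable


def accuracy(records: Iterable[Dict[str, str]],
             predict: Callable[[str], str]) -> float:
    """Fraction of records whose prediction matches the gold answer.

    Assumes (hypothetically) that each record has a "question" and an
    "answer" field, and that answers are compared case-insensitively
    after trimming whitespace.
    """
    records = list(records)
    if not records:
        return 0.0
    correct = sum(
        1
        for r in records
        if predict(r["question"]).strip().upper() == r["answer"].strip().upper()
    )
    return correct / len(records)


# Toy stand-in for an eval set that ships with gold answers.
SAMPLE = [
    {"question": "hypothetical question 1", "answer": "A"},
    {"question": "hypothetical question 2", "answer": "B"},
    {"question": "hypothetical question 3", "answer": "A"},
    {"question": "hypothetical question 4", "answer": "C"},
]


def always_a(question: str) -> str:
    """Placeholder 'model' that always answers A; swap in a real model call."""
    return "a "


if __name__ == "__main__":
    print(f"accuracy = {accuracy(SAMPLE, always_a):.2f}")
```

Because the gold answers are in the records themselves, the score is computed on your own machine and nothing has to be uploaded.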

Charlie-XIAO commented 1 year ago

I’m so sorry but I need to follow the schedule. I promise we will release it as soon as it is done.