IS2Lab / S-Eval

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

Extracting prompts based on Risk Category #4

Open reinbugnot opened 3 weeks ago

reinbugnot commented 3 weeks ago

Hi, I'm from the NUS-NCS Cybersecurity Laboratory in Singapore.

I am interested in using the S-Eval dataset in our LLM risk evaluations. The README.md file gives a breakdown of how many prompts are available per risk category (e.g., Access Control, Hacker Attack, and Malicious Code under Cybersecurity).

However, this risk category information is not currently included in the .jsonl files inside s_eval/.

Is there a way to group the prompts according to their risk categories? This would greatly help our use case. Thanks!

[Screenshot: README breakdown of prompt counts per risk category]
zggg1p commented 3 weeks ago

Thanks for your support of our work. S-Eval keeps a detailed record of the risk type each prompt belongs to (102 fine-grained risk subcategories organized across four levels). Currently, we have only released the labels for the first-level risk dimensions. We will release the more fine-grained risk labels in the near future; please stay tuned, and we will notify you as soon as they are available.
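
In the meantime, if you want to bucket the released prompts by their first-level label, a few lines of Python will do it. This is only a minimal sketch: the key `risk_type` is an assumed field name, so please inspect one record from your local copy of the s_eval/ .jsonl files and swap in whichever key actually carries the label.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_prompts_by_risk(jsonl_dir: str, label_key: str = "risk_type") -> dict:
    """Group S-Eval prompts by a risk-label field.

    NOTE: "risk_type" is an assumed field name -- check the keys in
    your local s_eval/*.jsonl records and change `label_key` to the
    key that actually holds the first-level risk dimension.
    """
    groups = defaultdict(list)
    for path in Path(jsonl_dir).glob("*.jsonl"):
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                record = json.loads(line)
                groups[record.get(label_key, "unknown")].append(record)
    return dict(groups)


if __name__ == "__main__":
    groups = group_prompts_by_risk("s_eval")
    for label, records in sorted(groups.items()):
        print(f"{label}: {len(records)} prompts")
```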

If our work is useful for your research, please star ⭐ our project.