Open reinbugnot opened 3 weeks ago
Thanks for your support of our work. S-Eval records the fine-grained risk type of each prompt (102 risk subcategories organized in four levels). Currently, we have released only the labels for the first-level risk dimensions. We will release more fine-grained risk labels in the near future. Please stay tuned, and we will notify you as soon as they are available.
If our work is useful for your research, please star ⭐ our project.
Hi, I'm from NUS-NCS Cybersecurity Laboratory in SG.
I am interested in using the S-Eval dataset in our LLM risk evaluations. The README.md file gives a breakdown of how many prompts are available per Risk Category (e.g. Access Control, Hacker Attack, Malicious Code, etc. under Cybersecurity).
But the risk category information is currently not included in the .jsonl files inside s_eval/. Can I ask if there's a way to group the prompts according to their risk categories? This will greatly help us in our use case. Thanks!
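Until the finer-grained labels are released, grouping by whatever label is already present in each record is one possible stopgap. The following is only a minimal sketch, not the S-Eval authors' method: the field name `risk_type`, the helper names, and the sample records are all assumptions made for illustration, so check the actual keys in the .jsonl files before using it.

```python
import json
from collections import defaultdict
from pathlib import Path

# Hypothetical field name: inspect one record of the real .jsonl files
# to find the key that actually holds the first-level risk label.
LABEL_FIELD = "risk_type"


def group_jsonl_lines(lines, label_field=LABEL_FIELD):
    """Group JSON Lines records by the value stored under label_field."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Records without the label go into a separate bucket.
        groups[record.get(label_field, "UNLABELED")].append(record)
    return dict(groups)


def group_jsonl_file(path, label_field=LABEL_FIELD):
    """Same as group_jsonl_lines, reading the records from a file on disk."""
    with Path(path).open(encoding="utf-8") as fh:
        return group_jsonl_lines(fh, label_field)


# Made-up sample records, only to show the shape of the result.
sample = [
    '{"prompt": "example prompt 1", "risk_type": "Cybersecurity"}',
    '{"prompt": "example prompt 2", "risk_type": "Cybersecurity"}',
    '{"prompt": "example prompt 3"}',
]

grouped = group_jsonl_lines(sample)
for label, records in grouped.items():
    print(label, len(records))
```

With the real data, `group_jsonl_file` would be called on a file under s_eval/ with `label_field` set to the key that is actually used there. This only groups at the granularity of the labels already published; it cannot recover subcategories such as Access Control or Malicious Code.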