usail-hkust / JailTrickBench

Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)
https://arxiv.org/abs/2406.09324
MIT License
89 stars, 8 forks

Can the jailbreak results be made public? #4

yyytwo closed this issue 1 week ago

zhaoxu98 commented 1 week ago

Hi, thank you for your interest in our work and for raising this issue!

We do plan to make the relevant jailbreak results public. We are currently reviewing the compliance requirements for releasing them. Additionally, some of the results are used in our recent work, and the related annotated datasets are publicly available here:

Feel free to check out some examples or use the dataset if needed!