Add benchmarks and evals for jailbreaks

hegelai / prompttools

Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).

http://prompttools.readthedocs.io

Apache License 2.0

2.65k stars 230 forks source link

Add benchmarks and evals for jailbreaks #51

Open steventkrawczyk opened 1 year ago

steventkrawczyk commented 1 year ago

🚀 The feature

As we add benchmarks, it would be good to cover common jailbreak scenarios. We should incorporate these benchmarks, and have auto-evals that check responses to see if they are "broken"

Motivation, pitch

https://github.com/llm-attacks/llm-attacks

Alternatives

No response

Additional context

No response