deadbits / vigil-llm

⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
https://vigil.deadbits.ai/
Apache License 2.0
304 stars 35 forks source link

LLM Guard model #66

Closed deadbits closed 9 months ago

deadbits commented 10 months ago

Check out LLM Guard's new model https://huggingface.co/laiyer/deberta-v3-base-prompt-injection

The current model deepset/deberta-v3-base-injection is very prone for false positives. LLM Guard developed a new model that "that greatly outperforms the previous state-of-the-art alternatives in the market".

If the results are significantly better, I'll replace the model entirely. Otherwise, maybe Vigil could run both models or let the user decide which to use.

deadbits commented 9 months ago

Merged via https://github.com/deadbits/vigil-llm/pull/74