deadbits / vigil-llm

⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
https://vigil.deadbits.ai/
Apache License 2.0
270 stars 32 forks source link

Create example detection workflows #53

Open deadbits opened 7 months ago

deadbits commented 7 months ago

Now that Vigil can be imported as a library, it is a lot easier to call the different scanners and interact with canary tokens. This allows users to define custom detection workflows in Python.

I want to create a few cookbooks that demonstrate different possibilities here.

For example: *Canary token workflow

  1. Add canary token to system prompt template
  2. Receive user input to prompt and combine with canary prompt
  3. Send canary prompt to LLM and receive response
  4. Check LLM response for canary token presence
  5. If detection then add user prompt to vector database for future detections