BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Feature]: Support setting lakera guardrail thresholds in `/chat/completion` metadata #5182

Open krrishdholakia opened 1 month ago

krrishdholakia commented 1 month ago

The Feature

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-4o",
  "metadata": {
    "guardrails": {"prompt_injection": 0.7, "content_moderation": false}
  },
  "messages": [
    {
      "role": "user",
      "content": "Who won the world cup in 2022?"
    }
  ],
  "stream": false
}'
```
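For reference, a rough Python equivalent of that request using the OpenAI SDK pointed at a litellm proxy (the proxy metadata is passed via `extra_body`). The base URL and API key are placeholders, and the `guardrails` metadata shape is just the one proposed in this issue, not a shipped API:

```python
# Rough Python equivalent of the curl above, sent to a litellm proxy via the
# OpenAI SDK. URL/key are placeholders; the "guardrails" metadata shape is the
# proposal from this issue, not an existing litellm API.
from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000",  # litellm proxy address from the curl example
    api_key="sk-1234",               # placeholder proxy key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Who won the world cup in 2022?"}],
    extra_body={
        "metadata": {
            # float = enable with this threshold, bool = enable/disable
            "guardrails": {"prompt_injection": 0.7, "content_moderation": False},
        }
    },
    stream=False,
)
print(response.choices[0].message.content)
```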

Motivation, pitch

If a team is allowed to enable/disable a guardrail, they should also be allowed to set its threshold.

Twitter / LinkedIn details

@clintz

krrishdholakia commented 1 month ago

It's not clear how this suggested curl request would map onto the guardrails config.

[Screenshot 2024-08-13 at 9 47 34 PM]
krrishdholakia commented 1 month ago

The other approach here looks pretty complicated; it should be simpler, imo.

[Screenshot 2024-08-13 at 9 50 07 PM]
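For discussion, a minimal sketch of what a simpler mapping could look like: a bool in the request metadata toggles a guardrail on/off, a float enables it and overrides its threshold, and anything not mentioned falls back to the server-side defaults. All names here (`GuardrailSetting`, `DEFAULTS`, `resolve_guardrails`) are hypothetical and not litellm's actual implementation:

```python
# Hypothetical sketch of merging per-request guardrail overrides into server
# defaults. Names and structures are illustrative only, not litellm internals.
from dataclasses import dataclass
from typing import Union


@dataclass
class GuardrailSetting:
    enabled: bool
    threshold: float  # e.g. flagging confidence required to block


# Server-side defaults, e.g. loaded from the proxy's guardrails config.
DEFAULTS = {
    "prompt_injection": GuardrailSetting(enabled=True, threshold=0.5),
    "content_moderation": GuardrailSetting(enabled=True, threshold=0.5),
}


def resolve_guardrails(
    request_overrides: dict[str, Union[bool, float]],
) -> dict[str, GuardrailSetting]:
    """Merge per-request metadata overrides into the server defaults.

    A bool toggles the guardrail; a float enables it and sets its threshold,
    mirroring the metadata shape proposed in this issue.
    """
    resolved = dict(DEFAULTS)
    for name, value in request_overrides.items():
        if name not in resolved:
            continue  # ignore unknown guardrail names (or raise, per policy)
        current = resolved[name]
        if isinstance(value, bool):
            resolved[name] = GuardrailSetting(enabled=value, threshold=current.threshold)
        else:
            resolved[name] = GuardrailSetting(enabled=True, threshold=float(value))
    return resolved


# Example: the metadata from the curl above.
print(resolve_guardrails({"prompt_injection": 0.7, "content_moderation": False}))
```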