Closed pluiez closed 2 months ago
In real-world chat applications, malicious actors could exploit this vulnerability by injecting control tokens disguised within Jinja templates. This could potentially allow them to manipulate the AI assistant's behavior, such as redefining system prompts and disrupting conversation flow.
Not a bug nor security. Tokens do NOT have privileges linked to them. Models will output what they output, and there is not hierarchy in the tokens for the model itself. Inserting tokens for users by means of text, is a feature. Limiting them to a subset of tokens might be desirable, but will never be any kind of substantial defense against prompt injection or jailbreak (just look at how easy to "jailbreak" ANY model.)
This repo also does not have jinja as a dependency, nor does it apply templates.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
There is a potential security vulnerability in the
apply_chat_template
function within the Tokenizers library. The current implementation, which leverages Jinja templates and processes the conversation as text before tokenization, introduces a risk of control token injection attacks.In real-world chat applications, malicious actors could exploit this vulnerability by injecting control tokens disguised within Jinja templates. This could potentially allow them to manipulate the AI assistant's behavior, such as redefining system prompts and disrupting conversation flow.
Current Approach and Concerns:
apply_chat_template
function first generates text by applying Jinja templates to the conversation history.Suggested Approach:
<|system_prompt|>
, would then be inserted at appropriate positions based on the context after tokenization.