ist-dresden / composum-AI

Artificial intelligence services for the Composum Pages CMS and Adobe AEM : content creation and analysis, translation, suggestions, ...
http://ai.composum.com/
MIT License
3 stars 1 forks source link

Use "put words into the AI's mouth" pattern to make AI service more resistant to prompt injection #8

Closed stoerr closed 1 year ago

stoerr commented 1 year ago

One of the troublesome and difficult problems with large language models is the prompt injection problem: if a text is to be processed, there is currently no completely reliable way to make sure the AI doesn't treat parts of the text as instructions, if they are worded that way. This is especially troublesome if the text comes from a third person, e.g. when processing customer emails.

We employ an interesting technique I came up with recently, which constructs the chat sent to ChatGPT so that it looks like ChatGPT retrieved the text - which makes it less likely to follow any instructions contained in there since, after all, those aren't the user's instructions, but something it said by itself, right?

Another idea I had recently is also implemented when appropriate: if an example contains instructions that are ignored as requested, then that makes it also a bit more resistant. (Somewhat like "innoculation").

The following keyword generation template for instance shows 4 techniques to reduce prompt injection:

stoerr commented 1 year ago

Was done with https://github.com/ist-dresden/composum-AI/pull/7 .