instructor-ai / instructor

structured outputs for llms
https://python.useinstructor.com/
MIT License

[SECURITY ISSUE] Breaking change by adding jinja templates also introduces code execution #1132

Open palako opened 1 week ago

palako commented 1 week ago

We've been pinned to an old version of instructor because breaking bugs keep being introduced. Yesterday we tried the latest release to see if things were any better, and to our surprise things broke quite severely. It seems that version 1.5 switched to using jinja templates by default, meaning all of our prompts are now treated as jinja templates.

Our prompts already come from jinja templates. On top of that, our application often includes file contents in prompts, some of which are jinja, and we use the cookiecutter library, which uses jinja variables as filenames, so even adding a filename to a prompt now produces a failure.
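To illustrate the kind of breakage described above, here is a minimal sketch that calls jinja2 directly rather than going through instructor. The prompt strings and the `project_slug` / `project_name` variables are made up for the example, and it assumes the prompt text is rendered as a template with no matching variables supplied.

```python
from jinja2 import Template
from jinja2.exceptions import UndefinedError

# Hypothetical prompt that merely mentions a cookiecutter-style path.
prompt = "Please review the file {{cookiecutter.project_slug}}/setup.py"

try:
    Template(prompt).render()
    outcome = "rendered"
except UndefinedError as exc:
    # 'cookiecutter' is not in the render context, so the attribute
    # lookup on it raises instead of leaving the text alone.
    outcome = f"failed: {exc}"

print(outcome)

# A bare undefined name does not raise; with jinja2's default settings
# it silently renders as an empty string, changing the prompt text.
silently_changed = Template("Rename {{ project_name }}.py").render()
print(repr(silently_changed))
```

Either way the text that reaches the model is no longer the text the caller wrote: one case raises, the other quietly drops part of the filename.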

Given the above, things are too broken for us to use the new version at all, so we need to pin back to something prior to 1.5. There doesn't seem to be an option to turn off the new templating functionality.

Besides the broken behaviour, this now means that anything that makes it into a template is evaluated and rendered as jinja, whose syntax allows Python-style expressions. Because user input is not sanitized, this is now a severe security problem that allows code execution for anyone passing user input to instructor. It is even more severe because the change was introduced silently: applications using instructor were never expected to make their inputs jinja safe, and they haven't been told to start doing so either.
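As a deliberately benign sketch of why this matters, the snippet below uses jinja2 directly with its default, non-sandboxed `Template` class. It does not go through instructor, and it does not establish which jinja environment instructor actually uses; the `user_input` string is a made-up example of attacker-controlled text.

```python
from jinja2 import Template

# Hypothetical user-supplied text that ends up inside a prompt.
user_input = "{{ 7 * 7 }} and {{ ''.__class__.__name__ }}"
prompt = f"Summarize this note: {user_input}"

# If the whole prompt is rendered as a template, the expressions in the
# user's text are evaluated instead of being passed through verbatim.
rendered = Template(prompt).render()
print(rendered)
```

The arithmetic is evaluated, and the second expression reaches a Python object's class through attribute access. Attribute access of that kind on Python internals is the usual starting point for template-injection attacks against unsandboxed jinja environments.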

The request is for the jinja functionality to be either completely removed until it is implemented properly or, at the very least, disabled by default and only enabled when explicitly requested. The template API should be optional and completely separate from the way messages were parsed before.
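One way to picture the opt-in behaviour being requested is the sketch below. `build_content` and its `context` parameter are hypothetical names chosen for illustration, not part of instructor's API: text is passed through untouched unless the caller explicitly supplies template variables.

```python
from typing import Any, Mapping, Optional

from jinja2 import Template


def build_content(text: str, context: Optional[Mapping[str, Any]] = None) -> str:
    """Return message text, rendering it as jinja only when asked to.

    Hypothetical helper: templating happens only if `context` is given.
    """
    if context is None:
        # Default path: no templating, the text is used verbatim.
        return text
    # Opt-in path: the caller has declared that `text` is a template.
    return Template(text).render(**context)


# Plain text containing jinja-looking syntax survives unchanged.
plain = build_content("Path: {{cookiecutter.slug}}/README.md")
# Templating only happens when variables are passed explicitly.
templated = build_content("Hello {{ name }}", {"name": "Ada"})
print(plain)
print(templated)
```

With this shape, existing callers keep the old behaviour, and only code that deliberately passes a context takes on the responsibility of keeping untrusted input out of the template string.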

gokturkDev commented 1 day ago

@jxnl
If this is truly the case, please add prominent disclaimers to the repo and docs!

ivanleomk commented 5 hours ago

@palako could you show an example of this? Just trying to understand what an exploit might look like.