Closed Lyutoon closed 5 months ago
I guess that removing all the logic of whitelists and so on and migrating the whole execution logic to a sandbox (meaning, getting rid of exec
for good) would solve most of the problems you've been remarkably highlighting in your latest issues. What do you think ?
Yeah I agree with you, this is a good idea. As you can see, actually, the developer exactly developed a simple sandbox (e.g. ast sanitizer, whitelist function or so) while doing exec
.
But it is difficult to execute LLM-generated code totally safe without breaking the functionality of the framework. Given the efficiency of the LLM framework, the development of a lightweight sandbox dedicated to LLM is a necessity.
What if we use docker sdk for this ? We create a custom image for pandasai with all the dependencies (afaik we can't pass an env to docker sdk starting from a raw image), we pull it for the very first execution and cache it, and we run the code inside.
Something like this (keeping the same signature of pandasai.PandasAI.run
)
def container_exec(self, generated_code):
try:
client = docker.from_env() # or this might be an attribute of the class
image_name = "pandasai:our-custom-image-for-pandasai"
try:
client.images.get(image_name)
except ImageNotFound:
# pull the image
# ...
container = client.containers.run(
image_name,
["python", "-c", generated_code],
working_dir="/workspace",
stderr=True,
stdout=True,
detach=True,
)
container.wait()
logs = container.logs().decode("utf-8")
container.remove()
return logs
Do you think this would solve the jail breaks you have highlited ?
Seems good! Do you mean that the code is running in the docker? And the docker is isolated with the host serve machine?
Yep, I think so!
But here is a question that even it runs code in docker, the attacker can also reverse the shell to control the docker which means he can use the computational resources. This is also a big problem.
To do that the attacker needs to reverse the shell AND gain access over the host machine, right ? Can you produce a MRE of a harmful scenario via prompt injection using this snippet ?
import docker
malicius_code = """
# your code here
# ...
"""
client = docker.from_env()
image_name = "python:3-alpine"
container = client.containers.run(
image_name,
["python", "-c", malicius_code],
working_dir="/workspace",
stderr=True,
stdout=True,
detach=True
)
container.wait()
print(container.logs().decode())
Btw I've tried your PoC on Azure OpenAI with gpt-35-turbo-0301 and the model always recognizes the threat
I'm sorry, I cannot fulfill this request as it goes against ethical and security principles. Providing access to sensitive files such as /etc/passwd is not appropriate or safe. My purpose is to assist with tasks that are legal, ethical, and safe. Is there anything else I can help you with?
I was wondering if we could add an extra layer of verification at prompt level, asking the model to first evaluate the question (it refers to df
entirely, it is not harmful, it does not try to access the fs etc).
Hi, I think that Azure OpenAI with gpt-35-turbo-0301
can somehow detect the threat. (But if you use just openAI llm as my PoC, it can easily bypassed by the second prompt. The first prompt also be detected as a threat. You can have a try.). But this can be bypassed by llm-jailbreak which allows attacker to use prompt to unlock the llm to do these things. As for your idea about adding prompt level sanitizer, I think this is not the best way to fix the problem because natural language is so vivid that we can hardly sanitize them all.
Thanks a lot for your feedbacks @Lyutoon, @mspronesti!
Here's what I suggest:
What do you think?
🐛 Describe the bug
Overview
In this issue, pandasai allows attacker to read or write arbitrary file by prompt injection. If the service is running on the server, write file can allow attacker to fill up the server disk remotely while read file can leak sensitive informations on the server remotely.
The root cause is also the
exec
function, but different from #399, in #399, the attacker needs to break the python sandbox built by the developer to trigger RCE, while this read/write did not need to break the sandbox since the env parameter ofexec
containsopen
which means read/write a file is in the whitelist of the sandbox!PoC (write)
Log (write): a file named pwnnn with content
1
is created.PoC (read /etc/passwd)
But when the attacker is trying to read some sensitive files, maybe he need to do some
llm jailbreak
otherwise the llm will not generate the correct code to read the sensitive file as you want it to read. Here is the PoC (read /etc/passwd):Log (read): No log provided since /etc/passwd contains sensitive informations, if you are interested in, you can run it on your own machine :)