
feat: code block execution #1321

Open zabirauf opened 5 months ago

zabirauf commented 5 months ago

Is your feature request related to a problem? Please describe. Executing code lets the tool be used as an ideation and copilot aid for brainstorming in software development. It also allows users, instead of asking the LLM a question and hoping for the correct answer, to actually generate code and run it to get the correct answer.

Eventually having something like a code interpreter, as shared in #851, would be great, but I think that would require many more changes to get right, e.g. some form of function-calling pattern that works across multiple models to pick the code interpreter, and some form of chaining to respond based on those results.

This can be a good stepping stone to eventually build that.

Describe the solution you'd like Have a 'Run Code' button in the code block that sends the code to a new backend API, e.g. /code/api/v1/run, which executes that code in a reasonably secure manner and responds with the result. To start with, we can support only Python code.
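
To illustrate the shape of such an endpoint (this is not the prototype's actual code), here is a minimal sketch assuming FastAPI, which the backend already uses, and naive subprocess execution; the request model and helper names are placeholders:

```python
# Hypothetical sketch only: the route name, request model, and direct subprocess
# execution are illustrative assumptions, not the prototype's implementation.
import subprocess

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter()


class RunCodeForm(BaseModel):
    lang: str  # only "python" supported to start with
    code: str


@router.post("/code/api/v1/run")
def run_code(form: RunCodeForm):
    # WARNING: running code directly on the host is unsafe; isolation is
    # discussed further down in this thread.
    result = subprocess.run(
        ["python3", "-c", form.code],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    }
```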

Additional context

Here is very basic prototype I whipped up: https://github.com/zabirauf/open-webui/pull/1/files Want to get thoughts before I work on it further.

[Screenshot of the prototype in action]

slash-proc commented 5 months ago

Nice work, this looks very promising! Here are my thoughts:

Looking into your implementation, the only thing I don't feel comfortable with is running the code locally without any kind of isolation from the host system. It's conceivable that a user could have code written for them, or just paste code into the chat themselves, have it run in Python through subprocess, and compromise the host system. What I'm about to propose isn't necessarily a simple change, so I humbly request that you add a way to disable code execution in the admin panel.

I believe it would be best to use containers for code execution. They can be tracked, isolated, and then destroyed accordingly. Considering that open-webui can be started locally, in Docker, or in Kubernetes, what I'm proposing has interesting implications for how this feature would need to be implemented.

- Local: could just use the Docker Python bindings, no fuss (a rough sketch of this case follows below)
- Docker: dind + the Docker Python bindings should work
- Podman/etc.: ?
- Kubernetes: a service account + the Kubernetes Python bindings to create/destroy pods should work
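
For the local case, a minimal sketch assuming the Docker SDK for Python (docker-py) could look like the following; the image, resource limits, and helper name are assumptions for illustration, not a proposed final design:

```python
# Hypothetical sketch of the "local" case using the Docker SDK for Python
# (docker-py); the image, limits, and function name are illustrative assumptions.
import docker


def run_python_in_container(code: str) -> str:
    client = docker.from_env()
    # Each execution gets a throwaway container: no network, capped memory,
    # removed automatically once the code finishes.
    output = client.containers.run(
        "python:3.11-slim",
        ["python", "-c", code],
        network_disabled=True,
        mem_limit="256m",
        remove=True,
        stdout=True,
        stderr=True,
    )
    return output.decode()


if __name__ == "__main__":
    print(run_python_in_container("print(2 + 2)"))
```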

zabirauf commented 5 months ago

Agreed on the need to provide code isolation from the host; I'm also not comfortable with the current direct execution :). I was already leaning towards Docker, as it helps both with isolation from the host and with future support for other languages. I'll need to dig a bit more into dind and weigh the pros/cons.

Good idea on also having an admin toggle to disable it. Will also track that.

tjbck commented 5 months ago

Love this suggestion! I'll take a look and see how this could be safely implemented soon!

zabirauf commented 5 months ago

@tjbck We can leverage Piston (https://github.com/engineer-man/piston) for running code in an isolated manner. It's already used in various projects, supports multiple languages, and has a straightforward API.

I updated the prototype to use that (main code here). Here's how I anticipate the complete solution could work (a rough sketch of the API call in step 5 follows the list):

  1. Alongside the Open-WebUI Docker container, also run the Piston Docker container where its API is hosted
  2. Make sure the open-webui container can reach the Piston API running in its own container
  3. Keep a mapping of supported languages and their corresponding versions
  4. When the user tries to execute a code block, detect the language and make sure the correct runtime is installed
  5. Piston takes care of executing the code, and we return its results back to open-webui
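
A rough sketch of step 5, assuming a self-hosted Piston instance reachable from the open-webui container (the hostname/port, Python version, and helper name are assumptions for illustration, not code from the prototype):

```python
# Hypothetical sketch of calling a self-hosted Piston instance; the URL,
# version string, and helper name are assumptions, not the prototype's code.
import requests

PISTON_URL = "http://piston:2000/api/v2/execute"  # assumed container hostname/port


def run_with_piston(language: str, version: str, code: str) -> dict:
    payload = {
        "language": language,
        "version": version,
        "files": [{"name": "main", "content": code}],
    }
    resp = requests.post(PISTON_URL, json=payload, timeout=30)
    resp.raise_for_status()
    # Piston returns a "run" object with stdout, stderr, combined output, and exit code.
    return resp.json()["run"]


if __name__ == "__main__":
    result = run_with_piston("python", "3.10.0", "print('hello from piston')")
    print(result["stdout"])
```
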
shinohara-rin commented 5 months ago

How about supporting sandboxed executions on cloud providers like Modal?

This way we can set security concerns aside and work on the actual code execution feature first, then implement the sandbox locally later.

Taehui commented 5 months ago

In addition to Python, it would be nice to support SQL. I think it would be very useful to many people.

gruckion commented 4 months ago

Nice work. I am compiling research notes on this discussion item and further defining the roadmap for this epic.

https://github.com/open-webui/open-webui/discussions/1629

gruckion commented 4 months ago

> In addition to Python, it would be nice to support SQL. I think it would be very useful to many people.

Please provide more information on how you would like / expect this to function.

Also what limitations can you think of?

YangQiuEric commented 4 months ago

really need this feature lol

tjbck commented 3 months ago

Partially implemented with 0.1.125.

tizkovatereza commented 3 months ago

Hello, have you tried adding E2B for code execution?

It runs code in isolated sandboxes (full VM environments) in the cloud, it's open source, and it has a special SDK for code interpreter use cases.

It also supports any LLM, and here are some open-source examples of how it is used: https://github.com/e2b-dev/e2b-cookbook.

EDIT: Disclaimer, I'm from the E2B team! 😃

bannert1337 commented 3 months ago

> Hello, have you tried adding E2B for code execution?
>
> It runs code in isolated sandboxes (full VM environments) in the cloud, it's open source, and it has a special SDK for code interpreter use cases.
>
> It also supports any LLM, and here are some open-source examples of how it is used: https://github.com/e2b-dev/e2b-cookbook.

Setting up the infrastructure yourself for self-hosting is currently not possible. They use Terraform to deploy the infrastructure, and according to their documentation, "right now it is deployable on GCP only". (1)

Therefore, I tend towards the solution proposed by @zabirauf in this comment.

Open WebUI should be self-hostable by everyone, without external dependencies.

cyrpaut commented 3 months ago

I just tested the latest version and it is very promising, including the ability to plot graphs! Wonderful!

I have a question though, and it may be difficult from an architecture point of view: I'd love the ability to interact with an uploaded file.

To give an example, I would like to upload a CSV file and prompt Python to read it with pandas, manipulate it, and plot it. Ultimately, I would love open-webui to be able to act like the data-science feature of GPT-4 while keeping my data private.
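
For context, the kind of snippet the model would need to generate and run for this workflow might look something like the following (the file name and column names are made-up placeholders):

```python
# Illustrative only: the uploaded file name and column names are placeholders.
import pandas as pd
import matplotlib.pyplot as plt

# Read the uploaded CSV, aggregate it, and plot the result.
df = pd.read_csv("uploaded.csv")
summary = df.groupby("category")["value"].sum()

summary.plot(kind="bar", title="Total value per category")
plt.tight_layout()
plt.savefig("plot.png")
```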

Is that doable in the foreseeable future?

Thanks for the work already done.

justinh-rahb commented 3 months ago

@cyrpaut Pipelines is the solution for that, write a "Code Interpreter" pipeline: https://github.com/open-webui/pipelines

tizkovatereza commented 3 months ago

> I just tested the latest version and it is very promising, including the ability to plot graphs! Wonderful!
>
> I have a question though, and it may be difficult from an architecture point of view: I'd love the ability to interact with an uploaded file.
>
> To give an example, I would like to upload a CSV file and prompt Python to read it with pandas, manipulate it, and plot it. Ultimately, I would love open-webui to be able to act like the data-science feature of GPT-4 while keeping my data private.
>
> Is that doable in the foreseeable future?
>
> Thanks for the work already done.

Hey @cyrpaut, thank you! Happy to hear that.

You can definitely interact with uploaded files.

Here is an example with data upload where the agent uses the E2B code interpreter to analyze the uploaded CSV file.

If you want to try something that has a web UI, E2B is integrated into LlamaIndex as a tool, so you can check this template and follow the installation steps in the readme to try it: https://github.com/run-llama/create-llama

To see E2B integrated into a proper enterprise-level product with a UI, some tools off the top of my head are Athena, Gumloop, or tinybio.

Is this what you asked for? Hope I helped.

T.

justinh-rahb commented 3 months ago

Let's see a Pipelines x E2B integration then @tizkovatereza 🤘

EtiennePerot commented 1 week ago

I have implemented an Open WebUI function for Python and Bash code block execution. It uses gVisor for sandboxing.

Code execution function

You can install it here.

sultanjulyan commented 1 week ago

> I have implemented an Open WebUI function for Python and Bash code block execution. It uses gVisor for sandboxing.
>
> Code execution function
>
> You can install it here.

[screenshot]

So, I ran code from an AI response that has a function to export output to a file. After running the code and getting a successful response, I can't find the file. Where can I find the file that was supposedly saved? @EtiennePerot

EtiennePerot commented 1 week ago

@sultanjulyan In general, for discussion about this tool, please open issues on the tool's repository rather than in this issue.

But to answer your question: the code runs in a sandbox, so all traces of its execution are gone as soon as the code finishes running, and this file no longer exists anywhere. However, it would be quite cool if code that produces files let you download those files straight from the chat UI. That seems like a good feature request, worth filing an issue about, so I filed it here.