langgenius / dify-sandbox

A lightweight, fast, and secure code execution environment that supports multiple programming languages
https://docs.dify.ai/development/backend/sandbox
Apache License 2.0
339 stars, 71 forks

Discussion: switch the raw python runner with prescript.py implementation to use Jupyter #18

Open dafang opened 1 month ago

dafang commented 1 month ago

The current implementation uses raw Python with exec prescript.py, but there are some issues:

Jupyter, by default, addresses the above issues.

Thoughts:

Open for discussion.

Yeuoly commented 1 month ago

Sounds great, but from what I understand, Jupyter needs a kernel running in the background; it is not a temporary process created for each code execution request. How do you want to implement it?

dafang commented 1 month ago

Yes. I quickly went through the ipykernel source code and ran some tests, but they failed at the seccomp calls.

The lib.DifySeccomp(65537, 1001, 1) call blocks the kernel. I still need some time to figure it out and test.

import time

from jupyter_client import KernelManager

# Start the kernel
km = KernelManager()
km.start_kernel()
kc = km.client()
kc.start_channels()

# Check that the kernel started: block until it answers a kernel_info request
kc.wait_for_ready(timeout=60)

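def execute_code(code):
    msg_id = kc.execute(code)
    result = None
    while True:
        msg = kc.get_iopub_msg()
        # Ignore messages that belong to other requests
        if msg["parent_header"].get("msg_id") != msg_id:
            continue
        msg_type = msg["header"]["msg_type"]
        content = msg["content"]

        if msg_type == "execute_result":
            result = content["data"]["text/plain"]
        elif msg_type == "stream":
            print(content["text"])
        elif msg_type == "error":
            print("\n".join(content["traceback"]))
        elif msg_type == "status" and content["execution_state"] == "idle":
            # The kernel goes idle once the request is fully processed, so stop
            # here instead of hanging when the code yields no execute_result
            break
    return result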

# Example code execution
code = """import ctypes
import json
import os
import sys
import traceback

# setup sys.excepthook
def excepthook(type, value, tb):
    sys.stderr.write("".join(traceback.format_exception(type, value, tb)))
    sys.stderr.flush()
    sys.exit(-1)

sys.excepthook = excepthook

lib = ctypes.CDLL("/var/sandbox/sandbox-python/python.so")
lib.DifySeccomp.argtypes = [ctypes.c_uint32, ctypes.c_uint32, ctypes.c_bool]
lib.DifySeccomp.restype = None

os.chdir("/var/sandbox/sandbox-python")

# lib.DifySeccomp(65537, 1001, 1)

# declare main function here
def main() -> dict:
    return {"message": [1, 2, 3]}

from base64 import b64decode
from json import dumps, loads

# execute main function, and return the result
# inputs is a dict, and it
inputs = b64decode("e30=").decode("utf-8")
output = main(**json.loads(inputs))

# convert output to json and print
output = dumps(output, indent=4)

result = f"{output}"

print(result)
result
"""
result = execute_code(code)
print("Result:", result)

# Shut down the kernel
kc.stop_channels()
km.shutdown_kernel()

Yeuoly commented 1 month ago

By the way, how long does a single request take? Launching a kernel takes some time, I guess. I wonder whether there is a way to keep Jupyter running and waiting for requests so that it could be faster, but that looks like a lot of work.
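
One rough way to put numbers on the launch-cost question is to compare a fresh process per request against a single long-lived process that waits for requests. The sketch below is only a stand-in: it uses a plain Python child process and the standard library rather than a Jupyter kernel, and `WarmWorker`, `run_cold`, and `time_requests` are hypothetical names invented for illustration, not part of dify-sandbox or jupyter_client.

```python
import json
import subprocess
import sys
import time

# Loop run inside the long-lived child: one JSON request per line on stdin,
# one JSON reply per line on stdout. (Stand-in for a kernel, not Jupyter itself.)
WORKER_SOURCE = r"""
import contextlib, io, json, sys
for line in sys.stdin:
    code = json.loads(line)["code"]
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {"__name__": "__main__"})
        reply = {"ok": True, "stdout": buf.getvalue()}
    except Exception as exc:
        reply = {"ok": False, "stdout": buf.getvalue(), "error": repr(exc)}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class WarmWorker:
    """Hypothetical helper: keeps one child interpreter alive across requests."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, code):
        # Send one request line, then block until the matching reply line arrives
        self.proc.stdin.write(json.dumps({"code": code}) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self):
        # Closing stdin ends the child's read loop
        self.proc.stdin.close()
        self.proc.wait()


def run_cold(code):
    """Start a fresh interpreter for a single request (the per-request model)."""
    done = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return {"ok": done.returncode == 0, "stdout": done.stdout}


def time_requests(run, code, n=5):
    """Return the average wall-clock seconds per request over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        run(code)
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    snippet = "print(sum(range(10)))"
    worker = WarmWorker()
    print("cold:", time_requests(run_cold, snippet))
    print("warm:", time_requests(worker.run, snippet))
    worker.close()
```

The warm path skips interpreter startup, but it also shares one process across requests. A seccomp filter cannot be removed once it is installed in a process, so a long-lived worker would probably need to apply the filter in a forked child per request, or be recycled after each request.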