Enable sandboxing for tool execution. Previously, tools ran in the same runtime environment and thread as the agent control loop, so a faulty or untrusted tool could block, crash, or interfere with the agent itself. This PR introduces the ability to run tools in a sandboxed environment.
We currently support two kinds of sandboxes, E2B and "local":
- The E2B sandbox requires an E2B API key; users can currently configure the `template_id` and `timeout` of the box.
- The local sandbox relies on a local directory and runs Python code in that directory, using the dependencies installed in the current environment.
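To make the local sandbox idea concrete, here is a minimal sketch of running tool code in a child interpreter rooted at a sandbox directory. The helper name `run_in_local_sandbox` and its `timeout` parameter are hypothetical and not taken from this PR; the sketch only assumes that "local" means the current interpreter and its installed dependencies.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_local_sandbox(code: str, sandbox_dir: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter whose working directory is `sandbox_dir`.

    Hypothetical helper, not the PR's implementation. It reuses the current
    interpreter, and therefore the dependencies installed in this environment.
    """
    directory = Path(sandbox_dir).resolve()
    directory.mkdir(parents=True, exist_ok=True)
    # Write the tool source to a temporary script inside the sandbox directory.
    with tempfile.NamedTemporaryFile("w", suffix=".py", dir=directory, delete=False) as handle:
        handle.write(code)
        script_path = Path(handle.name)
    try:
        # A separate process keeps a crashing or hanging tool out of the agent loop.
        return subprocess.run(
            [sys.executable, str(script_path)],
            cwd=directory,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    finally:
        # Remove the temporary script whether or not the tool succeeded.
        script_path.unlink(missing_ok=True)
```

Note that a child process isolates crashes and hangs but is not a security boundary, which is presumably where a remote sandbox such as E2B is the better fit.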
We also add the ability for users to manage sandbox configurations:
- Create configurations for each kind of sandbox (more configuration options to come in the future):

  ```python
  from typing import Optional

  from pydantic import BaseModel, Field

  class LocalSandboxConfig(BaseModel):
      venv_name: str = Field("venv", description="Name of the virtual environment.")
      sandbox_dir: str = Field(..., description="Directory for the sandbox environment.")

  class E2BSandboxConfig(BaseModel):
      timeout: int = Field(5 * 60, description="Time limit for the sandbox (in seconds).")
      template_id: Optional[str] = Field(None, description="The E2B template id (docker image).")
  ```
- Add environment variables per sandbox configuration
There are also some optimizations around E2B: a sandbox is not killed immediately after use, so it can be reused (saving spin-up time), and it is only refreshed when we detect that the user has changed either the config or the environment variables for that box.
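One way to picture the reuse logic is a cache keyed on a hash of the config and environment variables. The sketch below is an assumption about how such a check could work, not the code in this PR; `fingerprint`, `SandboxCache`, and the `create`/`kill` callbacks are hypothetical names, and no real E2B API is called.

```python
import hashlib
import json
from typing import Callable, Dict, Generic, Optional, TypeVar

S = TypeVar("S")


def fingerprint(config: Dict[str, object], env_vars: Dict[str, str]) -> str:
    """Stable hash of a sandbox's config and environment variables."""
    # sort_keys makes the hash independent of dictionary ordering.
    payload = json.dumps({"config": config, "env": env_vars}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SandboxCache(Generic[S]):
    """Keeps one live sandbox and replaces it only when its inputs change."""

    def __init__(
        self,
        create: Callable[[Dict[str, object], Dict[str, str]], S],
        kill: Callable[[S], None],
    ) -> None:
        self._create = create  # hypothetical factory that spins up a sandbox
        self._kill = kill      # hypothetical teardown callback
        self._sandbox: Optional[S] = None
        self._fingerprint: Optional[str] = None

    def get(self, config: Dict[str, object], env_vars: Dict[str, str]) -> S:
        current = fingerprint(config, env_vars)
        if self._sandbox is not None and current == self._fingerprint:
            return self._sandbox  # unchanged inputs: reuse and skip spin-up
        if self._sandbox is not None:
            self._kill(self._sandbox)  # config or env vars changed: discard
        self._sandbox = self._create(config, env_vars)
        self._fingerprint = current
        return self._sandbox
```

Hashing a canonical JSON form keeps the comparison cheap and avoids storing a copy of the environment variables alongside the live sandbox.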
In a separate PR (merged into this one), we also add the ability to modify agent state via tools by serializing the agent state and passing it back and forth between the sandbox and the running Python thread.
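The round trip can be sketched as follows. This is an illustration under stated assumptions: the PR text does not say which serialization format is used, so JSON over stdin/stdout is an assumption here, a plain child process stands in for the sandbox, and `run_tool_with_state`, `SANDBOX_SCRIPT`, and the `tool` function are hypothetical.

```python
import json
import subprocess
import sys
from typing import Any, Dict, Tuple

# Script run inside the stand-in "sandbox" (a child process): read the state,
# let the tool mutate it, and write the result and updated state to stdout.
SANDBOX_SCRIPT = """
import json, sys

def tool(agent_state):  # hypothetical tool that edits agent state
    agent_state["memory"].append("visited sandbox")
    return len(agent_state["memory"])

agent_state = json.load(sys.stdin)
result = tool(agent_state)
json.dump({"result": result, "agent_state": agent_state}, sys.stdout)
"""


def run_tool_with_state(agent_state: Dict[str, Any]) -> Tuple[Any, Dict[str, Any]]:
    """Send `agent_state` to a child process and return (result, updated_state)."""
    completed = subprocess.run(
        [sys.executable, "-c", SANDBOX_SCRIPT],
        input=json.dumps(agent_state),  # serialized state goes in on stdin
        capture_output=True,
        text=True,
        check=True,
    )
    payload = json.loads(completed.stdout)  # updated state comes back on stdout
    return payload["result"], payload["agent_state"]
```

The caller's own copy of the state is untouched until it chooses to adopt the returned one, which keeps the decision to apply a tool's changes on the agent side.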
## Testing
- A suite of unit tests that cover the client functionality (both local and REST)
- A suite of live integration tests with E2B covering happy paths and edge cases, such as the config changing in the middle of execution and the sandbox environment differing from the local one
- Manual testing on the dev portal
## Contributions
Thanks to @carenthomas for contributing, this PR is based off her initial PR!