janhq / cortex.cpp

Local AI API Platform
https://cortex.so
Apache License 2.0

epic: Cortex.cpp to support Python? #1353

Open dan-homebrew opened 1 month ago

dan-homebrew commented 1 month ago

Goal

Tasklist

Previous Discussions

nguyenhoangthuan99 commented 1 month ago

cortex.python Integration Architecture

Overview

This document outlines the architecture for integrating Python functionality into a C++ application, specifically for running machine learning models. The system uses a proxy approach to connect the C++ application (cortex.cpp) with Python processes, allowing for isolated environments for different models.

Architecture Diagram

(architecture diagram: cortex.cpp connects to per-model Python processes through the cortex.python proxy engine)

Key Components

  1. cortex.cpp: The main C++ application.
  2. cortex.python: A proxy engine that connects cortex.cpp with Python processes (a rough interface sketch follows this list).
  3. Python Processes: Separate processes spawned for each model execution.
  4. Virtual Environments: Isolated Python environments for each model.
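To make the proxy role concrete, below is a minimal sketch of what the engine interface exposed by libengine.so might look like. The class name and method signatures here are hypothetical illustrations, not the actual cortex.cpp engine API:

// Hypothetical interface that cortex.cpp could call through the proxy engine.
#include <functional>
#include <string>

class PythonEngineI {
 public:
  virtual ~PythonEngineI() = default;

  // Create the venv, pull Python, code, and model, and install dependencies.
  virtual void PullModel(const std::string& model_id) = 0;

  // Spawn a Python process running main.py inside the model's venv.
  virtual void LoadModel(const std::string& model_id) = 0;

  // Forward a chat request to the Python process (e.g. over a socket) and
  // stream responses back through the callback.
  virtual void HandleChatCompletion(
      const std::string& model_id, const std::string& request_json,
      std::function<void(const std::string&)> on_response) = 0;
};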

Folder Structure

cortexcpp/
├── models/
│   └── cortexso/
│       └── python/
│           └── whisper/
│               ├── model-binary.pth
│               ├── whisper.py
│               ├── main.py
│               └── requirements.txt
└── engines/
    ├── cortex.llamacpp/
    └── cortex.python/
        ├── libengine.so  # proxy interface between cortex.cpp and Python models
        └── venv/         # virtual environments, one per model
            ├── whisper/
            │   ├── lib/  # Python libraries and dependencies for whisper
            │   └── bin/
            │       └── python3.12  # Python executable for whisper
            ├── fish-speech/
            └── vision/

Processes

Model Pulling

  1. Request goes from cortex.cpp to cortex.python.
  2. Create a virtual environment for the model.
  3. Pull a Python runtime for the created virtual environment.
  4. Pull the code and model from cortexso.
  5. Install dependencies into the virtual environment: /path/to/venv/bin/python -m pip install -r requirements.txt

The model pulling step also needs to install the engine for running the Python model; the engine (or backend) for a Python model is simply all of the libraries and dependencies inside its virtual environment.
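As a rough illustration of steps 2 and 5, the proxy engine could drive the pull flow with ordinary subprocess invocations. This is a minimal sketch, assuming a POSIX system and that a standalone Python runtime has already been downloaded to python_dir; the function and path names are hypothetical and error handling is omitted:

// Hypothetical sketch of the venv-creation and dependency-install steps.
#include <cstdlib>
#include <string>

void PullModel(const std::string& python_dir, const std::string& venv_dir,
               const std::string& model_dir) {
  // Step 2: create an isolated virtual environment for this model.
  std::system((python_dir + "/bin/python3 -m venv " + venv_dir).c_str());

  // Step 5: install the model's dependencies into that venv only, using the
  // venv's own interpreter so nothing leaks into the system Python.
  std::system((venv_dir + "/bin/python -m pip install -r " + model_dir +
               "/requirements.txt").c_str());
}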

Model Execution

  1. Request goes from cortex.cpp to cortex.python.
  2. cortex.python spawns a new process.
  3. Run main.py in the appropriate virtual environment (i.e. the engine/backend for that model).
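A minimal sketch of steps 2 and 3, assuming POSIX fork/exec and the folder layout shown earlier; the helper name is hypothetical:

// Spawn main.py under the model's own interpreter so imports resolve against
// the venv's site-packages rather than the system Python.
#include <sys/types.h>
#include <unistd.h>
#include <string>

pid_t SpawnModelProcess(const std::string& venv_dir,
                        const std::string& model_dir) {
  pid_t pid = fork();
  if (pid == 0) {
    const std::string python = venv_dir + "/bin/python3.12";
    const std::string script = model_dir + "/main.py";
    execl(python.c_str(), python.c_str(), script.c_str(),
          static_cast<char*>(nullptr));
    _exit(127);  // only reached if exec failed
  }
  return pid;  // parent keeps the pid so the process can be managed later
}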

Chat Functionality

  1. Request goes from cortex.cpp to cortex.python.
  2. cortex.python communicates with the Python process via WebSocket, Unix domain socket, or a similar IPC mechanism.
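For illustration, a single request/response round trip over a Unix domain socket might look like the sketch below. The socket path and wire format are assumptions, since the issue leaves the exact transport open:

// Hypothetical blocking exchange with a Python process that is assumed to be
// listening on a Unix domain socket.
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <cstring>
#include <string>

std::string SendChatRequest(const std::string& socket_path,
                            const std::string& request_json) {
  int fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) return {};

  sockaddr_un addr{};
  addr.sun_family = AF_UNIX;
  std::strncpy(addr.sun_path, socket_path.c_str(), sizeof(addr.sun_path) - 1);
  if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
    close(fd);
    return {};
  }

  // Send the request, then read a single (non-streaming) response.
  write(fd, request_json.data(), request_json.size());
  char buf[4096];
  ssize_t n = read(fd, buf, sizeof(buf));
  close(fd);
  return n > 0 ? std::string(buf, n) : std::string();
}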

Implementation Details

Python Interface

Virtual Environments

Packaged Python

Model Execution

dan-homebrew commented 1 month ago

@nguyenhoangthuan99 @vansangpfiev @namchuai I would like to raise a concern here, and propose a (possibly incorrect) alternative:

Engines as First-Class Citizens of Cortex

This has the following benefits:

How this would work

prabirshrestha commented 1 week ago

I would suggest using uv. It is an extremely fast Python package and project manager, written in Rust.

Then you can even do something like this:

uv run --with mlx-lm \
  mlx_lm.generate \
  --model mlx-community/Qwen2.5-Coder-32B-Instruct-8bit \
  --max-tokens 4000 \
  --prompt 'write me a python function that renders a mandelbrot fractal as wide as the current terminal'

There are a lot of Python projects related to LLMs, so being able to use those packages directly would be a big help.

These might be of interest: