LeapfrogAI

Production-ready Generative AI for local, cloud native, airgap, and edge deployments.

https://leapfrog.ai
Apache License 2.0

Overview

LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped environments. This project aims to bring sophisticated AI solutions to air-gapped, resource-constrained environments by enabling the hosting of all requisite components of an AI stack.

Our services include vector databases, model backends, an API, and a UI. These capabilities can be easily accessed and integrated with your existing infrastructure, ensuring the power of AI can be harnessed irrespective of your environment's limitations.

Why Host Your Own LLM?

Large Language Models (LLMs) are a powerful resource for AI-driven decision making, content generation, and more. How can LeapfrogAI bring AI to your mission?

Structure

The LeapfrogAI repository follows a monorepo structure based around an API, with each component in a dedicated packages directory. Each package directory contains the source code for that component as well as its deployment infrastructure. The UDS bundles that handle the dev and latest deployments of LeapfrogAI are in the uds-bundles directory. The structure looks as follows:

leapfrogai/
├── src/
│   ├── leapfrogai_api/
│   │   ├── main.py
│   │   └── ...
│   └── leapfrogai_sdk/
├── packages/
│   ├── api/
│   ├── llama-cpp-python/
│   ├── text-embeddings/
│   ├── vllm/
│   └── whisper/
├── uds-bundles/
│   ├── dev/
│   └── latest/
├── Makefile
├── pyproject.toml
├── README.md
└── ...

Getting Started

The preferred method for running LeapfrogAI is a local Kubernetes deployment using UDS. Refer to the Quick Start section of the LeapfrogAI documentation site for instructions on this type of deployment.

Components

API

LeapfrogAI provides an API that closely matches that of OpenAI. This allows tools built against OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend.
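
For example, any tool that uses the official OpenAI Python client can be pointed at a LeapfrogAI deployment by overriding the client's base URL. The sketch below uses hypothetical values: the endpoint URL, API key, and model name are placeholders to replace with those from your own deployment.

# Minimal sketch: point the OpenAI Python client at a LeapfrogAI API.
# The base_url, api_key, and model values are placeholders, not shipped defaults.
from openai import OpenAI

client = OpenAI(
    base_url="https://leapfrogai-api.example.com/openai/v1",  # placeholder endpoint
    api_key="my-leapfrogai-api-key",                          # placeholder key
)

completion = client.chat.completions.create(
    model="llama-cpp-python",  # placeholder; use a backend deployed in your environment
    messages=[{"role": "user", "content": "Hello, LeapfrogAI!"}],
)
print(completion.choices[0].message.content)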

Backends

LeapfrogAI provides several backends for a variety of use cases. Available backends (support for each is tracked across AMD64, ARM64, CUDA, Docker, Kubernetes, and Zarf readiness):

llama-cpp-python 🚧
whisper 🚧
text-embeddings 🚧
vllm
rag (repo integration soon)

Image Hardening

LeapfrogAI leverages Chainguard's apko to harden base Python images, pinning Python versions to the latest version supported by the other components of the LeapfrogAI stack.

SDK

The LeapfrogAI SDK provides a standard set of protobuf definitions and Python utilities for implementing gRPC backends.

User Interface

LeapfrogAI provides a User Interface with support for common use cases such as chat, summarization, and transcription.

Repeater

The repeater "model" is a basic "backend" that parrots all inputs it receives back to the user. It is built out the same way the actual backends are and is primarily used for testing the API.
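
Because the repeater echoes its input verbatim, a test can assert on an exact round trip through the API. A minimal sketch, assuming a local deployment; the base URL, API key, and "repeater" model name are assumptions, not guaranteed values:

# Illustrative API test against the repeater backend.
# The base_url, api_key, and "repeater" model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/openai/v1", api_key="test")  # placeholders

echo = client.chat.completions.create(
    model="repeater",  # assumed name for the repeater backend
    messages=[{"role": "user", "content": "ping"}],
)
assert "ping" in echo.choices[0].message.content  # the repeater parrots the input back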

Usage

UDS (Latest)

LeapfrogAI can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. This pulls the most recent package images and is the most stable way of running a local LeapfrogAI deployment. These instructions can be found on the LeapfrogAI Docs site.

UDS (Dev)

If you want to make some changes to LeapfrogAI before deploying via UDS (for example in a dev environment), you can follow these instructions:

Make sure your system has the required dependencies.

For ease, it's best to create a virtual environment:

python -m venv .venv
source .venv/bin/activate

Each component is built into its own Zarf package. You can build all of the packages you need at once with the following Make targets:

make build-cpu    # api, llama-cpp-python, text-embeddings, whisper
make build-gpu    # api, vllm, text-embeddings, whisper
make build-all    # all of the backends

OR

You can build components individually using the following Make targets:

make build-api
make build-vllm                 # if you have GPUs
make build-llama-cpp-python     # if you have CPU only
make build-text-embeddings
make build-whisper

Once the packages are created, you can deploy either a CPU or GPU-enabled deployment via one of the UDS bundles:

CPU

cd uds-bundles/dev/cpu
uds create .
uds deploy k3d-core-slim-dev:0.22.2
uds deploy uds-bundle-leapfrogai-*.tar.zst --confirm

GPU

cd uds-bundles/dev/gpu
uds create .
uds deploy k3d-core-slim-dev:0.22.2 --set K3D_EXTRA_ARGS="--gpus=all --image=ghcr.io/justinthelaw/k3d-gpu-support:v1.27.4-k3s1-cuda"     # be sure to check if a newer version exists
uds deploy uds-bundle-leapfrogai-*.tar.zst --confirm

Accessing the UI

LeapfrogAI is integrated with the UDS Core Keycloak service, which provides authentication via SSO. Below are general instructions for accessing the LeapfrogAI UI after a successful UDS deployment of UDS Core and LeapfrogAI.

  1. Connect to the Keycloak admin panel
     a. Run the following to get a port-forwarded tunnel: uds zarf connect keycloak
     b. Go to the resulting localhost URL and create an admin account
  2. Go to ai.uds.dev and press "Login using SSO"
  3. Register a new user by pressing "Register Here"
  4. Fill in all of the information
     a. The bot detection requires you to scroll and click around in a natural way, so if the Register button is not activated despite correct information, try moving around the page until the bot detection says 100% verified
  5. Using an authenticator, follow the MFA steps
  6. Go to sso.uds.dev
     a. Login using the admin account you created earlier
  7. Approve the newly registered user
     a. Click on the hamburger menu in the top left to open/close the sidebar
     b. Go to the dropdown that likely says "Keycloak" and switch to the "uds" context
     c. Click "Users" in the sidebar
     d. Click on the newly registered user's username
     e. Go to the "Email Verified" switch and toggle it to "Yes"
     f. Scroll to the bottom and press "Save"
  8. Go back to ai.uds.dev and login as the registered user to access the UI

Clean-up

To clean-up or perform a fresh install, run the following commands in the context in which you had previously installed UDS Core and LeapfrogAI:

k3d cluster delete uds  # deletes the running UDS cluster
uds zarf tools clear-cache # clears the Zarf tool cache
rm -rf ~/.uds-cache # clears the UDS cache
docker system prune -a -f # removes all unused containers and images
docker volume prune -f # removes all unused container volumes

Local Dev

The following instructions are for running each of the LeapfrogAI components for local development. This is useful when testing changes to a specific component, but will not produce a full deployment of LeapfrogAI. Please refer to the sections above for deployment instructions.

It is highly recommended to make a virtual environment to keep the development environment clean:

python -m venv .venv
source .venv/bin/activate

API

To run the LeapfrogAI API locally (starting from the root directory of the repository):

python -m pip install src/leapfrogai_sdk
cd src/leapfrogai_api
python -m pip install .
uvicorn leapfrogai_api.main:app --port 3000 --log-level debug --reload
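
Once uvicorn is running, you can sanity-check the server from another shell. A quick sketch, assuming the FastAPI default /docs route is exposed on port 3000 as configured above:

# Liveness check for the locally running API; assumes the FastAPI default
# /docs route and the port 3000 used in the uvicorn command above.
import urllib.request

with urllib.request.urlopen("http://localhost:3000/docs") as resp:
    print(resp.status)  # expect 200 if the API is up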

Repeater

The instructions for running the basic repeater model (used for testing the API) can be found in the package README.

Backend: llama-cpp-python

To run the llama-cpp-python backend locally (starting from the root directory of the repository):

python -m pip install src/leapfrogai_sdk
cd packages/llama-cpp-python
python -m pip install ".[dev]"
python scripts/model_download.py
mv .model/*.gguf .model/model.gguf
cp config.example.yaml config.yaml # Make any necessary updates
lfai-cli --app-dir=. main:Model

Backend: text-embeddings

To run the text-embeddings backend locally (starting from the root directory of the repository):

python -m pip install src/leapfrogai_sdk
cd packages/text-embeddings
python -m pip install ".[dev]"
python scripts/model_download.py
python -u main.py

Backend: vllm

To run the vllm backend locally (starting from the root directory of the repository):

python -m pip install src/leapfrogai_sdk
cd packages/vllm
python -m pip install ".[dev]"
python src/model_download.py
export QUANTIZATION=gptq
python -u src/main.py

Backend: whisper

To run the whisper backend locally (starting from the root directory of the repository):

python -m pip install src/leapfrogai_sdk
cd packages/whisper
python -m pip install ".[dev]"
ct2-transformers-converter --model openai/whisper-base --output_dir .model --copy_files tokenizer.json --quantization float32
python -u main.py
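
Once a whisper backend is reachable through the LeapfrogAI API, transcription can be exercised with the OpenAI client's audio endpoint. This is a sketch only; the base URL, API key, "whisper" model name, and audio file are placeholders:

# Sketch of a transcription request through the OpenAI-compatible API.
# The base_url, api_key, model name, and audio file are all placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://leapfrogai-api.example.com/openai/v1", api_key="my-api-key")

with open("meeting.wav", "rb") as audio:  # placeholder audio file
    transcript = client.audio.transcriptions.create(model="whisper", file=audio)
print(transcript.text)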

Community

LeapfrogAI is supported by a community of users and contributors, including:

Defense Unicorns, Beast Code, Hypergiant, and Pulze

Want to add your organization or logo to this list? Open a PR!