This pull request introduces the AI remote worker (AI-323) to the AI stack, adding a component that was intentionally omitted from the subnet POC to expedite development. It aims to accelerate AI job processing on the network, which is crucial for scaling the AI subnet with pools and larger orchestrators. Initial prototyping began in late April and resulted in this tech spec, with core development starting in June. The implementation in this pull request was carried out by @ad-astra-video in collaboration with @mikezupper and the Tenderize team to arrive at the best solution for the network. We are grateful for this collaboration, and each party will be credited as a co-author during the merge ❤️🔥.
The implementation builds on the existing remote Transcoder framework. While minor modifications were necessary for AI functionality, the structure remains similar to ensure maintainability and ease of future merging into the main codebase.
Key points of this implementation:
Establishes a new AI remote worker architecture based on the remote Transcoder framework.
Includes modifications specific to AI functionality while maintaining a similar structure to the remote transcoder.
Future improvements and enhancements will be addressed in subsequent pull requests.
Architecture
The flowchart below outlines the new AI remote worker setup, based on the remote transcoder setup in the regular transcoding codebase. More details are available in the Tech Spec.
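Assuming the flow mirrors the remote transcoder setup it is based on, the rough shape is as follows (a sketch only; the tech spec has the authoritative diagram):

```text
Remote AI Worker ──register (persistent connection)──▶ Orchestrator
Gateway ──AI request──▶ Orchestrator ──dispatch job──▶ Remote AI Worker
Remote AI Worker ──runs job on managed model containers──▶ results
results ──▶ Orchestrator ──▶ Gateway
```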
How to Test the AI Remote Worker
To test the AI remote worker, follow these steps:
Clone the Repositories:
Clone the go-livepeer and ai-worker repositories and check out the main branch.
Build the Containers:
Build the containers in the go-livepeer folder and in the ai-worker folder.
Download Test Assets:
Update docker-compose.yml:
Update /temp/data/ to point to your dataDir.
Update /temp/data/model to the path where your AI models are stored. Ensure the models folder has a full path and matches on both sides (see Livepeer documentation).
Create a dedicated network so the gateway and orchestrator share a network; the AI worker must be set to network_mode: "host" for managed containers to work properly.
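A minimal sketch of what those docker-compose.yml changes look like; the network name and mount paths are illustrative, so match them to the compose file in the repo:

```yaml
# Illustrative fragments only — adjust names and paths to your checkout.
networks:
  livepeer-test:

services:
  livepeer-test-orchestrator:
    networks:
      - livepeer-test
    volumes:
      - /temp/data:/data                    # your dataDir
  livepeer-test-gateway:
    networks:
      - livepeer-test
  livepeer-test-aiworker:
    network_mode: "host"                    # required for managed model containers
    volumes:
      - /temp/data/model:/temp/data/model   # full model path, identical on both sides
```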
Update Configurations:
Update the aiModelsDir in the aiworker.conf to the path where your AI models are stored (a sample is sketched after these steps).
If you want to test payments and run on-chain, ensure you have a wallet JSON with a deposit. Otherwise, update the network to off-chain in gateway.conf and orchestrator.conf.
If any tickets are winners, leave the orchestrator unfunded and delete the database to reset everything; that way, no tickets will be claimed.
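For reference, a rough sketch of an off-chain aiworker.conf, assuming the key-value-per-line format of go-livepeer config files; the aiWorker, orchAddr, and orchSecret flags are assumptions to verify against the sample configs in this PR:

```
# aiworker.conf — illustrative values only
aiWorker true                  # assumed flag for remote AI worker mode
network offchain
orchAddr 127.0.0.1:8935        # orchestrator to register with (assumed flag)
orchSecret someSecret          # shared secret with the orchestrator (assumed flag)
aiModelsDir /temp/data/model   # must match the model path mounted above
```

Setting network offchain in gateway.conf and orchestrator.conf likewise keeps the whole setup off-chain.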
Start the Network Nodes:
Open a separate terminal for each node and start them:
Start the orchestrator:
```bash
docker compose up livepeer-test-orchestrator
```
Start the AI worker:
```bash
docker compose up livepeer-test-aiworker
```
Start the gateway:
```bash
docker compose up livepeer-test-gateway
```
Send an Inference Request:
Send an inference request to the gateway to see the remote worker in action:
```bash
curl -X POST http://12.12.0.100:6666/text-to-image -d '{
  "model_id": "stabilityai/sd-turbo",
  "prompt": "a small white kitten on a blue hammock and a palm tree at an abstract ethereal semi-transparent sunny beach among rainbow light impressive skies",
  "negative_prompt": "",
  "guidance_scale": 7,
  "width": 1024,
  "height": 1024,
  "num_inference_steps": 6,
  "num_images_per_prompt": 3
}'
```
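A successful request should return the text-to-image JSON payload with one entry per num_images_per_prompt. The shape below reflects the AI API's typical image response, but treat the exact fields and placeholder values as assumptions:

```json
{
  "images": [
    {
      "url": "<url-to-generated-image>",
      "seed": 123456,
      "nsfw": false
    }
  ]
}
```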
Test Real Inference:
Remove the MOCK_PIPELINE environment variable in docker-compose.yml and download the stabilityai/sd-turbo model to see real inference in action.
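In docker-compose.yml terms, that means dropping the variable from the worker service; the =true value shown here is an assumption about how the mock flag is set:

```yaml
  livepeer-test-aiworker:
    environment:
      # - MOCK_PIPELINE=true   # remove or comment out to run the real pipeline
```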