skypilot-org / skypilot

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.22k stars, 427 forks

Running Docker on RunPod doesn't work #3096

Open okdewit opened 5 months ago

okdewit commented 5 months ago

I used the SkyPilot docs and the Mistral docs to create this YAML:

resources:
  accelerators: RTXA4000:1

run: |
  docker run --gpus all -p 8000:8000 ghcr.io/mistralai/mistral-src/vllm:latest \
                   --host 0.0.0.0 \
                   --model mistralai/Mistral-7B-Instruct-v0.2 \
                   --tensor-parallel-size 1

This spins up the runpod/base:0.0.2 Docker image, and I think it then tries to run Docker within that container, which won't work.

Is there a way to deploy Docker images on RunPod through SkyPilot, or maybe to create a RunPod template through SkyPilot?

7flash commented 4 months ago

https://github.com/skypilot-org/skypilot/issues/3007

kodxana commented 2 months ago

@okdewit This is expected, since RunPod pods are themselves Docker containers. You would probably need to turn it into a template.

romilbhardwaj commented 1 month ago

RunPod does not allow running a Docker daemon in pods, so neither docker run nor image_id: docker:.... will work through SkyPilot.

As @7flash pointed out, #3007 is tracking support for custom templates, which may help with the image_id: docker:.... use case.
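
Until then, one possible workaround (not confirmed anywhere in this thread) is to skip Docker and start the vLLM server directly on the pod. The sketch below assumes the Mistral image does little more than launch vLLM's OpenAI-compatible API server, that `pip install vllm` succeeds on the RunPod base image, and that the chosen GPU has enough memory for the model. None of these assumptions has been verified here.

```yaml
# Sketch only: runs vLLM directly on the pod instead of via `docker run`.
# Assumption: the pod's base image has a Python/CUDA setup that vLLM supports.
resources:
  cloud: runpod
  accelerators: RTXA4000:1

setup: |
  # Install vLLM into the pod's Python environment (assumed to work as-is).
  pip install vllm

run: |
  # Same arguments that the original task passed to the container.
  python -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 \
    --model mistralai/Mistral-7B-Instruct-v0.2 \
    --tensor-parallel-size 1
```

The server listens on port 8000 by default, matching the `-p 8000:8000` mapping in the original command. Whether that port can be reached from outside the pod depends on how RunPod and SkyPilot expose ports, which this sketch does not address.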