dstackai / dstack

dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, & TPU.
https://dstack.ai/docs
Mozilla Public License 2.0
1.58k stars, 155 forks

[Bug]: `dstack apply` won't work inside `dstackai/dstack` image #2019

Closed by un-def 21 hours ago

un-def commented 4 days ago

Steps to reproduce

Start a dstackai/dstack container:

# docker run --rm -it --entrypoint bash dstackai/dstack:0.18.26

Inside the container:

# cd ~ && mkdir demo && cd demo

# dstack config --url http://172.20.0.1:3000 --token <TOKEN> --project main

# dstack init
OK

# echo '
type: task
commands:
  - nvidia-smi
  - nvidia-smi dmon -d 1
resources:
  gpu: nvidia:1
' > .dstack.yml

# dstack apply -y
 Project                main
 User                   admin
 Configuration          .dstack.yml
 Type                   task
 Resources              2..xCPU, 1GB.., 1xGPU, 0GB.. (disk)
 Max price              -
 Max duration           72h
 Spot policy            on-demand
 Retry policy           no
 Creation policy        reuse-or-create
 Termination policy     destroy-after-idle
 Termination idle time  5m

 #  BACKEND  REGION  INSTANCE  RESOURCES                                       SPOT  PRICE
 1  ssh      remote  instance  32xCPU, 31GB, 1xRTX3070Ti (8GB), 32.8GB (disk)  no    $0     idle

rotten-otter-1 provisioning completed (running)
Traceback (most recent call last):
  File "/usr/local/bin/dstack", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dstack/_internal/cli/main.py", line 81, in main
    args.func(args)
  File "/usr/local/lib/python3.11/site-packages/dstack/_internal/cli/commands/apply.py", line 77, in _command
    configurator.apply_configuration(
  File "/usr/local/lib/python3.11/site-packages/dstack/_internal/cli/services/configurators/run.py", line 198, in apply_configuration
    if run.attach(bind_address=bind_address):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dstack/api/_public/runs.py", line 283, in attach
    ports_lock = SSHAttach.reuse_ports_lock(run_name=name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/dstack/_internal/core/services/ssh/attach.py", line 50, in reuse_ports_lock
    ps = subprocess.Popen(("ps", "-A", "-o", "command"), stdout=subprocess.PIPE)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/local/lib/python3.11/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'ps'

Actual behaviour

No response

Expected behaviour

No response

dstack version

any

Server logs

No response

Additional information

The dstackai/dstack image is based on python:3.x-slim, which is in turn based on debian:bookworm. That base does not include the ps executable, because the procps package that provides it is not installed.
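For illustration, the missing binary can be detected before `subprocess.Popen` is called, so that the CLI fails with a readable message instead of a traceback. This is only a sketch: `require_executable` is a hypothetical helper, not part of dstack, and the error text is an assumption.

```python
import shutil


def require_executable(name: str) -> str:
    """Return the full path of `name`, or raise with a readable hint.

    Hypothetical helper for illustration; not part of dstack.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"'{name}' was not found on PATH "
            "(on Debian-based images, ps is provided by the procps package)"
        )
    return path
```

A caller would run `require_executable("ps")` before spawning the process. This improves the error message but does not remove the dependency on ps.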

Consider replacing the ps | grep pipeline (and its Windows PowerShell-based equivalent) with a cross-platform Python library that doesn't rely on external libraries or binaries. On Linux, for example, procfs can be read directly.
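A minimal sketch of the procfs approach, assuming Linux only. It lists process command lines roughly as `ps -A -o command` would, without calling any external binary. The function name `iter_command_lines` and the `proc_root` parameter are hypothetical and are not taken from the dstack code base.

```python
import os
from typing import Iterator


def iter_command_lines(proc_root: str = "/proc") -> Iterator[str]:
    """Yield the command line of every process listed under proc_root.

    Linux-only sketch; proc_root is a parameter so the function can be
    pointed at a fake directory tree in tests.
    """
    for entry in os.listdir(proc_root):
        # Process directories are named after their numeric PID.
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            # The process may have exited, or access may be denied.
            continue
        if not raw:
            # Kernel threads have an empty cmdline; skip them.
            continue
        # Arguments are NUL-separated, usually with a trailing NUL.
        args = raw.rstrip(b"\0").split(b"\0")
        yield " ".join(arg.decode(errors="replace") for arg in args)
```

The grep step then becomes an ordinary Python filter, for example `[c for c in iter_command_lines() if "ssh" in c]`. Windows and macOS would still need a different code path.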