allegroai / clearml-agent

ClearML Agent - ML-Ops made easy. ML-Ops scheduler & orchestration solution
https://clear.ml/docs/
Apache License 2.0
229 stars 89 forks

Use agent with dind #199

Open Chelovek760 opened 2 months ago

Chelovek760 commented 2 months ago

dind is not working at the moment.

After the worker container is built, the following error is raised:

cp: -r not specified; omitting directory '/tmp/clearml.conf'

Dockerfile

FROM docker:20.10.10-dind

WORKDIR /app
RUN apk add --no-cache python3 py3-pip \
    && ln -sf python3 /usr/bin/python

RUN python3 -m venv /venv
ENV PATH="/venv/bin:$PATH"

RUN apk add --no-cache gcc musl-dev python3-dev linux-headers

RUN pip install --upgrade pip && pip install clearml-agent

ENTRYPOINT ["clearml-agent"]

docker compose


version: "3.9"

services:
  clear-ml-agent:
    env_file:
      - .env
    build:
      context: .
    command: daemon --queue default --docker --foreground
    runtime: nvidia
    privileged: true
    volumes:
     - /var/run/docker.sock:/var/run/docker.sock
     - ./config/clearml.conf:/root/clearml.conf

Is there a solution to this problem?

jkhenning commented 2 months ago

This means the volume mount ./config/clearml.conf:/root/clearml.conf is pointing to a directory, not a file.
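
One quick way to check this on the host is sketched below. It is only an illustration: `classify_mount_source` is a hypothetical helper, not part of clearml-agent. Docker creates an empty directory at the host path when a bind-mount source given in the short host:container form does not exist, so a missing file can turn into a directory the first time the container is started.

```python
import os


def classify_mount_source(path: str) -> str:
    """Return 'file', 'directory', or 'missing' for a bind-mount source path."""
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    return "missing"


if __name__ == "__main__":
    # Path taken from the compose file in this thread; it is resolved
    # relative to the directory this script is run from.
    source = "./config/clearml.conf"
    kind = classify_mount_source(source)
    print(f"{source}: {kind}")
    if kind != "file":
        print("Expected a regular file; create it before starting the container.")
```

If the result is "directory", removing that directory and putting the real clearml.conf file in its place before starting the container again should make the mount point at a file.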

Chelovek760 commented 2 months ago

@jkhenning could you please suggest the correct way to transfer the configuration?

Chelovek760 commented 3 weeks ago

I found a solution: you need to mount the host's /tmp into the agent container.

services:
  clear-ml-agent:
    env_file:
      - .env
    build:
      context: .
    command: daemon --queue default --docker --foreground
    runtime: nvidia
    privileged: true
    volumes:
     - /var/run/docker.sock:/var/run/docker.sock
     - ./config/clearml.conf:/root/clearml.conf
     - /tmp:/tmp
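
A possible explanation, stated as an assumption and not as confirmed agent behaviour: because /var/run/docker.sock is mounted, the task containers are started by the host's Docker daemon, so any bind-mount source the agent passes is resolved on the host and not inside the agent container. The error message suggests the agent hands over a file under /tmp. Without a shared /tmp, that path is missing on the host and ends up as a directory. The sketch below checks a list of compose-style volume strings for that shared mount; `parse_volume` and `shares_host_path` are hypothetical helpers, not clearml-agent functions.

```python
from typing import List, Tuple


def parse_volume(entry: str) -> Tuple[str, str]:
    """Split a short-syntax entry 'host:container[:mode]' into (host, container)."""
    parts = entry.split(":")
    if len(parts) < 2:
        raise ValueError(f"not a host:container volume entry: {entry!r}")
    return parts[0], parts[1]


def shares_host_path(volumes: List[str], path: str) -> bool:
    """True if `path` is bind-mounted at the same location on host and container."""
    wanted = path.rstrip("/") or "/"
    for entry in volumes:
        host, container = parse_volume(entry)
        # Ignore trailing slashes so '/tmp/' and '/tmp' compare equal.
        if (host.rstrip("/") or "/") == wanted and (container.rstrip("/") or "/") == wanted:
            return True
    return False


if __name__ == "__main__":
    # Volume entries copied from the compose file above.
    volumes = [
        "/var/run/docker.sock:/var/run/docker.sock",
        "./config/clearml.conf:/root/clearml.conf",
        "/tmp:/tmp",
    ]
    print(shares_host_path(volumes, "/tmp"))
```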