allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.7k stars · 657 forks

[ERROR] [CLEARML.auth] Error getting token - FileServer #1308

Closed · ito-innovatrics closed this issue 3 months ago

ito-innovatrics commented 3 months ago

Describe the bug

Unable to access DEBUG SAMPLES .jpeg files; the browser returns 401 Unauthorized, and the docker-fileserver logs:

[9] [ERROR] [CLEARML.auth] Error getting token

To reproduce

Open a .jpeg in DEBUG SAMPLES.

Expected behaviour

Open the file.

Environment

ClearML 1.16.1 (Docker), upgraded from 1.9.0-293

[ Screenshot 2024-08-06 at 10 46 59 AM ]

cedricve commented 2 months ago

Same issue; what was the solution? I can see the files in the fileserver's /mnt directory, but cannot access them through the UI.

[ Screenshot 2024-08-29 at 12 50 51 ] [ Screenshot 2024-08-29 at 12 50 30 ]
jaffe-fly commented 3 weeks ago

I also get this error: only PLOTS show, no images. The browser shows the error below [ image ], and the fileserver Docker logs are:

clearml-fileserver  | [2024-10-30 02:38:52,198] [8] [ERROR] [CLEARML.auth] Error getting token
clearml-fileserver  | [2024-10-30 02:39:51,852] [8] [ERROR] [CLEARML.auth] Error getting token
clearml-fileserver  | [2024-10-30 02:40:43,548] [8] [ERROR] [CLEARML.auth] Error getting token
clearml-fileserver  | [2024-10-30 02:40:48,258] [8] [ERROR] [CLEARML.auth] Error getting token
clearml-fileserver  | [2024-10-30 02:48:12,519] [8] [ERROR] [CLEARML.auth] Error getting token
clearml-fileserver  | [2024-10-30 02:48:18,390] [8] [ERROR] [CLEARML.auth] Error getting token

I deploy ClearML with Docker Compose; before starting docker compose, I run:

export CLEARML_AGENT_ACCESS_KEY=xxxxxx
export CLEARML_AGENT_SECRET_KEY=xxxxxx
export CLEARML_HOST_IP=192.168.3.xxx
export CLEARML_AGENT_GIT_USER=xxxxxx
export CLEARML_AGENT_GIT_PASS=xxxxxx
export CLEARML_API_ACCESS_KEY=xxxxxx
export CLEARML_API_SECRET_KEY=xxxxxx

and the docker compose YAML is:

version: "3.6"

x-resources: &default-resources
  limits:
    memory: 8G
  reservations:
    memory: 4G

services:

  apiserver:
    command:
    - apiserver
    container_name: clearml-apiserver
    image: dockerpull.com/allegroai/clearml:latest
    restart: unless-stopped
    volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/config:/opt/clearml/config
    - /opt/clearml/data/fileserver:/mnt/fileserver
    depends_on:
      - redis
      - mongo
      - elasticsearch
      - fileserver
    environment:
      CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
      CLEARML_ELASTIC_SERVICE_PORT: 9200
      CLEARML_MONGODB_SERVICE_HOST: mongo
      CLEARML_MONGODB_SERVICE_PORT: 27017
      CLEARML_REDIS_SERVICE_HOST: redis
      CLEARML_REDIS_SERVICE_PORT: 6379
      CLEARML_SERVER_DEPLOYMENT_TYPE: linux
      CLEARML__apiserver__pre_populate__enabled: "true"
      CLEARML__apiserver__pre_populate__zip_files: "/opt/clearml/db-pre-populate"
      CLEARML__apiserver__pre_populate__artifacts_path: "/mnt/fileserver"
      CLEARML__services__async_urls_delete__enabled: "true"
      CLEARML__services__async_urls_delete__fileserver__url_prefixes: "[${CLEARML_FILES_HOST:-}]"
      CLEARML__secure__credentials__services_agent__user_key: ${CLEARML_AGENT_ACCESS_KEY:-}
      CLEARML__secure__credentials__services_agent__user_secret: ${CLEARML_AGENT_SECRET_KEY:-}
    ports:
    - "8008:8008"
    networks:
      - backend
      - frontend
    deploy:
      resources: *default-resources

  elasticsearch:
    networks:
      - backend
    container_name: clearml-elastic
    environment:
      bootstrap.memory_lock: "true"
      cluster.name: clearml
      cluster.routing.allocation.node_initial_primaries_recoveries: "500"
      cluster.routing.allocation.disk.watermark.low: 500mb
      cluster.routing.allocation.disk.watermark.high: 500mb
      cluster.routing.allocation.disk.watermark.flood_stage: 500mb
      discovery.type: "single-node"
      http.compression_level: "7"
      node.name: clearml
      reindex.remote.whitelist: "'*.*'"
      xpack.security.enabled: "false"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.18
    restart: unless-stopped
    volumes:
      - /opt/clearml/data/elastic_7:/usr/share/elasticsearch/data
      - /usr/share/elasticsearch/logs
    deploy:
      resources: *default-resources

  fileserver:
    networks:
      - backend
      - frontend
    command:
    - fileserver
    container_name: clearml-fileserver
    image: dockerpull.com/allegroai/clearml:latest
    environment:
      CLEARML__fileserver__delete__allow_batch: "true"
    restart: unless-stopped
    volumes:
    - /opt/clearml/logs:/var/log/clearml
    - /opt/clearml/data/fileserver:/mnt/fileserver
    - /opt/clearml/config:/opt/clearml/config
    ports:
    - "8081:8081"
    deploy:
      resources: *default-resources

  mongo:
    networks:
      - backend
    container_name: clearml-mongo
    image: dockerpull.com/mongo:4.4.29
    restart: unless-stopped
    command: --setParameter internalQueryMaxBlockingSortMemoryUsageBytes=196100200
    volumes:
    - /opt/clearml/data/mongo_4/db:/data/db
    - /opt/clearml/data/mongo_4/configdb:/data/configdb
    deploy:
      resources: *default-resources

  redis:
    networks:
      - backend
    container_name: clearml-redis
    image: dockerpull.com/redis:6.2
    restart: unless-stopped
    volumes:
    - /opt/clearml/data/redis:/data
    deploy:
      resources: *default-resources

  webserver:
    command:
    - webserver
    container_name: clearml-webserver
    # environment:
    #  CLEARML_SERVER_SUB_PATH : clearml-web # Allow Clearml to be served with a URL path prefix.
    image: dockerpull.com/allegroai/clearml:latest
    restart: unless-stopped
    depends_on:
      - apiserver
    ports:
    - "8082:80"
    networks:
      - backend
      - frontend
    deploy:
      resources: *default-resources

  async_delete:
    depends_on:
      - apiserver
      - redis
      - mongo
      - elasticsearch
      - fileserver
    container_name: async_delete
    image: dockerpull.com/allegroai/clearml:latest
    networks:
      - backend
    restart: unless-stopped
    environment:
      CLEARML_ELASTIC_SERVICE_HOST: elasticsearch
      CLEARML_ELASTIC_SERVICE_PORT: 9200
      CLEARML_MONGODB_SERVICE_HOST: mongo
      CLEARML_MONGODB_SERVICE_PORT: 27017
      CLEARML_REDIS_SERVICE_HOST: redis
      CLEARML_REDIS_SERVICE_PORT: 6379
      PYTHONPATH: /opt/clearml/apiserver
      CLEARML__services__async_urls_delete__fileserver__url_prefixes: "[${CLEARML_FILES_HOST:-}]"
    entrypoint:
      - python3
      - -m
      - jobs.async_urls_delete
      - --fileserver-host
      - http://fileserver:8081
    volumes:
      - /opt/clearml/logs:/var/log/clearml
      - /opt/clearml/config:/opt/clearml/config
    deploy:
      resources: *default-resources

  agent-services:
    networks:
      - backend
    container_name: clearml-agent-services
    image: dockerpull.com/allegroai/clearml-agent-services:latest
    deploy:
      restart_policy:
        condition: on-failure
      resources: *default-resources
    privileged: true
    environment:
      CLEARML_HOST_IP: ${CLEARML_HOST_IP}
      CLEARML_WEB_HOST: ${CLEARML_WEB_HOST:-}
      CLEARML_API_HOST: http://apiserver:8008
      CLEARML_FILES_HOST: ${CLEARML_FILES_HOST:-}
      CLEARML_API_ACCESS_KEY: ${CLEARML_AGENT_ACCESS_KEY:-$CLEARML_API_ACCESS_KEY}
      CLEARML_API_SECRET_KEY: ${CLEARML_AGENT_SECRET_KEY:-$CLEARML_API_SECRET_KEY}
      CLEARML_AGENT_GIT_USER: ${CLEARML_AGENT_GIT_USER}
      CLEARML_AGENT_GIT_PASS: ${CLEARML_AGENT_GIT_PASS}
      CLEARML_AGENT_UPDATE_VERSION: ${CLEARML_AGENT_UPDATE_VERSION:->=0.17.0}
      CLEARML_AGENT_DEFAULT_BASE_DOCKER: "ubuntu:18.04"
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-}
      AWS_DEFAULT_REGION: ${AWS_DEFAULT_REGION:-}
      AZURE_STORAGE_ACCOUNT: ${AZURE_STORAGE_ACCOUNT:-}
      AZURE_STORAGE_KEY: ${AZURE_STORAGE_KEY:-}
      GOOGLE_APPLICATION_CREDENTIALS: ${GOOGLE_APPLICATION_CREDENTIALS:-}
      CLEARML_WORKER_ID: "clearml-services"
      CLEARML_AGENT_DOCKER_HOST_MOUNT: "/opt/clearml/agent:/root/.clearml"
      SHUTDOWN_IF_NO_ACCESS_KEY: 1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /opt/clearml/agent:/root/.clearml
    depends_on:
      - apiserver
    entrypoint: >
      bash -c "curl --retry 10 --retry-delay 10 --retry-connrefused 'http://apiserver:8008/debug.ping' && /usr/agent/entrypoint.sh"

networks:
  backend:
    driver: bridge
  frontend:
    driver: bridge

So where do I fix this error?

feicccccccc commented 3 weeks ago

Same issue. I cannot find any documentation related to the fileserver.

jaffe-fly commented 3 weeks ago


Fixed it as shown in this image: [ image ]

feicccccccc commented 3 weeks ago

This approach disables the fileserver auth service. Another fix is to set the correct cookie domain.
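For reference, the "disable fileserver auth" approach is typically done with an environment-variable override on the fileserver service in docker-compose.yml. The override follows ClearML's CLEARML__section__key convention already used elsewhere in this compose file, but treat the exact key name below as an assumption, not confirmed by this thread:

```yaml
# fileserver service in docker-compose.yml (sketch; key name is an assumption)
fileserver:
  environment:
    CLEARML__fileserver__delete__allow_batch: "true"
    # Turn off the fileserver's token check (trades security for access):
    CLEARML__fileserver__auth__enabled: "false"
```

Note that this removes the auth check entirely, so anyone who can reach port 8081 can fetch stored files; the cookie-domain fix is the safer option.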

jaffe-fly commented 3 weeks ago

This approach disables the fileserver auth service. Another fix is to set the correct cookie domain.

How do I set that?

jkhenning commented 3 weeks ago

Hi @jaffe-fly, see here

jaffe-fly commented 3 weeks ago

Self-hosted with no domain, only an IP + port; I still don't know how to set it.

jkhenning commented 2 weeks ago

If that's the case, try using the IP as the value for the cookie domain.
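For anyone else landing here: the cookie-domain setting usually goes in the mounted apiserver config (here /opt/clearml/config/apiserver.conf, which both apiserver and fileserver mount), followed by a restart of the stack. The structure below is a sketch based on ClearML's server config layout, with the thread's redacted IP left as-is:

```hocon
# /opt/clearml/config/apiserver.conf (sketch)
auth {
    cookies {
        # For an IP-only deployment, use the address the browser actually visits:
        domain: "192.168.3.xxx"
    }
}
```

After editing, restart the stack (docker compose down && docker compose up -d) so the apiserver issues auth cookies for that domain and the fileserver can validate them.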