argilla-io / argilla

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
https://docs.argilla.io
Apache License 2.0

[BUG-python/deployment] default docker compose setup does not create a workspace by default #5505

Open MoritzLaurer opened 3 days ago

MoritzLaurer commented 3 days ago

Describe the bug [Not urgent]

I've set up a local argilla deployment with docker compose following these docs: https://docs.argilla.io/latest/getting_started/how-to-deploy-argilla-with-docker/

On the Argilla Spaces on the HF Hub, a default workspace already exists, so I don't have to create one myself. With this local deployment, there is no default workspace, and users have to create one themselves with:

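import argilla as rg
# Connect to the local deployment; the port and API key match the compose file below
client = rg.Argilla(api_url="http://localhost:6900", api_key="argilla.apikey")
workspace_name = "argilla"
workspace_to_create = rg.Workspace(name=workspace_name, client=client)
workspace = workspace_to_create.create()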
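
Until the server image creates a workspace itself, a setup script could make the step above idempotent, so that rerunning it against an existing deployment does not fail. The following is only a sketch: `ensure_workspace` and its `factory` parameter are hypothetical names introduced here, and it assumes that `client.workspaces(name)` returns `None` when no workspace with that name exists, which is how the Argilla 2.x SDK appears to behave.

```python
def ensure_workspace(client, name, factory=None):
    """Return the workspace called `name`, creating it only if it is missing.

    `client` is expected to be an `rg.Argilla` instance. `factory` is a
    hypothetical hook (not part of Argilla) that defaults to `rg.Workspace`.
    """
    # Assumption: looking up a workspace by name yields None when it is absent.
    existing = client.workspaces(name)
    if existing is not None:
        return existing

    if factory is None:
        # Imported lazily so the helper itself has no hard dependency on argilla.
        import argilla as rg

        factory = rg.Workspace

    # Same calls as in the snippet above: build the workspace, then create it.
    return factory(name=name, client=client).create()
```

With the compose file from this issue running, it could be called as `ensure_workspace(rg.Argilla(api_url="http://localhost:6900", api_key="argilla.apikey"), "argilla")`, where the URL follows from the `6900:6900` port mapping.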

Expected behavior

Consistent behaviour with Argilla on HF Hub: having a default workspace as part of the recommended local docker compose setup. Or maybe this is intentional and there is a specific reason for not having a default workspace with the local deployment?

Environment: Argilla version 2.1.0

default docker-compose.yaml:

services:
  argilla:
    image: argilla/argilla-server:latest
    restart: unless-stopped
    ports:
      - "6900:6900"
    environment:
      ARGILLA_HOME_PATH: /var/lib/argilla
      ARGILLA_ELASTICSEARCH: http://elasticsearch:9200

      # HF_HUB_DISABLE_TELEMETRY: 1 # Opt-out for telemetry https://huggingface.co/docs/huggingface_hub/main/en/package_reference/utilities#huggingface_hub.utils.send_telemetry
      # HF_HUB_OFFLINE: 1 # Opt-out for telemetry https://huggingface.co/docs/huggingface_hub/main/en/package_reference/utilities#huggingface_hub.utils.send_telemetry

      USERNAME: argilla
      PASSWORD: 12345678
      API_KEY: argilla.apikey
      # REINDEX_DATASETS: 1 # Uncomment this line to reindex Argilla datasets into the search engine when starting up

    networks:
      - argilla
    volumes:
      # ARGILLA_HOME_PATH is used to define where Argilla will save its application data.
      # If you change ARGILLA_HOME_PATH value please copy that same value to argilladata volume too.
      - argilladata:/var/lib/argilla

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.2
    environment:
      - node.name=elasticsearch
      - cluster.name=es-argilla-local
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - cluster.routing.allocation.disk.threshold_enabled=false
      - xpack.security.enabled=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
    networks:
      - argilla
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - elasticdata:/usr/share/elasticsearch/data/

networks:
  argilla:
    driver: bridge

volumes:
  argilladata:
  elasticdata:

frascuchon commented 2 days ago

Thanks @MoritzLaurer,

Yes, this capability should also be available for the argilla-server image. We will include it in the 2.3.0 release.