eikek / docspell

Assist in organizing your piles of documents, resulting from scanners, e-mails and other sources with minimal effort.
https://docspell.org
GNU Affero General Public License v3.0

Invalid hostname #2580

Closed daviddavo closed 6 months ago

daviddavo commented 7 months ago

I get the following two errors from the rest server:

2024.04.03 13:42:45:0000 [io-comp...] [ERROR] docspell.pubsub.naive.NaivePubSub.publishRemote:173 - Error publishing jobs-notify message remotely
org.http4s.ember.client.internal.ClientHelpers$MissingOrInvalidHost: Invalid hostname: docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28
    at org.http4s.ember.client.internal.ClientHelpers$.$anonfun$getAddress$2(ClientHelpers.scala:243)
    at cats.ApplicativeError$LiftFromOptionPartially$.apply$extension(ApplicativeError.scala:371)
    at cats.syntax.OptionOps$LiftToPartiallyApplied.apply(option.scala:395)
    at org.http4s.ember.client.internal.ClientHelpers$.getAddress(ClientHelpers.scala:243)
    at org.http4s.ember.client.internal.ClientHelpers$.requestKeyToSocketWithKey(ClientHelpers.scala:103)
    at org.http4s.ember.client.EmberClientBuilder.$anonfun$build$8(EmberClientBuilder.scala:263)
    at org.typelevel.keypool.KeyPool$.$anonfun$take$6(KeyPool.scala:302)
    at scala.Option.fold(Option.scala:263)
    at org.typelevel.keypool.KeyPool$.$anonfun$take$5(KeyPool.scala:302)
    at flatMap @ org.http4s.ember.client.internal.ClientHelpers$.getAddress(ClientHelpers.scala:243)
    at make @ docspell.common.ThreadFactories$.executorResource(ThreadFactories.scala:48)
    at make @ docspell.common.ThreadFactories$.executorResource(ThreadFactories.scala:48)
    at ref @ fs2.concurrent.Topic$.apply(Topic.scala:154)
    at modify @ fs2.internal.Scope.close(Scope.scala:262)
    at onError$extension @ org.typelevel.keypool.KeyPool$Builder.keepRunning$1(KeyPool.scala:370)

2024.04.03 13:42:45:0001 [io-comp...] [ERROR] docspell.pubsub.naive.NaivePubSub.publishRemote:173 - Error publishing jobs-notify message remotely
org.http4s.ember.client.internal.ClientHelpers$MissingOrInvalidHost: Invalid hostname: docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv
    at org.http4s.ember.client.internal.ClientHelpers$.$anonfun$getAddress$2(ClientHelpers.scala:243)
    at cats.ApplicativeError$LiftFromOptionPartially$.apply$extension(ApplicativeError.scala:371)
    at cats.syntax.OptionOps$LiftToPartiallyApplied.apply(option.scala:395)
    at org.http4s.ember.client.internal.ClientHelpers$.getAddress(ClientHelpers.scala:243)
    at org.http4s.ember.client.internal.ClientHelpers$.requestKeyToSocketWithKey(ClientHelpers.scala:103)
    at org.http4s.ember.client.EmberClientBuilder.$anonfun$build$8(EmberClientBuilder.scala:263)
    at org.typelevel.keypool.KeyPool$.$anonfun$take$6(KeyPool.scala:302)
    at scala.Option.fold(Option.scala:263)
    at org.typelevel.keypool.KeyPool$.$anonfun$take$5(KeyPool.scala:302)
    at flatMap @ org.http4s.ember.client.internal.ClientHelpers$.getAddress(ClientHelpers.scala:243)
    at make @ docspell.common.ThreadFactories$.executorResource(ThreadFactories.scala:48)
    at make @ docspell.common.ThreadFactories$.executorResource(ThreadFactories.scala:48)
    at ref @ fs2.concurrent.Topic$.apply(Topic.scala:154)
    at modify @ fs2.internal.Scope.close(Scope.scala:262)
    at onError$extension @ org.typelevel.keypool.KeyPool$Builder.keepRunning$1(KeyPool.scala:370)

They say that the hostnames docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28 and docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv are invalid, but they DO exist and are accessible from the container that hosts the rest server:

$ container_name=docspell_restserver.1.o6zkb13y053m47wqirxasvdst
$ docker exec $container_name getent hosts docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28
10.0.5.30         docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28  docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28
$ docker exec $container_name getent hosts docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv
10.0.5.31         docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv  docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv
$ docker exec $container_name ping -c4 docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28
PING docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28 (10.0.5.30): 56 data bytes
64 bytes from 10.0.5.30: seq=0 ttl=64 time=0.113 ms
64 bytes from 10.0.5.30: seq=1 ttl=64 time=0.120 ms
64 bytes from 10.0.5.30: seq=2 ttl=64 time=0.136 ms
64 bytes from 10.0.5.30: seq=3 ttl=64 time=0.139 ms

--- docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.113/0.127/0.139 ms
$ docker exec $container_name ping -c4 docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv
PING docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv (10.0.5.31): 56 data bytes
64 bytes from 10.0.5.31: seq=0 ttl=64 time=0.668 ms
64 bytes from 10.0.5.31: seq=1 ttl=64 time=0.712 ms
64 bytes from 10.0.5.31: seq=2 ttl=64 time=0.776 ms
64 bytes from 10.0.5.31: seq=3 ttl=64 time=0.787 ms

--- docspell_joex.2.lypgx6bu8y3a4qziifxrw6brv ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.668/0.735/0.787 ms
eikek commented 7 months ago

Thanks for reporting. If I remember correctly, host names should not contain underscores; or is this outdated and allowed now? (It seems so, since you have it working :)). I think the library doesn't like the underscore in the host name.
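
To illustrate the rule being discussed here: under the classic hostname syntax (RFC 952, relaxed by RFC 1123), each dot-separated label may contain only ASCII letters, digits, and hyphens, and may not begin or end with a hyphen. The sketch below is a hypothetical standalone check in Python, not the code the http4s client runs; the function name `is_valid_hostname` is made up for illustration. The stack trace above shows the error being raised while lifting an empty `Option` in `getAddress`, which is consistent with the name being rejected at parse time, before any DNS lookup, though that is an inference from the trace rather than something confirmed in this thread.

```python
import re

# One hostname label per RFC 952 / RFC 1123: 1 to 63 characters,
# letters, digits and hyphens only, no leading or trailing hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Hypothetical helper: purely syntactic check, no DNS lookup involved."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot (fully qualified form)
    if not name or len(name) > 253:
        return False
    return all(_LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    # Swarm task name from the error message: the first label contains "_".
    print(is_valid_hostname("docspell_joex.1.kzj0mvt4ty2xwzqyhuufxwl28"))  # False
    # A hyphenated name passes the same check.
    print(is_valid_hostname("docspell-joex"))  # True
```

This would explain why `getent hosts` and `ping` succeed while the client still reports the name as invalid: resolving a name and accepting its syntax are separate steps.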

daviddavo commented 7 months ago

Underscores are used as a prefix on domain names that are not hostnames (for example, in TXT and SRV records).

https://domainkeys.sourceforge.net/underscore.html

But using one in a hostname does indeed violate the RFCs (RFC 952 and RFC 1123 allow only letters, digits, and hyphens in hostname labels). Here's the related issue from docker:

It seems there are many libraries affected... I'll try using a network alias for the service instead, and check how it works.

Note that this only affects how quickly new documents get processed: without the notification they are still picked up after a while, thanks to the periodic checks.

eikek commented 7 months ago

Yes, underscores are fine in domain names but not in hostnames, as I thought. I think this can then wait until the docker issue is resolved.

daviddavo commented 6 months ago

I think I solved it by using network aliases. This is my current, over-complicated docker-compose file for a docker swarm:

version: "3.8"
volumes:
  docspell-postgres_data:
  docspell-solr_data:
networks:
  overlay_proxy:
    external: True
    driver: overlay
  # Communicate with containers deployed
  # through other means
  overlay_docspell:
    external: True
    driver: overlay
services:
  # The restserver and joex containers defined here are configured
  # using env variables. Both must connect to the same database and
  # solr instance. More information on configuring can be found here:
  # https://docspell.org/docs/configure
  #
  # Please replace the values of the following with a custom secret
  # string:
  #
  # - DOCSPELL_SERVER_ADMIN__ENDPOINT_SECRET
  # - DOCSPELL_SERVER_AUTH_SERVER__SECRET
  # - DOCSPELL_SERVER_INTEGRATION__ENDPOINT_HTTP__HEADER_HEADER__VALUE
  #   (use the same value at the consumedir container!)
  #
  # After creating an account, you may want to set signup mode to
  # "closed" or to "invite". When using invite, you must also set
  # DOCSPELL_SERVER_BACKEND_SIGNUP_NEW__INVITE__PASSWORD to some
  # secret.
  restserver:
    image: docspell/restserver:latest
    restart: unless-stopped
    networks:
      overlay_proxy:
      overlay_docspell:
        aliases:
          - docspell-restserver
    # ports:
    #   - "7880:7880"
    environment:
      - TZ=Europe/Berlin
      - DOCSPELL_SERVER_APP__ID=rest-{{.Node.Hostname}}
      - DOCSPELL_SERVER_INTERNAL__URL=http://docspell-restserver:7880
      - DOCSPELL_SERVER_ADMIN__ENDPOINT_SECRET=...
      - DOCSPELL_SERVER_AUTH_SERVER__SECRET=
      - DOCSPELL_SERVER_BACKEND_JDBC_PASSWORD=...
      - DOCSPELL_SERVER_BACKEND_JDBC_URL=jdbc:postgresql://docspell_db:5432/dbname
      - DOCSPELL_SERVER_BACKEND_JDBC_USER=dbuser
      # - "DOCSPELL_SERVER_BIND_ADDRESS=::"
      - DOCSPELL_SERVER_BIND_ADDRESS=0.0.0.0
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
      - DOCSPELL_SERVER_INTEGRATION__ENDPOINT_ENABLED=true
      - DOCSPELL_SERVER_INTEGRATION__ENDPOINT_HTTP__HEADER_ENABLED=true
      - DOCSPELL_SERVER_INTEGRATION__ENDPOINT_HTTP__HEADER_HEADER__VALUE=...
      - DOCSPELL_SERVER_BACKEND_SIGNUP_MODE=invite
      - DOCSPELL_SERVER_BACKEND_SIGNUP_NEW__INVITE__PASSWORD=...
      - DOCSPELL_SERVER_BACKEND_ADDONS_ENABLED=false
    depends_on:
      - solr
      - db

  joex:
    image: docspell/joex:latest
    ## For more memory add corresponding arguments, like below. Also see
    ## https://docspell.org/docs/configure/#jvm-options
    # command:
    #   - -J-Xmx2G
    restart: unless-stopped
    networks:
      overlay_docspell:
        aliases:
          - docspell-joex
    environment:
      - TZ=Europe/Berlin
      - DOCSPELL_JOEX_APP__ID=joex-{{.Node.Hostname}}
      - DOCSPELL_JOEX_PERIODIC__SCHEDULER_NAME=joex-{{.Node.Hostname}}
      - DOCSPELL_JOEX_SCHEDULER_NAME=joex-{{.Node.Hostname}}
      # - DOCSPELL_JOEX_BASE__URL=http://{{.Task.Name}}:7878
      - DOCSPELL_JOEX_BASE__URL=http://docspell-joex:7878
      # - "DOCSPELL_JOEX_BIND_ADDRESS=::"
      - DOCSPELL_JOEX_BIND_ADDRESS=0.0.0.0
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
      - DOCSPELL_JOEX_JDBC_PASSWORD=...
      - DOCSPELL_JOEX_JDBC_URL=jdbc:postgresql://docspell_db:5432/dbname
      - DOCSPELL_JOEX_JDBC_USER=dbuser
      - DOCSPELL_JOEX_ADDONS_EXECUTOR__CONFIG_RUNNER=docker,trivial
      - DOCSPELL_JOEX_CONVERT_HTML__CONVERTER=weasyprint
      # - DOCSPELL_JOEX_TEXT__ANALYSIS_NLP_MODE=regexonly
      - DOCSPELL_JOEX_SCHEDULER_POOL__SIZE=2
    # ports:
    #   - "7878:7878"
    depends_on:
      - solr
      - db
    deploy:
      mode: replicated
      replicas: 3
      placement:
        max_replicas_per_node: 1
        constraints:
          - node.labels.transcode_worker == true
      resources:
        limits:
          cpus: '2'
    ## Uncomment when using the "docker" runner with addons
    # volumes:
    #   - /var/run/docker.sock:/var/run/docker.sock
    #   - /tmp:/tmp

  # Use the following command to check number of connections
  # psql -U dbuser dbname -c 'select COUNT(*) from pg_stat_activity'
  db:
    image: postgres:16.2
    container_name: postgres_db
    restart: unless-stopped
    networks:
      - overlay_docspell
    volumes:
      - docspell-postgres_data:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=dbuser
      - POSTGRES_PASSWORD=...
      - POSTGRES_DB=dbname
    command: "-c max_connections=200"
    deploy:
      placement:
        constraints:
          - node.labels.has_docspell_db == true
    healthcheck:
      # test: ["CMD-SHELL", "pg_isready -ddbname -U dbuser"]
      test: ["CMD-SHELL", "psql -U dbuser dbname -c 'select COUNT(*) from pg_stat_activity'"]
      interval: 10s
      timeout: 5s
      retries: 5

  solr:
    image: solr:9
    container_name: docspell-solr
    restart: unless-stopped
    networks:
      - overlay_docspell
    volumes:
      - docspell-solr_data:/var/solr
    command:
      - bash
      - -c
      - 'precreate-core docspell; exec solr -f -Dsolr.modules=analysis-extras'
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8983/solr/docspell/admin/ping"]
      interval: 1m
      timeout: 10s
      retries: 2
      start_period: 30s
# vim: noai:ts=2:sw=2
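
One way to confirm that the aliases from the file above resolve is to repeat the earlier `getent hosts` check against them. The following is a minimal sketch in Python, assuming it runs in some container attached to `overlay_docspell` that has Python available (the docspell images may not); the helper name `resolve` is hypothetical, and the ports are the ones used in the compose file.

```python
import socket


def resolve(host: str, port: int) -> list[str]:
    """Return the sorted, de-duplicated addresses the system resolver reports."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Aliases and ports taken from the compose file above.
    for alias, port in [("docspell-joex", 7878), ("docspell-restserver", 7880)]:
        try:
            print(alias, resolve(alias, port))
        except socket.gaierror as err:
            print(alias, "did not resolve:", err)
```

Which addresses come back for `docspell-joex` depends on how the swarm publishes the service, so the output is not shown here.
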
eikek commented 6 months ago

Nice! So I think this issue can be closed, right?