temporalio / docker-builds

Temporal service Docker images build
https://hub.docker.com/r/temporaliotest/auto-setup
MIT License

[Bug] curl in auto-setup script does not resolve elasticsearch when search domains are present #242

Open rocketraman opened 2 months ago

rocketraman commented 2 months ago

What are you really trying to do?

Run the default docker-compose.yml which uses postgresql and elasticsearch successfully.

Describe the bug

The auto-setup script uses curl to determine if elasticsearch is up. This does not appear to work correctly when the /etc/resolv.conf created by Docker inside the container contains search domains, as it might if they are inherited from the host. Here is my resolv.conf:

# Generated by Docker Engine.
# This file can be edited; Docker Engine will not make further changes once it
# has been modified.

nameserver 127.0.0.11
search home.arpa foo.com
options edns0 trust-ad ndots:0

# Based on host file: '/etc/resolv.conf' (internal resolver)
# ExtServers: [host(127.0.0.53)]
# Overrides: []
# Option ndots from: internal
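To check quickly whether a container is in this situation, something like the following can list the search domains in effect. This is only a sketch: search_domains is a made-up helper name, and it assumes the usual resolv.conf rule that the last search line wins.

```shell
#!/bin/sh
# Print the effective search domains from a resolv.conf-style file, one per line.
# search_domains is a hypothetical helper; it defaults to /etc/resolv.conf.
search_domains() {
    # Remember the last "search" line, then print every field after the keyword.
    awk '$1 == "search" { line = $0 }
         END { n = split(line, parts, " "); for (i = 2; i <= n; i++) print parts[i] }' \
        "${1:-/etc/resolv.conf}"
}
```

Running search_domains inside the temporal container with the file above should print home.arpa and foo.com; no output means no search domains are configured.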

And we can see the behavior of curl when using the non-qualified name elasticsearch:

# curl http://elasticsearch:9200
curl: (6) Could not resolve host: elasticsearch

But appending a . to make it fully-qualified works:

# curl http://elasticsearch.:9200
{
  "name" : "12a8df58f9f5",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "P8wVIJxQTkuTcaNWRNUPRg",
  "version" : {
    "number" : "7.16.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "2b937c44140b6559905130a8650c64dbd0879cfb",
    "build_date" : "2021-12-18T19:42:46.604893745Z",
    "build_snapshot" : false,
    "lucene_version" : "8.10.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Also note that this seems specific to curl: on the same image, ping elasticsearch, nslookup elasticsearch, and dig -t a elasticsearch @127.0.0.11 all work perfectly fine.

# curl --version
curl 8.5.0 (x86_64-alpine-linux-musl) libcurl/8.5.0 OpenSSL/3.1.4 zlib/1.3.1 brotli/1.1.0 c-ares/1.24.0 libidn2/2.3.4 nghttp2/1.58.0
Release-Date: 2023-12-06
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS brotli HSTS HTTP2 HTTPS-proxy IDN IPv6 Largefile libz NTLM SSL threadsafe TLS-SRP UnixSockets

So we can see that curl is using c-ares/1.24.0 for DNS resolution, which means we are probably hitting issue https://github.com/c-ares/c-ares/issues/734, which should be resolved in c-ares 1.31.0. It appears that even Alpine 3.20 does not ship that version yet.
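For anyone who wants to check their own image, a rough sketch of pulling the c-ares version out of curl --version and comparing it with 1.31.0 might look like this. The helper names cares_version and version_lt are hypothetical, and the 1.31.0 threshold is just the fix version mentioned above.

```shell
#!/bin/sh
# Read `curl --version` output on stdin and print the c-ares version, if present.
cares_version() {
    sed -n 's/.*c-ares\/\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed (exit 0) when dotted version $1 is strictly lower than $2.
# Only the first three numeric components are compared.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1
    }'
}
```

Usage would be along the lines of: v=$(curl --version | cares_version); version_lt "$v" 1.31.0 && echo "c-ares $v may be affected". An empty result means curl was not built with c-ares.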

The result is that the temporal container continuously reports:

Waiting for Elasticsearch to start up.

The workaround appears to be to remove the search domains from the container's resolv.conf by adding dns_search: "" to the compose options for the temporal container.
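One way to apply that workaround without editing the stock docker-compose.yml is an override file, roughly as sketched below. This assumes the service is named temporal in the compose file and that the file sits in the current directory; OVERRIDE_FILE is a hypothetical variable used only to make the path adjustable.

```shell
#!/bin/sh
# Write a compose override that clears the search domains for the temporal service.
# Assumption: the service key is "temporal", matching the container named above.
override_file=${OVERRIDE_FILE:-docker-compose.override.yml}

cat > "$override_file" <<'EOF'
services:
  temporal:
    dns_search: ""
EOF
```

Compose reads docker-compose.override.yml automatically when it sits next to docker-compose.yml, so a subsequent docker-compose up should start the temporal container without the inherited search domains.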

However, if other people are hitting this as well, perhaps checking whether ES is up with something other than curl would be a solution?
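Another option that keeps curl would be to make the hostname absolute before probing, since the trailing-dot form resolved fine above. The following is only a sketch, not the actual auto-setup code: absolute_host, wait_for_es, ES_HOST and ES_PORT are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Append a trailing dot to a hostname so the resolver skips search domains.
# IP literals and names that are already absolute are returned unchanged.
absolute_host() {
    case "$1" in
        *.)        printf '%s\n' "$1" ;;    # already ends with a dot
        *:*)       printf '%s\n' "$1" ;;    # looks like an IPv6 literal
        *[!0-9.]*) printf '%s.\n' "$1" ;;   # contains letters etc.: a hostname
        *)         printf '%s\n' "$1" ;;    # digits and dots only: IPv4 literal
    esac
}

# Poll Elasticsearch over HTTP until it answers, using the absolute hostname.
wait_for_es() {
    host=$(absolute_host "${ES_HOST:-elasticsearch}")
    until curl --silent --fail --max-time 5 "http://${host}:${ES_PORT:-9200}" >/dev/null; do
        echo 'Waiting for Elasticsearch to start up.'
        sleep 1
    done
}
```

With the defaults, wait_for_es would probe http://elasticsearch.:9200, the same URL that succeeded in the manual test above. Leaving IP literals untouched matters because 127.0.0.1. is not a valid address.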

Minimal Reproduction

Run docker-compose up on a host whose /etc/resolv.conf contains search domains.

Environment/Versions