I am trying to set up a basic 3-node cluster with minimal changes to the Helm values.
However, all nodes keep failing with errors like these:
++ hostname
3/20/2021 1:10:18 PM + exec /cockroach/cockroach start --join=k-preprod-cockroachdb-0.k-preprod-cockroachdb.k-db.svc.cluster.local:26257,k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257,k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257 --advertise-host=k-preprod-cockroachdb-0.k-preprod-cockroachdb.k-db.svc.cluster.local --cluster-name=k-preprod --logtostderr=INFO --certs-dir=/cockroach/cockroach-certs/ --http-port=8080 --port=26257 --cache=25% --max-sql-memory=25% --locality=country=us,region=west,state=washington,city=seattle
3/20/2021 1:10:19 PM I210320 20:10:19.908769 1 util/log/flags.go:116 stderr capture started
3/20/2021 1:10:19 PM I210320 20:10:19.921024 1 cli/start.go:1168 ⋮ ‹CockroachDB CCL v20.2.6 (x86_64-unknown-linux-gnu, built 2021/03/15 16:04:08, go1.13.14)›
3/20/2021 1:10:19 PM I210320 20:10:19.987607 1 util/cgroups/cgroups.go:460 ⋮ running in a container; setting GOMAXPROCS to 1
3/20/2021 1:10:20 PM I210320 20:10:20.007727 1 server/config.go:428 ⋮ system total memory: ‹256 MiB›
3/20/2021 1:10:20 PM I210320 20:10:20.008056 1 server/config.go:430 ⋮ server configuration:
3/20/2021 1:10:20 PM ‹max offset 500000000›
3/20/2021 1:10:20 PM ‹cache size 64 MiB›
3/20/2021 1:10:20 PM ‹SQL memory pool size 64 MiB›
3/20/2021 1:10:20 PM ‹scan interval 10m0s›
3/20/2021 1:10:20 PM ‹scan min idle time 10ms›
3/20/2021 1:10:20 PM ‹scan max idle time 1s›
3/20/2021 1:10:20 PM ‹event log enabled true›
3/20/2021 1:10:20 PM I210320 20:10:20.008395 1 cli/start.go:965 ⋮ using local environment variables: ‹COCKROACH_CHANNEL=kubernetes-helm›
3/20/2021 1:10:20 PM I210320 20:10:20.008515 1 cli/start.go:972 ⋮ process identity: ‹uid 0 euid 0 gid 0 egid 0›
3/20/2021 1:10:20 PM I210320 20:10:20.079634 1 cli/start.go:511 ⋮ GEOS loaded from directory ‹/usr/local/lib/cockroach›
3/20/2021 1:10:20 PM I210320 20:10:20.080034 1 cli/start.go:516 ⋮ starting cockroach node
3/20/2021 1:10:20 PM I210320 20:10:20.081760 37 rpc/tls.go:270 ⋮ [n?] server certificate addresses: ‹IP=127.0.0.1; DNS=localhost,k-preprod-cockroachdb-0.k-preprod-cockroachdb.k-db.svc.cluster.local,k-preprod-cockroachdb-0.k-preprod-cockroachdb,k-preprod-cockroachdb-public,k-preprod-cockroachdb-public.k-db.svc.cluster.local; CN=node›
3/20/2021 1:10:20 PM I210320 20:10:20.082084 37 rpc/tls.go:319 ⋮ [n?] web UI certificate addresses: ‹IP=127.0.0.1; DNS=localhost,k-preprod-cockroachdb-0.k-preprod-cockroachdb.k-db.svc.cluster.local,k-preprod-cockroachdb-0.k-preprod-cockroachdb,k-preprod-cockroachdb-public,k-preprod-cockroachdb-public.k-db.svc.cluster.local; CN=node›
3/20/2021 1:10:20 PM I210320 20:10:20.105411 37 vendor/github.com/cockroachdb/pebble/version_set.go:142 ⋮ [n?] [JOB 1] MANIFEST created 000001
3/20/2021 1:10:20 PM I210320 20:10:20.109789 37 vendor/github.com/cockroachdb/pebble/open.go:295 ⋮ [n?] [JOB 1] WAL created 000002
3/20/2021 1:10:20 PM I210320 20:10:20.179600 48 vendor/github.com/cockroachdb/pebble/table_stats.go:118 ⋮ [n?] [JOB 2] all initial table stats loaded
3/20/2021 1:10:20 PM I210320 20:10:20.384074 37 server/server.go:790 ⋮ [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
3/20/2021 1:10:20 PM I210320 20:10:20.402901 37 vendor/github.com/cockroachdb/pebble/compaction.go:1561 ⋮ [n?] [JOB 1] flushing: sstable created 000004
3/20/2021 1:10:20 PM I210320 20:10:20.411344 37 vendor/github.com/cockroachdb/pebble/open.go:295 ⋮ [n?] [JOB 1] WAL created 000005
3/20/2021 1:10:20 PM I210320 20:10:20.424900 37 vendor/github.com/cockroachdb/pebble/version_set.go:442 ⋮ [n?] [JOB 1] MANIFEST created 000006
3/20/2021 1:10:20 PM I210320 20:10:20.484591 37 vendor/github.com/cockroachdb/pebble/compaction.go:2300 ⋮ [n?] [JOB 1] WAL deleted 000002
3/20/2021 1:10:20 PM I210320 20:10:20.485025 37 vendor/github.com/cockroachdb/pebble/compaction.go:2307 ⋮ [n?] [JOB 1] MANIFEST deleted 000001
3/20/2021 1:10:20 PM I210320 20:10:20.485303 37 server/config.go:619 ⋮ [n?] 1 storage engine‹› initialized
3/20/2021 1:10:20 PM I210320 20:10:20.485482 37 server/config.go:622 ⋮ [n?] ‹Pebble cache size: 64 MiB›
3/20/2021 1:10:20 PM I210320 20:10:20.485592 37 server/config.go:622 ⋮ [n?] ‹store 0: RocksDB, max size 0 B, max open file limit 1043576›
3/20/2021 1:10:20 PM I210320 20:10:20.486129 85 vendor/github.com/cockroachdb/pebble/table_stats.go:118 ⋮ [n?] [JOB 2] all initial table stats loaded
3/20/2021 1:10:20 PM I210320 20:10:20.486348 86 vendor/github.com/cockroachdb/pebble/compaction.go:1371 ⋮ [n?] [JOB 3] compacting L0 [000004] (1.0 K) + L6 [] (0 B)
3/20/2021 1:10:20 PM I210320 20:10:20.491032 86 vendor/github.com/cockroachdb/pebble/compaction.go:1410 ⋮ [n?] [JOB 3] compacted L0 [000004] (1.0 K) + L6 [] (0 B) -> L6 [000004] (1.0 K), in 0.0s, output rate 120 M/s
3/20/2021 1:10:20 PM I210320 20:10:20.492244 37 util/log/log.go:50 ⋮ initial startup completed
3/20/2021 1:10:20 PM Node will now attempt to join a running cluster, or wait for `cockroach init`.
3/20/2021 1:10:20 PM Client connections will be accepted after this completes successfully.
3/20/2021 1:10:20 PM Check the log file(s) for progress.
3/20/2021 1:10:20 PM I210320 20:10:20.492517 37 server/init.go:208 ⋮ [n?] no stores bootstrapped
3/20/2021 1:10:20 PM I210320 20:10:20.492657 37 server/init.go:209 ⋮ [n?] awaiting `cockroach init` or join with an already initialized node
3/20/2021 1:10:20 PM W210320 20:10:20.591246 98 vendor/google.golang.org/grpc/internal/channelz/logging.go:73 ⋮ ‹grpc: addrConn.createTransport failed to connect to {k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host". Reconnecting...›
3/20/2021 1:10:20 PM W210320 20:10:20.591708 96 server/init.go:436 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:20 PM W210320 20:10:20.599864 109 vendor/google.golang.org/grpc/internal/channelz/logging.go:73 ⋮ ‹grpc: addrConn.createTransport failed to connect to {k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host". Reconnecting...›
3/20/2021 1:10:20 PM W210320 20:10:20.600290 96 server/init.go:436 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:21 PM W210320 20:10:21.611737 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:22 PM W210320 20:10:22.644896 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:23 PM W210320 20:10:23.652398 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:24 PM W210320 20:10:24.686188 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:25 PM W210320 20:10:25.609230 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:26 PM W210320 20:10:26.607472 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:27 PM W210320 20:10:27.612530 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:28 PM W210320 20:10:28.609428 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:29 PM W210320 20:10:29.608309 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:30 PM W210320 20:10:30.610507 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:31 PM W210320 20:10:31.612021 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:32 PM W210320 20:10:32.609341 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:33 PM W210320 20:10:33.608949 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:34 PM W210320 20:10:34.608625 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:35 PM W210320 20:10:35.607813 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:36 PM W210320 20:10:36.608642 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:37 PM W210320 20:10:37.614025 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:38 PM W210320 20:10:38.620759 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:39 PM W210320 20:10:39.772168 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:40 PM W210320 20:10:40.634994 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:42 PM W210320 20:10:42.471977 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:42 PM W210320 20:10:42.872882 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:43 PM W210320 20:10:43.611940 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:44 PM W210320 20:10:44.607079 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:45 PM W210320 20:10:45.702508 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:46 PM W210320 20:10:46.609487 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:47 PM W210320 20:10:47.608628 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:48 PM W210320 20:10:48.607700 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:49 PM W210320 20:10:49.612551 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:50 PM W210320 20:10:50.492364 248 cli/start.go:497 ⋮ The server appears to be unable to contact the other nodes in the cluster. Please try:
3/20/2021 1:10:50 PM
3/20/2021 1:10:50 PM - starting the other nodes, if you haven't already;
3/20/2021 1:10:50 PM - double-checking that the '--join' and '--listen'/'--advertise' flags are set up correctly;
3/20/2021 1:10:50 PM - running the 'cockroach init' command if you are trying to initialize a new cluster.
3/20/2021 1:10:50 PM
3/20/2021 1:10:50 PM If problems persist, please see ‹https://www.cockroachlabs.com/docs/v20.2/cluster-setup-troubleshooting.html›.
3/20/2021 1:10:50 PM W210320 20:10:50.636496 250 vendor/google.golang.org/grpc/internal/channelz/logging.go:73 ⋮ ‹grpc: addrConn.createTransport failed to connect to {k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host". Reconnecting...›
3/20/2021 1:10:50 PM W210320 20:10:50.636674 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-2.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
3/20/2021 1:10:51 PM W210320 20:10:51.608271 254 vendor/google.golang.org/grpc/internal/channelz/logging.go:73 ⋮ ‹grpc: addrConn.createTransport failed to connect to {k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257 <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host". Reconnecting...›
3/20/2021 1:10:51 PM W210320 20:10:51.608411 96 server/init.go:474 ⋮ [n?] outgoing join rpc to ‹k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup k-preprod-cockroachdb-1.k-preprod-cockroachdb.k-db.svc.cluster.local: no such host"›
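For reference, the hostnames in the `--join` list above appear to follow the usual StatefulSet pattern `<pod>.<headless-service>.<namespace>.svc.<clusterDomain>`. Below is a minimal Python sketch, assuming that pattern and using the names from the log. The helper names `peer_hostnames` and `resolves` are made up for illustration and are not part of CockroachDB or the chart. It rebuilds the three peer names and checks whether a given name resolves from wherever the script runs:

```python
import socket


def peer_hostnames(statefulset, service, namespace, cluster_domain, replicas):
    """Build the per-pod DNS names expected for a StatefulSet behind a headless Service.

    Assumes the pattern <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster_domain>.
    """
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def resolves(hostname, port=26257):
    """Return True if the name resolves, False on a DNS lookup failure."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        return False


# Values taken from the --join flag in the log above.
names = peer_hostnames(
    "k-preprod-cockroachdb", "k-preprod-cockroachdb", "k-db", "cluster.local", 3
)
```

Calling `resolves(name)` for each entry in `names` from a pod in the same namespace would show whether the `no such host` errors can be reproduced outside of CockroachDB itself.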
Here is my Helm config:
image:
  repository: cockroachdb/cockroach
  tag: v20.2.6
  pullPolicy: IfNotPresent
  credentials:
    {}
    # registry: docker.io
    # username: john_doe
    # password: changeme
# Additional labels to apply to all Kubernetes resources created by this chart.
labels:
  {}
  # app.kubernetes.io/part-of: my-app
# Cluster's default DNS domain.
# You should overwrite it if you're using a different one,
# otherwise CockroachDB nodes discovery won't work.
clusterDomain: cluster.local
conf:
  # An ordered list of CockroachDB node attributes.
  # Attributes are arbitrary strings specifying machine capabilities.
  # Machine capabilities might include specialized hardware or number of cores
  # (e.g. "gpu", "x16c").
  attrs:
    []
    # - x16c
    # - gpu
  # Total size in bytes for caches, shared evenly if there are multiple
  # storage devices. Size suffixes are supported (e.g. `1GB` and `1GiB`).
  # A percentage of physical memory can also be specified (e.g. `.25`).
  cache: 25%
  # Sets a name to verify the identity of a cluster.
  # The value must match between all nodes specified via `conf.join`.
  # This can be used as an additional verification when either the node or
  # cluster, or both, have not yet been initialized and do not yet know their
  # cluster ID.
  # To introduce a cluster name into an already-initialized cluster, pair this
  # option with `conf.disable-cluster-name-verification: yes`.
  cluster-name: "k-preprod"
  # Tell the server to ignore `conf.cluster-name` mismatches.
  # This is meant for use when opting an existing cluster into starting to use
  # cluster name verification, or when changing the cluster name.
  # The cluster should be restarted once with `conf.cluster-name` and
  # `conf.disable-cluster-name-verification: yes` combined, and once all nodes
  # have been updated to know the new cluster name, the cluster can be restarted
  # again with `conf.disable-cluster-name-verification: no`.
  # This option has no effect if `conf.cluster-name` is not specified.
  disable-cluster-name-verification: false
  # The addresses for connecting a CockroachDB nodes to an existing cluster.
  # If you are deploying a second CockroachDB instance that should join a first
  # one, use the below list to join to the existing instance.
  # Each item in the array should be a FQDN (and port if needed) resolvable by
  # new Pods.
  join: []
  # Logs at or above this threshold to STDERR.
  logtostderr: INFO
  # Maximum storage capacity available to store temporary disk-based data for
  # SQL queries that exceed the memory budget (e.g. join, sorts, etc are
  # sometimes able to spill intermediate results to disk).
  # Accepts numbers interpreted as bytes, size suffixes (e.g. `32GB` and
  # `32GiB`) or a percentage of disk size (e.g. `10%`).
  # The location of the temporary files is within the first store dir.
  # If expressed as a percentage, `max-disk-temp-storage` is interpreted
  # relative to the size of the storage device on which the first store is
  # placed. The temp space usage is never counted towards any store usage
  # (although it does share the device with the first store) so, when
  # configuring this, make sure that the size of this temp storage plus the size
  # of the first store don't exceed the capacity of the storage device.
  # If the first store is an in-memory one (i.e. `type=mem`), then this
  # temporary "disk" data is also kept in-memory.
  # A percentage value is interpreted as a percentage of the available internal
  # memory.
  # max-disk-temp-storage: 0GB
  # Maximum allowed clock offset for the cluster. If observed clock offsets
  # exceed this limit, servers will crash to minimize the likelihood of
  # reading inconsistent data. Increasing this value will increase the time
  # to recovery of failures as well as the frequency of uncertainty-based
  # read restarts.
  # Note, that this value must be the same on all nodes in the cluster.
  # In order to change it, all nodes in the cluster must be stopped
  # simultaneously and restarted with the new value.
  # max-offset: 500ms
  # Maximum memory capacity available to store temporary data for SQL clients,
  # including prepared queries and intermediate data rows during query
  # execution. Accepts numbers interpreted as bytes, size suffixes
  # (e.g. `1GB` and `1GiB`) or a percentage of physical memory (e.g. `.25`).
  max-sql-memory: 25%
  # An ordered, comma-separated list of key-value pairs that describe the
  # topography of the machine. Topography might include country, datacenter
  # or rack designations. Data is automatically replicated to maximize
  # diversities of each tier. The order of tiers is used to determine
  # the priority of the diversity, so the more inclusive localities like
  # country should come before less inclusive localities like datacenter.
  # The tiers and order must be the same on all nodes. Including more tiers
  # is better than including fewer. For example:
  #   locality: country=us,region=us-west,datacenter=us-west-1b,rack=12
  #   locality: country=ca,region=ca-east,datacenter=ca-east-2,rack=4
  #   locality: planet=earth,province=manitoba,colo=secondary,power=3
  locality: "country=us,region=west,state=washington,city=seattle"
  # Run CockroachDB instances in standalone mode with replication disabled
  # (replication factor = 1).
  # Enabling this option makes the following values to be ignored:
  # - `conf.cluster-name`
  # - `conf.disable-cluster-name-verification`
  # - `conf.join`
  #
  # WARNING: Enabling this option makes each deployed Pod as a STANDALONE
  #          CockroachDB instance, so the StatefulSet does NOT FORM A CLUSTER.
  #          Don't use this option for production deployments unless you clearly
  #          understand what you're doing.
  #          Usually, this option is intended to be used in conjunction with
  #          `statefulset.replicas: 1` for temporary one-time deployments (like
  #          running E2E tests, for example).
  single-node: false
  # If non-empty, create a SQL audit log in the specified directory.
  sql-audit-dir: ""
  # CockroachDB's port to listen to inter-communications and client connections.
  port: 26257
  # CockroachDB's port to listen to HTTP requests.
  http-port: 8080
statefulset:
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  budget:
    maxUnavailable: 1
  # List of additional command-line arguments you want to pass to the
  # `cockroach start` command.
  args:
    []
    # - --disable-cluster-name-verification
  # List of extra environment variables to pass into container
  env:
    []
    # - name: COCKROACH_ENGINE_MAX_SYNC_DURATION
    #   value: "24h"
  # List of Secrets names in the same Namespace as the CockroachDB cluster,
  # which shall be mounted into `/etc/cockroach/secrets/` for every cluster
  # member.
  secretMounts: []
  # Additional labels to apply to this StatefulSet and all its Pods.
  labels:
    app.kubernetes.io/component: cockroachdb
  # Additional annotations to apply to the Pods of this StatefulSet.
  annotations: {}
  # Affinity rules for scheduling Pods of this StatefulSet on Nodes.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity
  nodeAffinity: {}
  # Inter-Pod Affinity rules for scheduling Pods of this StatefulSet.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity
  podAffinity: {}
  # Anti-affinity rules for scheduling Pods of this StatefulSet.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity
  # You may either toggle options below for default anti-affinity rules,
  # or specify the whole set of anti-affinity rules instead of them.
  podAntiAffinity:
    # The topologyKey to be used.
    # Can be used to spread across different nodes, AZs, regions etc.
    topologyKey: kubernetes.io/hostname
    # Type of anti-affinity rules: either `soft`, `hard` or empty value (which
    # disables anti-affinity rules).
    type: hard
    # Weight for `soft` anti-affinity rules.
    # Does not apply for other anti-affinity types.
    weight: 100
  # Node selection constraints for scheduling Pods of this StatefulSet.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
  nodeSelector: {}
  # PriorityClassName given to Pods of this StatefulSet
  # https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
  priorityClassName: "highest"
  # Taints to be tolerated by Pods of this StatefulSet.
  # https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  tolerations:
    - effect: NoSchedule
      key: kubernetes.azure.com/scalesetpriority
      operator: Equal
      value: spot
  # https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
  topologySpreadConstraints:
    maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
  # Uncomment the following resources definitions or pass them from
  # command line to control the CPU and memory resources allocated
  # by Pods of this StatefulSet.
  resources:
    requests:
      cpu: 50m
      memory: 128Mi
    limits:
      cpu: 100m
      memory: 256Mi
service:
  ports:
    # You can set a different external and internal gRPC ports and their name.
    grpc:
      external:
        port: 26257
        name: grpc
      # If the port number is different than `external.port`, then it will be
      # named as `internal.name` in Service.
      internal:
        port: 26257
        # If using Istio set it to `cockroach`.
        name: cockroach
    http:
      port: 8080
      name: http
  # This Service is meant to be used by clients of the database.
  # It exposes a ClusterIP that will automatically load balance connections
  # to the different database Pods.
  public:
    type: ClusterIP
    # Additional labels to apply to this Service.
    labels:
      app.kubernetes.io/component: cockroachdb
    # Additional annotations to apply to this Service.
    annotations: {}
  # This service only exists to create DNS entries for each pod in
  # the StatefulSet such that they can resolve each other's IP addresses.
  # It does not create a load-balanced ClusterIP and should not be used directly
  # by clients in most circumstances.
  discovery:
    # Additional labels to apply to this Service.
    labels:
      app.kubernetes.io/component: cockroachdb
    # Additional annotations to apply to this Service.
    annotations: {}
# CockroachDB's ingress for web ui.
ingress:
  enabled: false
  labels: {}
  annotations: {}
  # kubernetes.io/ingress.class: nginx
  # cert-manager.io/cluster-issuer: letsencrypt
  paths: [/]
  hosts: []
  # - cockroachlabs.com
  tls: []
  # - hosts: [cockroachlabs.com]
  #   secretName: cockroachlabs-tls
# CockroachDB's Prometheus operator ServiceMonitor support
serviceMonitor:
  enabled: false
  labels: {}
  annotations: {}
  interval: 10s
  # scrapeTimeout: 10s
# CockroachDB's data persistence.
# If neither `persistentVolume` nor `hostPath` is used, then data will be
# persisted in ad-hoc `emptyDir`.
storage:
  # Absolute path on host to store CockroachDB's data.
  # If not specified, then `emptyDir` will be used instead.
  # If specified, but `persistentVolume.enabled` is `true`, then has no effect.
  hostPath: ""
  # If `enabled` is `true` then a PersistentVolumeClaim will be created and
  # used to store CockroachDB's data, otherwise `hostPath` is used.
  persistentVolume:
    enabled: true
    size: 10Gi
    # If defined, then `storageClassName: <storageClass>`.
    # If set to "-", then `storageClassName: ""`, which disables dynamic
    # provisioning.
    # If undefined or empty (default), then no `storageClassName` spec is set,
    # so the default provisioner will be chosen (gp2 on AWS, standard on
    # GKE, AWS & OpenStack).
    storageClass: "default"
    # Additional labels to apply to the created PersistentVolumeClaims.
    labels: {}
    # Additional annotations to apply to the created PersistentVolumeClaims.
    annotations: {}
# Kubernetes Job which initializes multi-node CockroachDB cluster.
# It's not created if `statefulset.replicas` is `1`.
init:
  # Additional labels to apply to this Job and its Pod.
  labels:
    app.kubernetes.io/component: init
  # Additional annotations to apply to the Pod of this Job.
  annotations: {}
  # Affinity rules for scheduling the Pod of this Job.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity
  affinity: {}
  # Node selection constraints for scheduling the Pod of this Job.
  # https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
  nodeSelector:
    "k.com/burstable": "true"
  # Taints to be tolerated by the Pod of this Job.
  # https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  tolerations:
    - effect: NoSchedule
      key: kubernetes.azure.com/scalesetpriority
      operator: Equal
      value: spot
    - effect: NoSchedule
      key: k.com/burstable
      operator: Equal
      value: "true"
  # The init Pod runs at cluster creation to initialize CockroachDB. It finishes
  # quickly and doesn't continue to consume resources in the Kubernetes
  # cluster. Normally, you should leave this section commented out, but if your
  # Kubernetes cluster uses Resource Quotas and requires all pods to specify
  # resource requests or limits, you can set those here.
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "100m"
      memory: "128Mi"
# Whether to run securely using TLS certificates.
tls:
  enabled: true
  serviceAccount:
    # Specifies whether this ServiceAccount should be created.
    create: true
    # The name of this ServiceAccount to use.
    # If not set and `create` is `true`, then a name is auto-generated.
    name: ""
  certs:
    # Bring your own certs scenario. If provided, tls.init section will be ignored.
    provided: false
    # Secret name for the client root cert.
    clientRootSecret: cockroachdb-root
    # Secret name for node cert.
    nodeSecret: cockroachdb-node
    # Enable if the secret is a dedicated TLS.
    # TLS secrets are created by cert-mananger, for example.
    tlsSecret: false
  init:
    # Image to use for requesting TLS certificates.
    image:
      repository: cockroachdb/cockroach-k8s-request-cert
      tag: "0.4"
      pullPolicy: IfNotPresent
      credentials:
        {}
        # registry: docker.io
        # username: john_doe
        # password: changeme
networkPolicy:
  enabled: false
  ingress:
    # List of sources which should be able to access the CockroachDB Pods via
    # gRPC port. Items in this list are combined using a logical OR operation.
    # Rules for allowing inter-communication are applied automatically.
    # If empty, then connections from any Pod is allowed.
    grpc:
      []
      # - podSelector:
      #     matchLabels:
      #       app.kubernetes.io/name: cockroachdb
      #       app.kubernetes.io/instance: k-preprod
    # List of sources which should be able to access the CockroachDB Pods via
    # HTTP port. Items in this list are combined using a logical OR operation.
    # If empty, then connections from any Pod is allowed.
    http:
      []
      # - podSelector:
      #     matchLabels:
      #       app.kubernetes.io/name: cockroachdb
      #       app.kubernetes.io/instance: k-preprod
      # - namespaceSelector:
      #     matchLabels:
      #       project: my-project