cockroachdb / helm-charts

Helm charts for cockroachdb
Apache License 2.0

Helm install of CockroachDB on DigitalOcean fails #345

Open chokosabe opened 1 year ago

chokosabe commented 1 year ago

Helm install of CockroachDB on Digital Ocean fails

I tried installing CockroachDB on a DigitalOcean Kubernetes cluster using the Helm chart. The main change from the defaults is to use the DigitalOcean storage class, StorageClass: 'do-block-storage'.

To Reproduce

Run helm install for the cockroachdb chart on a DigitalOcean Kubernetes cluster.
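
For reference, an install along these lines might look like the sketch below. Only the storage class comes from the description above; the repo alias `cockroachdb`, the release name, the namespace, and the wrapper name `install_cockroachdb` are assumptions for illustration.

```shell
# Sketch only: repo alias, release name, and namespace are assumptions.
# install_cockroachdb is a hypothetical wrapper, not part of the chart.
install_cockroachdb() {
  # Storage class to request for the data volumes (value from the report).
  storage_class="${1:-do-block-storage}"

  # Register the CockroachDB chart repository and refresh the local index.
  helm repo add cockroachdb https://charts.cockroachdb.com/
  helm repo update

  # Install the chart, overriding only the persistent volume storage class.
  helm install cockroachdb cockroachdb/cockroachdb \
    --namespace cockroachdb --create-namespace \
    --set "storage.persistentVolume.storageClass=${storage_class}"
}
```

Calling `install_cockroachdb do-block-storage` would run the three helm commands in order; any other storage class name can be passed as the first argument.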

Additional data / screenshots

kubectl describe pods cockroachdb-0 -n cockroachdb

Name: cockroachdb-0
Namespace: cockroachdb
Priority: 0
Service Account: cockroachdb
Node: staging-yy92h/10.106.0.4
Start Time: Mon, 04 Sep 2023 21:57:32 +0100
Labels: app.kubernetes.io/component=cockroachdb
        app.kubernetes.io/instance=cockroachdb
        app.kubernetes.io/name=cockroachdb
        controller-revision-hash=cockroachdb-695ff69b67
        statefulset.kubernetes.io/pod-name=cockroachdb-0
Annotations:
Status: Running
IP: 10.244.0.93
IPs:
  IP: 10.244.0.93
Controlled By: StatefulSet/cockroachdb
Init Containers:
  copy-certs:
    Container ID: containerd://811423a6ff8a550b20b9d9991ad7e9fb9f52bebc99a47d85dba0862150de7866
    Image: busybox
    Image ID: docker.io/library/busybox@sha256:3fbc632167424a6d997e74f52b878d7cc478225cffac6bc977eedfe51c7f4e79
    Port:
    Host Port:
    Command:
      /bin/sh
      -c
      cp -f /certs/ /cockroach-certs/; chmod 0400 /cockroach-certs/.key
    State: Terminated
      Reason: Completed
      Exit Code: 0
      Started: Mon, 04 Sep 2023 21:57:39 +0100
      Finished: Mon, 04 Sep 2023 21:57:39 +0100
    Ready: True
    Restart Count: 0
    Environment:
      POD_NAMESPACE: cockroachdb (v1:metadata.namespace)
    Mounts:
      /certs/ from certs-secret (rw)
      /cockroach-certs/ from certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d4c6b (ro)
Containers:
  db:
    Container ID: containerd://a248855282c32c2e6aaa39b871d1bf5b27c8f9a50e10218bb6cfb31200f0bd43
    Image: cockroachdb/cockroach:v23.1.8
    Image ID: docker.io/cockroachdb/cockroach@sha256:c02c58d9c6c1ed623369f7b5890ed81f623b50dedd4d1800472016f4b07b9c80
    Ports: 26257/TCP, 8080/TCP
    Host Ports: 0/TCP, 0/TCP
    Args:
      shell
      -ecx
      exec /cockroach/cockroach start --join=${STATEFULSET_NAME}-0.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-1.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-2.${STATEFULSET_FQDN}:26257 --advertise-host=$(hostname).${STATEFULSET_FQDN} --certs-dir=/cockroach/cockroach-certs/ --http-port=8080 --port=26257 --cache=25% --max-sql-memory=25% --logtostderr=INFO
    State: Running
      Started: Mon, 04 Sep 2023 21:57:40 +0100
    Ready: False
    Restart Count: 0
    Liveness: http-get https://:http/health delay=30s timeout=1s period=5s #success=1 #failure=3
    Readiness: http-get https://:http/health%3Fready=1 delay=10s timeout=1s period=5s #success=1 #failure=2
    Environment:
      STATEFULSET_NAME: cockroachdb
      STATEFULSET_FQDN: cockroachdb.cockroachdb.svc.cluster.local
      COCKROACH_CHANNEL: kubernetes-helm
    Mounts:
      /cockroach/cockroach-certs/ from certs (rw)
      /cockroach/cockroach-data/ from datadir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d4c6b (ro)
Conditions:
  Type Status
  Initialized True
  Ready False
  ContainersReady False
  PodScheduled True
Volumes:
  datadir:
    Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: datadir-cockroachdb-0
    ReadOnly: false
  certs:
    Type: EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  certs-secret:
    Type: Projected (a volume that contains injected data from multiple sources)
    SecretName: cockroachdb-node-secret
    SecretOptionalName:
  kube-api-access-d4c6b:
    Type: Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds: 3607
    ConfigMapName: kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints: topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/component=cockroachdb,app.kubernetes.io/instance=cockroachdb,app.kubernetes.io/name=cockroachdb
Events:
  Type Reason Age From Message
  Warning FailedScheduling 8m46s default-scheduler 0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
  Normal Scheduled 8m44s default-scheduler Successfully assigned cockroachdb/cockroachdb-0 to staging-yy92h
  Normal SuccessfulAttachVolume 8m39s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-78bbba7e-5a3b-43a3-81a8-6e6a2691c826"
  Normal Pulled 8m38s kubelet Container image "busybox" already present on machine
  Normal Created 8m38s kubelet Created container copy-certs
  Normal Started 8m37s kubelet Started container copy-certs
  Normal Pulled 8m37s kubelet Container image "cockroachdb/cockroach:v23.1.8" already present on machine
  Normal Created 8m37s kubelet Created container db
  Normal Started 8m36s kubelet Started container db
  Warning Unhealthy 3m33s (x63 over 8m23s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503

LOGS:

kubectl logs cockroachdb-0 --all-containers=true -n cockroachdb

I230904 21:07:54.549571 32 server/init.go:421 ⋮ [T1,n?] 973 ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 ‹[core]›‹[Channel #1849 SubChannel #1850] grpc: addrConn.createTransport failed to connect to {›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "Attributes": null,›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "BalancerAttributes": null,›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "Type": 0,›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹ "Metadata": null›
W230904 21:07:55.528823 7561 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 974 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
W230904 21:07:55.529085 32 server/init.go:423 ⋮ [T1,n?] 975 outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
I230904 21:07:56.539170 32 server/init.go:421 ⋮ [T1,n?] 976 ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 ‹[core]›‹[Channel #1855 SubChannel #1856] grpc: addrConn.createTransport failed to connect to {›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "Attributes": null,›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "BalancerAttributes": null,›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "Type": 0,›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹ "Metadata": null›
W230904 21:07:57.527923 7568 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 977 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
W230904 21:07:57.528165 32 server/init.go:423 ⋮ [T1,n?] 978 outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
I230904 21:07:58.538910 32 server/init.go:421 ⋮ [T1,n?] 979 ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry

udnay commented 1 year ago

Can you attach your values.yml file? The PVC appears to mount fine; the problem looks like connectivity between the nodes.
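
One way to narrow down the connectivity problem is to list which peer hostnames fail DNS resolution in the pod logs. The following is a small sketch, not an official tool: `unresolved_hosts` is a hypothetical helper name, and it only matches the "no such host" message format seen in the logs above.

```shell
# Hypothetical helper: reads CockroachDB pod logs on stdin and prints each
# hostname whose DNS lookup failed with "no such host", once per hostname.
unresolved_hosts() {
  # Keep only the "lookup <host>: no such host" fragments...
  grep -o 'lookup [^ ]*: no such host' \
    | sed -e 's/^lookup //' -e 's/: no such host$//' \
    | sort -u   # ...strip the surrounding text and de-duplicate.
}
```

It could be used as `kubectl logs cockroachdb-0 -n cockroachdb | unresolved_hosts`. For the log excerpt above, the only failing lookups are for the cockroachdb-2 peer.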

chokosabe commented 1 year ago
apiVersion: catalog.cattle.io/v1
kind: App
metadata:
  annotations:
    objectset.rio.cattle.io/applied: >-
      H4sIAAAAAAAA/7xZW2/bOBb+K1oC+yYpviaugQG2TbqD7EzbbNx2gN32gaKOJNYUqSUpO27g/744pCzLtpJJ2mnzJItH5/qdG3NPSrA0pZaS+T2hUipLLVfS4E+VfAFmDdhYcxUzaq2AmKsznpI5KUCUEa0qEj5Ip9YSdJSvlmROzlbDMPiNy/SXBTAN9k8/k7QEMiemiFFSrEEANRCvhjFTbKkVZUWaxKvhkxiZijLk1vmUbEPCNDhr3/MSjKVlReayFiIkgiYgHvVBQU2BDGFywabnYzZK2XhCJxfJJBlM02k2ezFKs+F0Mhi9GMCAobTGoq4O/mWfdiFx2t9CBhokA0Pm/70ntOIfQRuuJJkTZ3siFFu+Q9IrEGDdSUaFgZAwJa1WQoAmc6trCMmSSwxdG4Fn+Lh2QZ+y6XiW0UmUTgYX0eTF+CJKsnEaTacMKGTTGWXnZPt5GxJTAUP/sYJqiw+P4IxRS4XKO95loC3POKDMimorQZOwhy7lphJ0swPL5U7rq1e95Ms6gWjVOvBTPRiM4ZdhPIsGvfSNQ6LeyJ2S1zwyqtYMIg2VInOiqWQF6KgxIXLOME/4NrKbykkUtbGgET2noadVtX8zGsfDeEZCkoJhmlceCV2PBNwENDCMCpoICANT6xVfNc9WK5mLTcSUNNxYkDZY/Pv3ICUhKZQzvrC2MvOzs/V6vUeHoImJmSpJSDhzAndkmq7jnNuiTmoDGqEI0iLpWceL++ezkqKhZyiwpFxayiVoD3ooKRe7iuNd+I8+DY6jhEdk+/mBxPPeRgmtzl7fh5Ukn0OyR89wGA/jKYZmRUUNHsk+XlcKbdjHLxaKUUFcSmYuAaxF43yxYZQVqN9o+nck8Z/sMIcR5QajFHVPEMU844weZDwaElUKE242mA1C8kWhGk1NQx243XimQuVOYSUzjk/bkIBEMWnDbetorDI2Ba3JnFy//ec7F527yPxPRCWUSm9atStqi66PI5frIfHajM5H04uQGC5zAZFUKbQ6Iy9ap9xGKddeN2OVhhM3HaqHzL7C7synCz5vtyHhtMKvT+zhJc0dX6YhBWk5dTV+G5KqFuJGCc7QoOvsrbI3GgxILJKYj4Zbb2w/LkJiaY5p2WQhypK5BmNOq92JozHDjDUnrWfrneoQeobQs6Ihcuy5q6o0y/Bx4+lPJH1Rycvjd/veRqsqxpKoJVgwru6qslIS7Z57Edi2VAoLEMCs0o1aWq04ZgGXeQf1C7CWy7y1BBGQUAMPRhArw94kDW1KohADrNbcbi6xdNzZg4BiN9uGxCoBemdaw6XrPgl2rfRyF9hjRBwEKdcV2+mJabRHU6VVCbaA2vTp8AQ1DegVZw54KTdMrUBvenHx1MgcTDFbn2MdI+4J3FnQkgp8buqIOzpIR4eiXrqofX/4wXbnmpba/dwRYclx6tSJ4Oyvs3CX3uTS4+z6xlnduPWNktwj888Tzdm1QnvJcIB9+AAtu0msUzGMpRayWhiwvQKo3sM9qdPck5X07oOkK8pdpyXzIY6atbGq/J2vQIIxN1ol4Fn4g1ugKT86Abk6nUef5TdM3ZcHFaInmVV6SIIvpOWdl8SqSgmVb34DrIGH4rF4OTS0YTIqw7q5Bp4XlsyHg4Hn+oZKmkMJ0ral9oZqKgQ4oGmuXCIJaszbtvlpqARn1JD5uK9EaLBvVC335fPp+fiSMfyyN65uMYDd2LxrxX0lJ2yds6g00PRSSWM1TjCmgcJiCWsyH4ZHXtz9OormVyXRlesC5AdpqOUm4x5FZMEKSGsBL+VmTTeoTV2l1MLCamoh95HyIbhVQnCZf3DnPlus0k33w4jd+HaNfseBxk18H5WoS3g8kbxDumnj2zAZDga/ctLKcVF0Pts2jeue4FDfPngw6P2ysn93bUwN2jcWDtJegrZXtW5GHXJ+MSrcnLQ7e31Xcb35g8tUrcmcTGZ4nGtVV25B0zYqPeOYK9IuQY2Uh9YylcKR3N
nF+aDoHB2JHZ7PCrdTOr1ulbLNmnXAOtJKWbLLw55zNxs1LTbtTjsgsgXPZeMYeqTcZDwbOO38yZFu594njN4cs2W01eLAp8/29xFCSi55WZfHrEaT4tud64qIhzQu7KpGrUelKxJN9fyD4lhExq64a8Sw42Z2aj2lOqxYUxmOJicP5J279iPM5R65H4eXt1dm3z6YqjaXO9Q30ydJarNJ1B05TavDGH/rtJpzY92smrMG8f0TLK5HUWen2s+zESoSGadJhAnUDrjDeEK2+Be6baxdP8fYWzPlRpyD/fPWb9DBpy7IP5Ego1xAOg8sLyENVG2DNeU4PQaZ0oEtIGBKptw6/TOujb2CSqiNuxEYDUbjaPAiGkzejwbz6cV8OPuPa+ZPoZLKYv842IwZlUECAWUMjIE0WHEa4EwTuLknoBY1+iQzJYRao5JXbxdBgztXPNxNAvtbFEXBryCxQ0AaZBxX66t3wdt374PXV9fv42Dh2tc8SGou0jMLZSWoBXN2+/rl1Rvi5w2cM4n3zzfcGnVa5PF9UeUgc+buDpoieKPSK2507cL1yg8wfSKjpHvWf5vWdzvVXjgddNxeCT5bu9D7gdKew1onlMW0toXS/KsrB/Fy5np1V2gzm94qAf32/Szpr7hMcS/7+Uo8bPv3xfY7pf8YUY+6+WfZ+5gS35k9P41n1KyLz2BNq8oc+GLRrGmLB8rXc5gn1LLiMLm0kv9SyY8I9XcLi/xY9pjMz90LylyoxO/7/uq5c31zjeLbi0r/rtnDdm/1EnBxuNGQ8btmP1uCn9LM8YnZGAvljVZfgFnHvIoMY+WUhKTWonNXbCzNuczj5ro8ZkpDDMulit02tD1dxPdb4ehwt+nbZtoNpW9BSVXk/oUSNe8jDZZySdyM0172Drf73nxPVIIrJKRNq3cUg5CYuiypu9rZbrf/DwAA//9VyDv2WRsAAA
    objectset.rio.cattle.io/id: helm-app
    objectset.rio.cattle.io/owner-gvk: /v1, Kind=Secret
    objectset.rio.cattle.io/owner-name: sh.helm.release.v1.cockroachdb.v1
    objectset.rio.cattle.io/owner-namespace: cockroachdb
  creationTimestamp: '2023-09-04T20:57:19Z'
  generation: 2
  labels:
    objectset.rio.cattle.io/hash: ce47c563c2dc34a47b4b05d5f892df1540290e0c
  managedFields:
    - apiVersion: catalog.cattle.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:objectset.rio.cattle.io/applied: {}
            f:objectset.rio.cattle.io/id: {}
            f:objectset.rio.cattle.io/owner-gvk: {}
            f:objectset.rio.cattle.io/owner-name: {}
            f:objectset.rio.cattle.io/owner-namespace: {}
          f:labels:
            .: {}
            f:objectset.rio.cattle.io/hash: {}
          f:ownerReferences:
            .: {}
            k:{"uid":"5c538fa4-d407-4937-bf3d-55ceaef58ac6"}: {}
        f:spec:
          .: {}
          f:chart:
            .: {}
            f:metadata:
              .: {}
              f:annotations:
                .: {}
                f:catalog.cattle.io/certified: {}
                f:catalog.cattle.io/display-name: {}
                f:catalog.cattle.io/kube-version: {}
                f:catalog.cattle.io/release-name: {}
                f:catalog.cattle.io/ui-source-repo: {}
                f:catalog.cattle.io/ui-source-repo-type: {}
              f:apiVersion: {}
              f:appVersion: {}
              f:description: {}
              f:home: {}
              f:icon: {}
              f:maintainers: {}
              f:name: {}
              f:sources: {}
              f:version: {}
            f:values:
              .: {}
              f:clusterDomain: {}
              f:conf:
                .: {}
                f:attrs: {}
                f:cache: {}
                f:cluster-name: {}
                f:disable-cluster-name-verification: {}
                f:http-port: {}
                f:join: {}
                f:locality: {}
                f:log:
                  .: {}
                  f:config: {}
                  f:enabled: {}
                f:logtostderr: {}
                f:max-sql-memory: {}
                f:path: {}
                f:port: {}
                f:single-node: {}
                f:sql-audit-dir: {}
                f:store:
                  .: {}
                  f:attrs: {}
                  f:enabled: {}
                  f:size: {}
                  f:type: {}
              f:iap:
                .: {}
                f:enabled: {}
              f:image:
                .: {}
                f:credentials: {}
                f:pullPolicy: {}
                f:repository: {}
                f:tag: {}
              f:ingress:
                .: {}
                f:annotations: {}
                f:enabled: {}
                f:hosts: {}
                f:labels: {}
                f:paths: {}
                f:tls: {}
              f:init:
                .: {}
                f:affinity: {}
                f:annotations: {}
                f:jobAnnotations: {}
                f:labels:
                  .: {}
                  f:app.kubernetes.io/component: {}
                f:nodeSelector: {}
                f:provisioning:
                  .: {}
                  f:clusterSettings: {}
                  f:databases: {}
                  f:enabled: {}
                  f:users: {}
                f:resources: {}
                f:securityContext:
                  .: {}
                  f:enabled: {}
                f:tolerations: {}
              f:labels: {}
              f:networkPolicy:
                .: {}
                f:enabled: {}
                f:ingress:
                  .: {}
                  f:grpc: {}
                  f:http: {}
              f:prometheus:
                .: {}
                f:enabled: {}
              f:securityContext:
                .: {}
                f:enabled: {}
              f:service:
                .: {}
                f:discovery:
                  .: {}
                  f:annotations: {}
                  f:labels:
                    .: {}
                    f:app.kubernetes.io/component: {}
                f:ports:
                  .: {}
                  f:grpc:
                    .: {}
                    f:external:
                      .: {}
                      f:name: {}
                      f:port: {}
                    f:internal:
                      .: {}
                      f:name: {}
                      f:port: {}
                  f:http:
                    .: {}
                    f:name: {}
                    f:port: {}
                f:public:
                  .: {}
                  f:annotations: {}
                  f:labels:
                    .: {}
                    f:app.kubernetes.io/component: {}
                  f:type: {}
              f:serviceMonitor:
                .: {}
                f:annotations: {}
                f:enabled: {}
                f:interval: {}
                f:labels: {}
                f:namespaced: {}
              f:statefulset:
                .: {}
                f:annotations: {}
                f:args: {}
                f:budget:
                  .: {}
                  f:maxUnavailable: {}
                f:customLivenessProbe: {}
                f:customReadinessProbe: {}
                f:env: {}
                f:labels:
                  .: {}
                  f:app.kubernetes.io/component: {}
                f:nodeAffinity: {}
                f:nodeSelector: {}
                f:podAffinity: {}
                f:podAntiAffinity:
                  .: {}
                  f:topologyKey: {}
                  f:type: {}
                  f:weight: {}
                f:podManagementPolicy: {}
                f:priorityClassName: {}
                f:replicas: {}
                f:resources: {}
                f:secretMounts: {}
                f:securityContext:
                  .: {}
                  f:enabled: {}
                f:serviceAccount:
                  .: {}
                  f:annotations: {}
                  f:create: {}
                  f:name: {}
                f:tolerations: {}
                f:topologySpreadConstraints:
                  .: {}
                  f:maxSkew: {}
                  f:topologyKey: {}
                  f:whenUnsatisfiable: {}
                f:updateStrategy:
                  .: {}
                  f:type: {}
              f:storage:
                .: {}
                f:hostPath: {}
                f:persistentVolume:
                  .: {}
                  f:annotations: {}
                  f:enabled: {}
                  f:labels: {}
                  f:size: {}
                  f:storageClass: {}
              f:tls:
                .: {}
                f:certs:
                  .: {}
                  f:certManager: {}
                  f:certManagerIssuer:
                    .: {}
                    f:clientCertDuration: {}
                    f:clientCertExpiryWindow: {}
                    f:group: {}
                    f:kind: {}
                    f:name: {}
                    f:nodeCertDuration: {}
                    f:nodeCertExpiryWindow: {}
                  f:clientRootSecret: {}
                  f:nodeSecret: {}
                  f:provided: {}
                  f:selfSigner:
                    .: {}
                    f:caCertDuration: {}
                    f:caCertExpiryWindow: {}
                    f:caProvided: {}
                    f:caSecret: {}
                    f:clientCertDuration: {}
                    f:clientCertExpiryWindow: {}
                    f:enabled: {}
                    f:minimumCertDuration: {}
                    f:nodeCertDuration: {}
                    f:nodeCertExpiryWindow: {}
                    f:podUpdateTimeout: {}
                    f:readinessWait: {}
                    f:rotateCerts: {}
                    f:securityContext:
                      .: {}
                      f:enabled: {}
                    f:svcAccountAnnotations: {}
                  f:tlsSecret: {}
                  f:useCertManagerV1CRDs: {}
                f:copyCerts:
                  .: {}
                  f:image: {}
                f:enabled: {}
                f:selfSigner:
                  .: {}
                  f:image:
                    .: {}
                    f:credentials: {}
                    f:pullPolicy: {}
                    f:registry: {}
                    f:repository: {}
                    f:tag: {}
          f:helmVersion: {}
          f:info:
            .: {}
            f:description: {}
            f:firstDeployed: {}
            f:lastDeployed: {}
            f:notes: {}
            f:readme: {}
            f:status: {}
          f:name: {}
          f:namespace: {}
          f:resources: {}
          f:values:
            .: {}
            f:global:
              .: {}
              f:cattle:
                .: {}
                f:clusterId: {}
                f:clusterName: {}
                f:rkePathPrefix: {}
                f:rkeWindowsPathPrefix: {}
                f:systemProjectId: {}
                f:url: {}
            f:statefulset:
              .: {}
              f:replicas: {}
            f:storage:
              .: {}
              f:persistentVolume:
                .: {}
                f:size: {}
                f:storageClass: {}
          f:version: {}
      manager: rancher
      operation: Update
      time: '2023-09-04T21:07:30Z'
    - apiVersion: catalog.cattle.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:observedGeneration: {}
          f:summary:
            .: {}
            f:error: {}
            f:state: {}
      manager: rancher
      operation: Update
      subresource: status
      time: '2023-09-04T21:07:30Z'
  name: cockroachdb
  namespace: cockroachdb
  ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: false
      controller: true
      kind: Secret
      name: sh.helm.release.v1.cockroachdb.v1
      uid: 5c538fa4-d407-4937-bf3d-55ceaef58ac6
  resourceVersion: '16603473'
  uid: c1703c1f-2a46-4d2e-a7c7-ba8bd6610db5
spec:
  chart:
    metadata:
      annotations:
        catalog.cattle.io/certified: partner
        catalog.cattle.io/display-name: CockroachDB
        catalog.cattle.io/kube-version: '>=1.8-0'
        catalog.cattle.io/release-name: cockroachdb
        catalog.cattle.io/ui-source-repo: rancher-partner-charts
        catalog.cattle.io/ui-source-repo-type: cluster
      apiVersion: v1
      appVersion: 23.1.8
      description: CockroachDB is a scalable, survivable, strongly-consistent SQL database.
      home: https://www.cockroachlabs.com
      icon: >-
        https://raw.githubusercontent.com/cockroachdb/cockroach/master/docs/media/cockroach_db.png
      maintainers:
        - email: helm-charts@cockroachlabs.com
          name: cockroachlabs
      name: cockroachdb
      sources:
        - https://github.com/cockroachdb/cockroach
      version: 11.1.5
    values:
      clusterDomain: cluster.local
      conf:
        attrs: null
        cache: 25%
        cluster-name: ''
        disable-cluster-name-verification: false
        http-port: 8080
        join: null
        locality: ''
        log:
          config: {}
          enabled: false
        logtostderr: INFO
        max-sql-memory: 25%
        path: cockroach-data
        port: 26257
        single-node: false
        sql-audit-dir: ''
        store:
          attrs: null
          enabled: false
          size: null
          type: null
      iap:
        enabled: false
      image:
        credentials: {}
        pullPolicy: IfNotPresent
        repository: cockroachdb/cockroach
        tag: v23.1.8
      ingress:
        annotations: {}
        enabled: false
        hosts: null
        labels: {}
        paths:
          - /
        tls: null
      init:
        affinity: {}
        annotations: {}
        jobAnnotations: {}
        labels:
          app.kubernetes.io/component: init
        nodeSelector: {}
        provisioning:
          clusterSettings: null
          databases: null
          enabled: false
          users: null
        resources: {}
        securityContext:
          enabled: true
        tolerations: null
      labels: {}
      networkPolicy:
        enabled: false
        ingress:
          grpc: null
          http: null
      prometheus:
        enabled: true
      securityContext:
        enabled: true
      service:
        discovery:
          annotations: {}
          labels:
            app.kubernetes.io/component: cockroachdb
        ports:
          grpc:
            external:
              name: grpc
              port: 26257
            internal:
              name: grpc-internal
              port: 26257
          http:
            name: http
            port: 8080
        public:
          annotations: {}
          labels:
            app.kubernetes.io/component: cockroachdb
          type: ClusterIP
      serviceMonitor:
        annotations: {}
        enabled: false
        interval: 10s
        labels: {}
        namespaced: false
      statefulset:
        annotations: {}
        args: null
        budget:
          maxUnavailable: 1
        customLivenessProbe: {}
        customReadinessProbe: {}
        env: null
        labels:
          app.kubernetes.io/component: cockroachdb
        nodeAffinity: {}
        nodeSelector: {}
        podAffinity: {}
        podAntiAffinity:
          topologyKey: kubernetes.io/hostname
          type: soft
          weight: 100
        podManagementPolicy: Parallel
        priorityClassName: ''
        replicas: 3
        resources: {}
        secretMounts: null
        securityContext:
          enabled: true
        serviceAccount:
          annotations: {}
          create: true
          name: ''
        tolerations: null
        topologySpreadConstraints:
          maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
        updateStrategy:
          type: RollingUpdate
      storage:
        hostPath: ''
        persistentVolume:
          annotations: {}
          enabled: true
          labels: {}
          size: 100Gi
          storageClass: ''
      tls:
        certs:
          certManager: false
          certManagerIssuer:
            clientCertDuration: 672h
            clientCertExpiryWindow: 48h
            group: cert-manager.io
            kind: Issuer
            name: cockroachdb
            nodeCertDuration: 8760h
            nodeCertExpiryWindow: 168h
          clientRootSecret: cockroachdb-root
          nodeSecret: cockroachdb-node
          provided: false
          selfSigner:
            caCertDuration: 43800h
            caCertExpiryWindow: 648h
            caProvided: false
            caSecret: ''
            clientCertDuration: 672h
            clientCertExpiryWindow: 48h
            enabled: true
            minimumCertDuration: 624h
            nodeCertDuration: 8760h
            nodeCertExpiryWindow: 168h
            podUpdateTimeout: 2m
            readinessWait: 30s
            rotateCerts: true
            securityContext:
              enabled: true
            svcAccountAnnotations: {}
          tlsSecret: false
          useCertManagerV1CRDs: false
        copyCerts:
          image: busybox
        enabled: true
        selfSigner:
          image:
            credentials: {}
            pullPolicy: IfNotPresent
            registry: gcr.io
            repository: cockroachlabs-helm-charts/cockroach-self-signer-cert
            tag: '1.4'
  helmVersion: 3
  info:
    description: 'Release "cockroachdb" failed: timed out waiting for the condition'
    firstDeployed: '2023-09-04T20:57:18Z'
    lastDeployed: '2023-09-04T20:57:18Z'
    notes: >
    status: failed
  name: cockroachdb
  namespace: cockroachdb
  resources:
    - apiVersion: policy/v1
      kind: PodDisruptionBudget
      name: cockroachdb-budget
      namespace: cockroachdb
    - apiVersion: v1
      kind: ServiceAccount
      name: cockroachdb-rotate-self-signer
      namespace: cockroachdb
    - apiVersion: v1
      kind: ServiceAccount
      name: cockroachdb
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      name: cockroachdb-cockroachdb
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      name: cockroachdb-cockroachdb
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      name: cockroachdb-rotate-self-signer
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      name: cockroachdb
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      name: cockroachdb-rotate-self-signer
      namespace: cockroachdb
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      name: cockroachdb
      namespace: cockroachdb
    - apiVersion: v1
      kind: Service
      name: cockroachdb
      namespace: cockroachdb
    - apiVersion: v1
      kind: Service
      name: cockroachdb-public
      namespace: cockroachdb
    - apiVersion: apps/v1
      kind: StatefulSet
      name: cockroachdb
      namespace: cockroachdb
    - apiVersion: batch/v1
      kind: CronJob
      name: cockroachdb-rotate-self-signer
      namespace: cockroachdb
    - apiVersion: batch/v1
      kind: CronJob
      name: cockroachdb-rotate-self-signer-client
      namespace: cockroachdb
  values:
    global:
      cattle:
        clusterId: local
        clusterName: local
        rkePathPrefix: ''
        rkeWindowsPathPrefix: ''
        systemProjectId: p-sccm5
        url: https://staging.rancher.core.ekko.zone
    statefulset:
      replicas: 2
    storage:
      persistentVolume:
        size: 10Gi
        storageClass: do-block-storage-retain
  version: 1
status:
  observedGeneration: 2
  summary:
    error: true
    state: failed

I omitted the notes section. Thanks.

prafull01 commented 1 year ago

I suspect it is related to the storage class. Can you please post the logs of the cockroachdb-init job, which we create to initialise the CockroachDB cluster? From the logs, it seems that the cluster has not been initialised properly.
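
A possible way to collect those logs is sketched below. The job name `<release>-init` follows the job name mentioned in this comment, the default namespace and release name are taken from this thread, and `init_job_logs` is a hypothetical helper name. If the job has not been created yet or has already been removed, the first command fails and no logs are printed.

```shell
# Sketch: fetch the status and logs of the chart's init job.
# init_job_logs is a hypothetical helper; defaults are assumptions
# based on the names used in this thread.
init_job_logs() {
  ns="${1:-cockroachdb}"        # namespace the release was installed into
  release="${2:-cockroachdb}"   # Helm release name

  # Show whether the init job exists and how many completions it has.
  kubectl get job "${release}-init" -n "$ns" || return 1

  # Print the logs from a pod belonging to the job.
  kubectl logs "job/${release}-init" -n "$ns"
}
```

Running `init_job_logs` with no arguments targets the `cockroachdb` release in the `cockroachdb` namespace.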

chokosabe commented 1 year ago

I reran the install


helm upgrade --history-max=5 --install=true --namespace=cockroachdb --timeout=10m0s --values=/home/shell/helm/values-cockroachdb-11.1.5.yaml --version=11.1.5 --wait=true cockroachdb /home/shell/helm/cockroachdb-11.1.5.tgz
2023-09-07T10:24:18.006129300Z creating 1 resource(s)
2023-09-07T10:24:18.071822585Z creating 1 resource(s)
2023-09-07T10:24:18.132963998Z creating 1 resource(s)
2023-09-07T10:24:18.183186790Z creating 1 resource(s)
Watching for changes to Job cockroachdb-self-signer with timeout of 10m0s
2023-09-07T10:24:18.200958442Z Add/Modify event for cockroachdb-self-signer: ADDED
2023-09-07T10:24:18.200987854Z cockroachdb-self-signer: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
2023-09-07T10:24:18.219648093Z Add/Modify event for cockroachdb-self-signer: MODIFIED
2023-09-07T10:24:18.219684737Z cockroachdb-self-signer: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Add/Modify event for cockroachdb-self-signer: MODIFIED
2023-09-07T10:24:21.127400976Z cockroachdb-self-signer: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Add/Modify event for cockroachdb-self-signer: MODIFIED
2023-09-07T10:24:23.146250270Z cockroachdb-self-signer: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Add/Modify event for cockroachdb-self-signer: MODIFIED
2023-09-07T10:24:25.201636018Z cockroachdb-self-signer: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
Add/Modify event for cockroachdb-self-signer: MODIFIED
Starting delete for "cockroachdb-self-signer" ServiceAccount
Starting delete for "cockroachdb-self-signer" Role
Starting delete for "cockroachdb-self-signer" RoleBinding
Starting delete for "cockroachdb-self-signer" Job
checking 14 resources for changes
Patch PodDisruptionBudget "cockroachdb-budget" in namespace cockroachdb
Looks like there are no changes for ServiceAccount "cockroachdb-rotate-self-signer"
Looks like there are no changes for ServiceAccount "cockroachdb"
Looks like there are no changes for ClusterRole "cockroachdb-cockroachdb"
Looks like there are no changes for ClusterRoleBinding "cockroachdb-cockroachdb"
Looks like there are no changes for Role "cockroachdb-rotate-self-signer"
Looks like there are no changes for Role "cockroachdb"
Looks like there are no changes for RoleBinding "cockroachdb-rotate-self-signer"
Looks like there are no changes for RoleBinding "cockroachdb"
Looks like there are no changes for Service "cockroachdb"
Looks like there are no changes for Service "cockroachdb-public"
Patch StatefulSet "cockroachdb" in namespace cockroachdb
Looks like there are no changes for CronJob "cockroachdb-rotate-self-signer"
Looks like there are no changes for CronJob "cockroachdb-rotate-self-signer-client"
beginning wait for 14 resources with timeout of 10m0s
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready
StatefulSet is not ready: cockroachdb/cockroachdb. 0 out of 2 expected pods are ready

chokosabe commented 1 year ago

Nodes pretty much can't find each other. I'm not sure how one would go about extracting the logs on the PVC mount. I can confirm that the PVCs are created.

W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400  ‹[core]›‹[Channel #658673 SubChannel #658674] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:48.532226279Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:48.532237379Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:48.532244204Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "Attributes": null,›
W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "BalancerAttributes": null,›
2023-09-07T09:58:48.532256294Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "Type": 0,›
2023-09-07T09:58:48.532262394Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹  "Metadata": null›
2023-09-07T09:58:48.532269085Z W230907 09:58:48.531802 2668705 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329400 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:48.532360045Z W230907 09:58:48.532199 32 server/init.go:423 ⋮ [T1,n?] 329401  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:49.543679968Z I230907 09:58:49.543437 32 server/init.go:421 ⋮ [T1,n?] 329402  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:58:50.544994323Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403  ‹[core]›‹[Channel #658679 SubChannel #658680] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:50.545035900Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:50.545043023Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:50.545049077Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "Attributes": null,›
2023-09-07T09:58:50.545054641Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "BalancerAttributes": null,›
2023-09-07T09:58:50.545058190Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "Type": 0,›
2023-09-07T09:58:50.545061703Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹  "Metadata": null›
2023-09-07T09:58:50.545088516Z W230907 09:58:50.544689 2668774 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329403 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:50.545334803Z W230907 09:58:50.545188 32 server/init.go:423 ⋮ [T1,n?] 329404  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:51.538438515Z I230907 09:58:51.538154 32 server/init.go:421 ⋮ [T1,n?] 329405  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:58:52.532914153Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406  ‹[core]›‹[Channel #658685 SubChannel #658686] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:52.532963931Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:52.532971925Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:52.532979298Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "Attributes": null,›
2023-09-07T09:58:52.532986042Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "BalancerAttributes": null,›
2023-09-07T09:58:52.532991759Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "Type": 0,›
2023-09-07T09:58:52.532997458Z W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹  "Metadata": null›
W230907 09:58:52.532646 2668791 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329406 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:52.533318049Z W230907 09:58:52.533122 32 server/init.go:423 ⋮ [T1,n?] 329407  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:53.538184549Z I230907 09:58:53.537919 32 server/init.go:421 ⋮ [T1,n?] 329408  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:58:54.545947851Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409  ‹[core]›‹[Channel #658691 SubChannel #658692] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:54.546014610Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:54.546026452Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:54.546033924Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "Attributes": null,›
2023-09-07T09:58:54.546076970Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "BalancerAttributes": null,›
2023-09-07T09:58:54.546082084Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "Type": 0,›
2023-09-07T09:58:54.546085985Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹  "Metadata": null›
2023-09-07T09:58:54.546090447Z W230907 09:58:54.539543 2668821 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329409 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:54.546094922Z W230907 09:58:54.539752 32 server/init.go:423 ⋮ [T1,n?] 329410  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:55.539818203Z I230907 09:58:55.539570 32 server/init.go:421 ⋮ [T1,n?] 329411  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412  ‹[core]›‹[Channel #658697 SubChannel #658698] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:56.528755121Z W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:56.528770560Z W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "Attributes": null,›
W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "BalancerAttributes": null,›
2023-09-07T09:58:56.528791234Z W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "Type": 0,›
W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹  "Metadata": null›
2023-09-07T09:58:56.528805139Z W230907 09:58:56.528408 2668761 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329412 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:56.529014348Z W230907 09:58:56.528892 32 server/init.go:423 ⋮ [T1,n?] 329413  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:57.539656515Z I230907 09:58:57.539400 32 server/init.go:421 ⋮ [T1,n?] 329414  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:58:58.527212234Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415  ‹[core]›‹[Channel #658703 SubChannel #658704] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:58:58.527339182Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:58.527396601Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:58:58.527404364Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "Attributes": null,›
2023-09-07T09:58:58.527409758Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "BalancerAttributes": null,›
W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "Type": 0,›
2023-09-07T09:58:58.527416761Z W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹  "Metadata": null›
W230907 09:58:58.526939 2668871 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329415 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
W230907 09:58:58.527148 32 server/init.go:423 ⋮ [T1,n?] 329416  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:58:59.549090247Z I230907 09:58:59.548837 32 server/init.go:421 ⋮ [T1,n?] 329417  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:59:00.534414566Z W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418  ‹[core]›‹[Channel #658709 SubChannel #658710] grpc: addrConn.createTransport failed to connect to {›
W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:59:00.534577200Z W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "Attributes": null,›
W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "BalancerAttributes": null,›
W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "Type": 0,›
2023-09-07T09:59:00.534625816Z W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹  "Metadata": null›
W230907 09:59:00.534030 2668887 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329418 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:59:00.534796903Z W230907 09:59:00.534331 32 server/init.go:423 ⋮ [T1,n?] 329419  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:59:01.542034653Z I230907 09:59:01.541725 32 server/init.go:421 ⋮ [T1,n?] 329420  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421  ‹[core]›‹[Channel #658715 SubChannel #658716] grpc: addrConn.createTransport failed to connect to {›
W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:59:02.532332580Z W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:59:02.532337830Z W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "Attributes": null,›
W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "BalancerAttributes": null,›
W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "Type": 0,›
2023-09-07T09:59:02.532348919Z W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹  "Metadata": null›
2023-09-07T09:59:02.532365990Z W230907 09:59:02.531748 2668865 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329421 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:59:02.532482126Z W230907 09:59:02.532316 32 server/init.go:423 ⋮ [T1,n?] 329422  outgoing join rpc to ‹cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257› unsuccessful: ‹rpc error: code = Unavailable desc = connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
2023-09-07T09:59:03.537924557Z I230907 09:59:03.537652 32 server/init.go:421 ⋮ [T1,n?] 329423  ‹cockroachdb-1.cockroachdb.cockroachdb.svc.cluster.local:26257› is itself waiting for init, will retry
2023-09-07T09:59:04.549470464Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424  ‹[core]›‹[Channel #658721 SubChannel #658722] grpc: addrConn.createTransport failed to connect to {›
2023-09-07T09:59:04.549532737Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "Addr": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:59:04.549544886Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "ServerName": "cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local:26257",›
2023-09-07T09:59:04.549549992Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "Attributes": null,›
W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "BalancerAttributes": null,›
2023-09-07T09:59:04.549557816Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "Type": 0,›
2023-09-07T09:59:04.549561553Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹  "Metadata": null›
2023-09-07T09:59:04.549567059Z W230907 09:59:04.548897 2668913 google.golang.org/grpc/grpclog/component.go:41 ⋮ [-] 329424 +‹}. Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local: no such host"›
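
The log above repeats the same few messages every two seconds. A small shell sketch for condensing output like this into one line per hostname that failed DNS resolution, with a count. The function name summarize_dns_failures is hypothetical, and the pattern assumes the "lookup <host>: no such host" wording seen in these lines.

```shell
# Sketch: read log text on stdin and print "<count> <hostname>" for every
# hostname that appears in a "lookup <host>: no such host" message.
summarize_dns_failures() {
  grep -o 'lookup [^ :]*: no such host' \
    | sed 's/^lookup //; s/: no such host$//' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'
}
```

For example, `kubectl logs cockroachdb-0 -c db -n cockroachdb | summarize_dns_failures`, where the pod, container, and namespace names are taken from the pod description earlier in this thread. For the excerpt above, the only hostname that would show up is cockroachdb-2.cockroachdb.cockroachdb.svc.cluster.local.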

chokosabe commented 1 year ago

I can go over this quickly on Slack as well if that helps. I'm just really surprised that no one else is reporting this for one of the major infrastructure providers out there.

prafull01 commented 1 year ago

Hi @chokosabe, you can find me on the Cockroach Labs community Slack under the name Prafull Ladha.

FuriousGopher commented 4 months ago

Hello everybody. @prafull01 @chokosabe Yesterday I tried to start my k8s cluster on DigitalOcean through Argo CD using the Helm chart folder, and I hit the same issue: Err: connection error: desc = "transport: error while dialing: dial tcp: lookup cockroachdb-.cockroachdb.cockroachdb.svc.cluster.local: no such host"›

Can you help me with this?