Closed DelaunayAntoine closed 4 months ago
What's the status of your cluster now? I think you have several issues here; can we try to resolve them one by one?
For this `InvalidImageName`, have you resolved it now?
@csuzhangxc Thank you very much for your response, and I'm sorry for the delay.
The issue in this screenshot has been resolved; it was just a problem with the name of my image and the repo for the image. The screenshot was not up to date.
Here is what it looks like now:
As you can see, it is now in CrashLoopBackOff. The first thing I did was describe the pods to look at the events. Here is what they look like:
`Events: Type Reason Age From Message`
`Warning Unhealthy 40m (x1210 over 23h) kubelet Readiness probe failed: dial tcp 192.168.1.74:4000: connect: connection refused`
`Normal Started 10m (x241 over 23h) kubelet Started container tidb`
`Warning BackOff 27s (x5581 over 23h) kubelet Back-off restarting failed container tidb in pod basic-tidb-0_tidb-operator(bdaef247-eb9d-4abc-8128-cc7d02e29a68)`
It comes back to the internal communication between the different components inside TiDB. I don't know why there seems to be a connection issue. I use other applications, and they can communicate inside the Kubernetes cluster without problems. I think it might be coming from my configuration, which may be wrong.
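A "connection refused" on the readiness probe means nothing was accepting TCP connections on that address and port at probe time. A minimal sketch of the same kind of check, assuming Python is available wherever you run it (for example in a debug pod); `can_connect` is a hypothetical helper name, not part of TiDB or kubectl:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (e.g. ConnectionRefusedError or a timeout) if it fails.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the event above, the call would be `can_connect("192.168.1.74", 4000)`. A `False` result only says the port is not reachable from where the check runs; it does not say whether the process crashed or never started listening.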
Is there any useful information in the TiDB Pods' logs?
?
Sorry for the delay.
Here are the TiDB logs:
start tidb-server ... /tidb-server --store=tikv --advertise-address=basic-tidb-0.basic-tidb-peer.tidb-operator.svc --host=0.0.0.0 --path=basic-pd:2379 --config=/etc/tidb/tidb.toml --log-slow-query=/var/log/tidb/slowlog [2024/07/15 06:00:12.426 +00:00] [INFO] [cgroup_cpu_linux.go:96] ["TiDB runs in a container, mount info: 3734 3606 0:416 / / rw,relatime master:976 - overlay overlay rw,lowerdir=/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/22170/fs:/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/22169/fs:/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/22168/fs:/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/22167/fs:/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/22166/fs,upperdir=/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/29592/fs,workdir=/var/lib/rancher/rke2/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/29592/work"] [2024/07/15 06:00:12.441 +00:00] [INFO] [printer.go:47] ["Welcome to TiDB."] ["Release Version"=v8.1.0] [Edition=Community] ["Git Commit Hash"=945d07c5d5c7a1ae212f6013adfb187f2de24b23] ["Git Branch"=HEAD] ["UTC Build Time"="2024-05-21 03:51:57"] [GoVersion=go1.21.10] ["Race Enabled"=false] ["Check Table Before Drop"=false] [2024/07/15 06:00:12.441 +00:00] [INFO] [cgmon.go:130] ["set the maxprocs"] [quota=8] [2024/07/15 06:00:12.442 +00:00] [INFO] [printer.go:52] ["loaded config"] 
[config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"basic-tidb-0.basic-tidb-peer.tidb-operator.svc\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"basic-pd:2379\",\"socket\":\"/tmp/tidb-4000.sock\",\"lease\":\"45s\",\"split-table\":true,\"token-limit\":1000,\"temp-dir\":\"/tmp/tidb\",\"tmp-storage-path\":\"/tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage\",\"tmp-storage-quota\":-1,\"server-version\":\"\",\"version-comment\":\"\",\"tidb-edition\":\"\",\"tidb-release-version\":\"\",\"keyspace-name\":\"\",\"log\":{\"level\":\"error\",\"format\":\"text\",\"disable-timestamp\":null,\"enable-timestamp\":null,\"disable-error-stack\":null,\"enable-error-stack\":null,\"file\":{\"filename\":\"\",\"max-size\":300,\"max-days\":0,\"max-backups\":3,\"compression\":\"\"},\"slow-query-file\":\"/var/log/tidb/slowlog\",\"expensive-threshold\":10000,\"general-log-file\":\"\",\"query-log-max-len\":4096,\"enable-slow-log\":true,\"slow-threshold\":300,\"record-plan-in-slow-log\":1,\"timeout\":0},\"instance\":{\"tidb_general_log\":false,\"tidb_pprof_sql_cpu\":false,\"ddl_slow_threshold\":300,\"tidb_expensive_query_time_threshold\":60,\"tidb_expensive_txn_time_threshold\":600,\"tidb_stmt_summary_enable_persistent\":false,\"tidb_stmt_summary_filename\":\"tidb-statements.log\",\"tidb_stmt_summary_file_max_days\":3,\"tidb_stmt_summary_file_max_size\":64,\"tidb_stmt_summary_file_max_backups\":0,\"tidb_enable_slow_log\":true,\"tidb_slow_log_threshold\":300,\"tidb_record_plan_in_slow_log\":1,\"tidb_check_mb4_value_in_utf8\":true,\"tidb_force_priority\":\"NO_PRIORITY\",\"tidb_memory_usage_alarm_ratio\":0.8,\"tidb_enable_collect_execution_info\":true,\"plugin_dir\":\"/data/deploy/plugin\",\"plugin_load\":\"\",\"max_connections\":0,\"tidb_enable_ddl\":true,\"tidb_rc_read_check_ts\":false,\"tidb_service_scope\":\"\"},\"security\":{\"skip-grant-table\":true,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key
\":\"\",\"cluster-verify-cn\":null,\"session-token-signing-cert\":\"\",\"session-token-signing-key\":\"\",\"spilled-file-encryption-method\":\"plaintext\",\"enable-sem\":false,\"auto-tls\":false,\"tls-version\":\"\",\"rsa-key-size\":4096,\"secure-bootstrap\":false,\"auth-token-jwks\":\"\",\"auth-token-refresh-interval\":\"1h0m0s\",\"disconnect-on-expired-password\":true},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false,\"record-db-label\":false,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"grpc-concurrent-streams\":1024,\"grpc-initial-window-size\":2097152,\"grpc-max-send-msg-size\":2147483647},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"server-memory-quota\":0,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"pseudo-estimate-ratio\":0.8,\"bind-info-lease\":\"3s\",\"txn-entry-size-limit\":6291456,\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"tcp-no-delay\":true,\"cross-join\":true,\"distinct-agg-push-down\":false,\"projection-push-down\":false,\"max-txn-ttl\":3600000,\"index-usage-sync-lease\":\"\",\"plan-replayer-gc-lease\":\"10m\",\"gogc\":100,\"enforce-mpp\":false,\"stats-load-concurrency\":5,\"stats-load-queue-size\":1000,\"analyze-partition-concurrency-quota\":16,\"plan-replayer-dump-worker-concurrency\":1,\"enable-stats-cache-mem-quota\":true,\"committer-concurrency\":128,\"run-auto-analyze\":true,\"force-priority\":\"NO_PRIORITY\",\"memory-usage-alarm-ratio\":0.8,\"enable-load-fmsketch\":false,\"lite-init-stats\":true,\"force-init-stats\":true,\"concurrently-init-stats\":false},\"prepared-plan-cache\":{\"enabled\":true,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false
,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5,\"fallbackable\":false},\"pd-client\":{\"pd-server-timeout\":3},\"tikv-client\":{\"grpc-connection-count\":4,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"grpc-compression-type\":\"none\",\"grpc-shared-buffer-pool\":false,\"grpc-initial-window-size\":134217728,\"grpc-initial-conn-window-size\":134217728,\"commit-timeout\":\"41s\",\"async-commit\":{\"keys-limit\":256,\"total-key-size-limit\":4096,\"safe-window\":2000000000,\"allowed-clock-drift\":500000000},\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8,\"enable-chunk-rpc\":true,\"region-cache-ttl\":600,\"store-limit\":0,\"store-liveness-timeout\":\"1s\",\"copr-cache\":{\"capacity-mb\":1000},\"copr-req-timeout\":60000000000,\"ttl-refreshed-txn-size\":33554432,\"resolve-lock-lite-threshold\":16,\"max-concurrency-request-limit\":9223372036854775807,\"enable-replica-selector-v2\":true},\"binlog\":{\"enable\":false,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"pessimistic-txn\":{\"max-retry-count\":256,\"deadlock-history-capacity\":10,\"deadlock-history-collect-retryable\":false,\"pessimistic-auto-commit\":false,\"constraint-check-in-place-pessimistic\":true},\"max-index-length\":3072,\"index-limit\":64,\"table-column-count-limit\":1017,\"graceful-wait-before-shutdown\":0,\"alter-primary-key\":false,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"top-sql\":{\"receiver-address\":\"\"},\"repair-mode\":false,\"repair-table-list\":[],\"isolation-read\":{\"engines\":[\"tikv\",\"tiflash\",\"tidb\"]},\"new_collations_enabled_on_first_bootstrap\":true,\"experimental\":{\"allow-expression-index\":false},\"skip-register-to-dashboard\":false,\"enable-telemetry\":false,\"labels\":{},\"enable-global-index\":false,\"
deprecate-integer-display-length\":false,\"enable-enum-length-limit\":true,\"stores-refresh-interval\":60,\"enable-tcp4-only\":false,\"enable-forwarding\":false,\"max-ballast-object-size\":0,\"ballast-object-size\":0,\"transaction-summary\":{\"transaction-summary-capacity\":500,\"transaction-id-digest-min-duration\":2147483647},\"enable-global-kill\":true,\"enable-32bits-connection-id\":true,\"initialize-sql-file\":\"\",\"enable-batch-dml\":false,\"mem-quota-query\":1073741824,\"oom-action\":\"cancel\",\"oom-use-tmp-storage\":true,\"check-mb4-value-in-utf8\":true,\"enable-collect-execution-info\":true,\"plugin\":{\"dir\":\"/data/deploy/plugin\",\"load\":\"\"},\"max-server-connections\":0,\"run-ddl\":true,\"disaggregated-tiflash\":false,\"autoscaler-type\":\"aws\",\"autoscaler-addr\":\"tiflash-autoscale-lb.tiflash-autoscale.svc.cluster.local:8081\",\"is-tiflashcompute-fixed-pool\":false,\"autoscaler-cluster-id\":\"\",\"use-autoscaler\":false,\"tidb-max-reuse-chunk\":64,\"tidb-max-reuse-column\":256,\"tidb-enable-exit-check\":false,\"in-mem-slow-query-topn-num\":30,\"in-mem-slow-query-recent-num\":500}"] [2024/07/15 06:01:02.829 +00:00] [FATAL] [terror.go:309] ["unexpected error"] [error="[tikv:9005]Region is unavailable"] [stack="github.com/pingcap/tidb/pkg/parser/terror.MustNil\n\t/workspace/source/tidb/pkg/parser/terror/terror.go:309\nmain.createStoreAndDomain\n\t/workspace/source/tidb/cmd/tidb-server/main.go:421\nmain.main\n\t/workspace/source/tidb/cmd/tidb-server/main.go:326\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"] [stack="github.com/pingcap/tidb/pkg/parser/terror.MustNil\n\t/workspace/source/tidb/pkg/parser/terror/terror.go:309\nmain.createStoreAndDomain\n\t/workspace/source/tidb/cmd/tidb-server/main.go:421\nmain.main\n\t/workspace/source/tidb/cmd/tidb-server/main.go:326\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"]
As far as I can make out, it's because TiDB can't find a TiKV region. This seems normal to me, as TiKV can't start up either because it can't connect to the PD endpoint.
If we fix the internal addressing between PD and TiKV so that they can recognize each other, we fix the endpoint problem between PD and TiKV.
But I can't find a way to get TiKV to connect to the PD endpoint.
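Because the startup log is dominated by the dumped config, the one FATAL entry is easy to miss. A minimal sketch that pulls out only the ERROR and FATAL entries, assuming the bracketed `[timestamp] [LEVEL] [source] ["message"]` layout seen in the log above; `severe_entries` is a hypothetical helper, not a TiDB tool:

```python
import re

# One entry header: [timestamp] [LEVEL] [file.go:line] ["message"]
ENTRY = re.compile(
    r'\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2})\] '
    r'\[(?P<level>[A-Z]+)\] '
    r'\[(?P<source>[^\]]+)\] '
    r'\["(?P<msg>[^"]*)"\]'
)


def severe_entries(log_text, levels=("ERROR", "FATAL")):
    """Return (timestamp, level, source, message) for entries at the given levels."""
    return [
        (m["ts"], m["level"], m["source"], m["msg"])
        for m in ENTRY.finditer(log_text)
        if m["level"] in levels
    ]
```

Run over the TiDB log above, this should leave the `terror.go:309` "unexpected error" entry; the accompanying `error=` field (here `[tikv:9005]Region is unavailable`) still has to be read from the full line.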
Can you show the log of TiKV again?
Of course :
try the method in #5372 (comment)
Thanks, I'm trying this, but now the PD pods show a fatal error:
domain resolve basic-pd-0.basic-pd-peer.tidb-operator.svc success 192.168.1.71 hostIps: 192.168.1.71 resolvedIps: 192.168.1.71 Success: Resolved IP matches one of podIPs starting pd-server ... /pd-server services api --data-dir=/var/lib/pd --name=basic-pd-0 --peer-urls=http://0.0.0.0:2380 --advertise-peer-urls=http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380 --client-urls=http://0.0.0.0:2379 --advertise-client-urls=http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2379 --config=/etc/pd/pd.toml --join=http://basic-pd-1.basic-pd-peer.tidb-operator.svc:2380,http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380 [2024/07/15 09:05:38.592 +00:00] [INFO] [meminfo.go:213] ["use physical memory hook"] [cgroupMemorySize=9223372036854775807] [physicalMemorySize=16765227008] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:98] ["Welcome to Placement Driver (API SERVICE)"] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:99] ["API SERVICE"] [release-version=v8.1.0] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:100] ["API SERVICE"] [edition=Community] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:101] ["API SERVICE"] [git-hash=fca469ca33eb5d8b5e0891b507c87709a00b0e81] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:102] ["API SERVICE"] [git-branch=HEAD] [2024/07/15 09:05:38.592 +00:00] [INFO] [versioninfo.go:103] ["API SERVICE"] [utc-build-time="2024-05-09 02:15:45"] [2024/07/15 09:05:38.592 +00:00] [INFO] [metricutil.go:86] ["disable Prometheus push client"] [2024/07/15 09:05:38.595 +00:00] [INFO] [server.go:255] ["API Service config"] 
[config="{\"client-urls\":\"http://0.0.0.0:2379\",\"peer-urls\":\"http://0.0.0.0:2380\",\"advertise-client-urls\":\"http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2379\",\"advertise-peer-urls\":\"http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380\",\"name\":\"basic-pd-0\",\"data-dir\":\"/var/lib/pd\",\"force-new-cluster\":false,\"enable-grpc-gateway\":true,\"initial-cluster\":\"basic-pd-1=http://basic-pd-1.basic-pd-peer.tidb-operator.svc:2380,basic-pd-0=http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380\",\"initial-cluster-state\":\"existing\",\"initial-cluster-token\":\"pd-cluster\",\"join\":\"http://basic-pd-1.basic-pd-peer.tidb-operator.svc:2380,http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380\",\"lease\":3,\"log\":{\"level\":\"info\",\"format\":\"text\",\"disable-timestamp\":false,\"file\":{\"filename\":\"\",\"max-size\":0,\"max-days\":0,\"max-backups\":0},\"development\":false,\"disable-caller\":false,\"disable-stacktrace\":false,\"disable-error-verbose\":true,\"sampling\":null,\"error-output-path\":\"\"},\"max-concurrent-tso-proxy-streamings\":5000,\"tso-proxy-recv-from-client-timeout\":\"1h0m0s\",\"tso-save-interval\":\"3s\",\"tso-update-physical-interval\":\"50ms\",\"enable-local-tso\":false,\"metric\":{\"job\":\"basic-pd-0\",\"address\":\"\",\"interval\":\"15s\"},\"schedule\":{\"max-snapshot-count\":64,\"max-pending-peer-count\":64,\"max-merge-region-size\":20,\"max-merge-region-keys\":0,\"split-merge-interval\":\"1h0m0s\",\"switch-witness-interval\":\"1h0m0s\",\"enable-one-way-merge\":\"false\",\"enable-cross-table-merge\":\"true\",\"patrol-region-interval\":\"10ms\",\"max-store-down-time\":\"30m0s\",\"max-store-preparing-time\":\"48h0m0s\",\"leader-schedule-limit\":4,\"leader-schedule-policy\":\"count\",\"region-schedule-limit\":2048,\"witness-schedule-limit\":4,\"replica-schedule-limit\":64,\"merge-schedule-limit\":8,\"hot-region-schedule-limit\":4,\"hot-region-cache-hits-threshold\":3,\"store-limit\":{},\"tolerant-size-ratio\":0,\"low-s
pace-ratio\":0.8,\"high-space-ratio\":0.7,\"region-score-formula-version\":\"v2\",\"scheduler-max-waiting-operator\":5,\"enable-remove-down-replica\":\"true\",\"enable-replace-offline-replica\":\"true\",\"enable-make-up-replica\":\"true\",\"enable-remove-extra-replica\":\"true\",\"enable-location-replacement\":\"true\",\"enable-debug-metrics\":\"false\",\"enable-joint-consensus\":\"true\",\"enable-tikv-split-region\":\"true\",\"enable-heartbeat-breakdown-metrics\":\"true\",\"schedulers-v2\":[{\"type\":\"balance-region\",\"args\":null,\"disable\":false,\"args-payload\":\"\"},{\"type\":\"balance-leader\",\"args\":null,\"disable\":false,\"args-payload\":\"\"},{\"type\":\"hot-region\",\"args\":null,\"disable\":false,\"args-payload\":\"\"},{\"type\":\"evict-slow-store\",\"args\":null,\"disable\":false,\"args-payload\":\"\"}],\"schedulers-payload\":null,\"hot-regions-write-interval\":\"10m0s\",\"hot-regions-reserved-days\":7,\"max-movable-hot-peer-size\":512,\"enable-diagnostic\":\"true\",\"enable-witness\":\"false\",\"slow-store-evicting-affected-store-ratio-threshold\":0.3,\"store-limit-version\":\"v1\"},\"replication\":{\"max-replicas\":3,\"location-labels\":\"\",\"strictly-match-label\":\"false\",\"enable-placement-rules\":\"true\",\"enable-placement-rules-cache\":\"false\",\"isolation-level\":\"\"},\"pd-server\":{\"use-region-storage\":\"true\",\"max-gap-reset-ts\":\"24h0m0s\",\"key-type\":\"table\",\"runtime-services\":\"\",\"metric-storage\":\"\",\"dashboard-address\":\"auto\",\"flow-round-by-digit\":3,\"min-resolved-ts-persistence-interval\":\"1s\",\"server-memory-limit\":0,\"server-memory-limit-gc-trigger\":0.7,\"enable-gogc-tuner\":\"false\",\"gc-tuner-threshold\":0.6,\"block-safe-point-v1\":\"false\"},\"cluster-version\":\"0.0.0\",\"labels\":{},\"quota-backend-bytes\":\"8GiB\",\"auto-compaction-mode\":\"periodic\",\"auto-compaction-retention-v2\":\"1h\",\"TickInterval\":\"500ms\",\"ElectionInterval\":\"3s\",\"PreVote\":true,\"max-request-bytes\":157286400,\"sec
urity\":{\"cacert-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":null,\"SSLCABytes\":null,\"SSLCertBytes\":null,\"SSLKEYBytes\":null,\"redact-info-log\":false,\"encryption\":{\"data-encryption-method\":\"plaintext\",\"data-key-rotation-period\":\"168h0m0s\",\"master-key\":{\"type\":\"plaintext\",\"key-id\":\"\",\"region\":\"\",\"endpoint\":\"\",\"path\":\"\"}}},\"label-property\":null,\"WarningMsgs\":null,\"DisableStrictReconfigCheck\":false,\"HeartbeatStreamBindInterval\":\"1m0s\",\"LeaderPriorityCheckInterval\":\"1m0s\",\"dashboard\":{\"tidb-cacert-path\":\"\",\"tidb-cert-path\":\"\",\"tidb-key-path\":\"\",\"public-path-prefix\":\"\",\"internal-proxy\":false,\"enable-telemetry\":false,\"enable-experimental\":false},\"replication-mode\":{\"replication-mode\":\"majority\",\"dr-auto-sync\":{\"label-key\":\"\",\"primary\":\"\",\"dr\":\"\",\"primary-replicas\":0,\"dr-replicas\":0,\"wait-store-timeout\":\"1m0s\",\"wait-recover-timeout\":\"0s\",\"pause-region-split\":\"false\"}},\"keyspace\":{\"pre-alloc\":null,\"wait-region-split\":true,\"wait-region-split-timeout\":\"30s\",\"check-region-split-interval\":\"50ms\"},\"micro-service\":{\"enable-scheduling-fallback\":\"true\"},\"controller\":{\"degraded-mode-wait-duration\":\"0s\",\"ltb-max-wait-duration\":\"30s\",\"request-unit\":{\"read-base-cost\":0.125,\"read-per-batch-base-cost\":0.5,\"read-cost-per-byte\":0.0000152587890625,\"write-base-cost\":1,\"write-per-batch-base-cost\":1,\"write-cost-per-byte\":0.0009765625,\"read-cpu-ms-cost\":0.3333333333333333},\"enable-controller-trace-log\":\"false\"}}"] [2024/07/15 09:05:38.604 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/pd/api/v1] [2024/07/15 09:05:38.604 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/pd/api/v2/] [2024/07/15 09:05:38.604 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/autoscaling] [2024/07/15 09:05:38.605 +00:00] [INFO] [distro.go:51] ["using distribution strings"] [strings={}] 
[2024/07/15 09:05:38.608 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/dashboard/api/] [2024/07/15 09:05:38.608 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/dashboard/] [2024/07/15 09:05:38.608 +00:00] [INFO] [registry.go:92] ["restful API service registered successfully"] [prefix=basic-pd-0] [service-name=MetaStorage] [2024/07/15 09:05:38.609 +00:00] [INFO] [apiutil.go:413] ["register REST path"] [path=/resource-manager/api/v1/] [2024/07/15 09:05:38.609 +00:00] [INFO] [registry.go:92] ["restful API service registered successfully"] [prefix=basic-pd-0] [service-name=ResourceManager] [2024/07/15 09:05:38.610 +00:00] [WARN] [config.go:622] ["Running http and grpc server on single port. This is not recommended for production."] [2024/07/15 09:05:38.610 +00:00] [INFO] [etcd.go:120] ["configuring peer listeners"] [listen-peer-urls="[http://0.0.0.0:2380]"] [2024/07/15 09:05:38.610 +00:00] [INFO] [systimemon.go:30] ["start system time monitor"] [2024/07/15 09:05:38.611 +00:00] [ERROR] [etcd.go:543] ["creating peer listener failed"] [error="listen tcp 0.0.0.0:2380: bind: address already in use"] [2024/07/15 09:05:38.611 +00:00] [INFO] [etcd.go:375] ["closing etcd server"] [name=basic-pd-0] [data-dir=/var/lib/pd] [advertise-peer-urls="[http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380]"] [advertise-client-urls="[http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2379]"] [2024/07/15 09:05:38.611 +00:00] [INFO] [etcd.go:379] ["closed etcd server"] [name=basic-pd-0] [data-dir=/var/lib/pd] [advertise-peer-urls="[http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2380]"] [advertise-client-urls="[http://basic-pd-0.basic-pd-peer.tidb-operator.svc:2379]"] [2024/07/15 09:05:38.611 +00:00] [FATAL] [main.go:282] ["run server failed"] [error="[PD:etcd:ErrStartEtcd]listen tcp 0.0.0.0:2380: bind: address already in use: listen tcp 0.0.0.0:2380: bind: address already in use"] 
[stack="main.start\n\t/workspace/source/pd/cmd/pd-server/main.go:282\nmain.createAPIServerWrapper\n\t/workspace/source/pd/cmd/pd-server/main.go:183\ngithub.com/spf13/cobra.(*Command).execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\nmain.main\n\t/workspace/source/pd/cmd/pd-server/main.go:71\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:267"]
listen tcp 0.0.0.0:2380: bind: address already in use
Are you using hostNetwork and have more than one PD pods on a single node?
I use hostNetwork, but I have 3 masters and 6 workers. The deployment is automated; I don't touch anything in particular for the assignment of the pods.
I didn't get this error before, which is why it's surprising to get one now.
bind: address already in use
It means another process is using port 2380 (it may be used by another PD process or anything else).
How many TidbCluster are there in your K8s cluster?
Could you delete this TidbCluster and re-deploy again?
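With hostNetwork, a pod shares the node's network namespace, so any process on that node that already holds port 2380 will make the bind fail. A minimal sketch of a check for this, assuming it is run on the node itself (or in another hostNetwork pod on it); `port_is_free` is a hypothetical helper name:

```python
import errno
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind host:port; return False if the address is already in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError as exc:
            # EADDRINUSE is the same condition PD reports as
            # "bind: address already in use".
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # other failures (e.g. permissions) are not hidden
```

`port_is_free(2380)` returning `False` on a node confirms that something there is already bound to the peer port, but it does not identify which process holds it.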
Hi guys, sorry for the delay; I was trying different things.
Could you delete this TidbCluster and re-deploy again?
I tried to do this, but it didn't work because my pods always gave the error "bind: address already in use".
So I tried modifying my deployment manifest with `GRPC_DNS_RESOLVER: native` from the comment, deploying only 3 PD pods, 3 TiKV pods and 3 TiDB pods.
As a result, the cluster launched without a hitch.
TiDB launched directly, PD found addresses to bind to, and TiKV had no problem launching.
With this in mind, I decided to add TiFlash and TiCDC (3 pods each).
TiCDC launched correctly and seemed to find PD quickly. Here are the logs : ticdc(1).txt ticdc.txt
But TiFlash failed to find an endpoint for PD. Here are the logs: tiflash(3).txt
So I'm going to continue my research to get it working properly while waiting for your feedback, but thank you very much for your help, I'm making a lot of progress.
The TiFlash error is similar to the TiKV one, so I think you can try adding `GRPC_DNS_RESOLVER: native` for TiFlash.
Yes, they were actually the same; adding `GRPC_DNS_RESOLVER: native` resolved the problem.
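For anyone applying the same change, here is a rough sketch of adding the variable to several components of a TidbCluster manifest loaded as a Python dict. It assumes each component spec takes an `env` list of `name`/`value` entries in the usual Kubernetes form; that field layout and the helper name `add_env` are assumptions for illustration, not taken from this thread:

```python
import copy


def add_env(cluster: dict, components, name: str, value: str) -> dict:
    """Return a copy of the manifest with an env var added to each component."""
    patched = copy.deepcopy(cluster)  # leave the caller's manifest untouched
    for comp in components:
        # Assumed layout: spec.<component>.env is a list of {name, value}.
        spec = patched.setdefault("spec", {}).setdefault(comp, {})
        env = spec.setdefault("env", [])
        if not any(entry.get("name") == name for entry in env):
            env.append({"name": name, "value": value})
    return patched
```

For example, `add_env(manifest, ["tikv", "tiflash"], "GRPC_DNS_RESOLVER", "native")` would cover the two components discussed here; the result still needs to be serialized back to YAML and applied.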
I guess all my problems are gone now with your help; thank you very much. If I have any more questions, I will ask in this channel.
How about closing this issue and opening a new one if you have other questions?
Yeah, sure, no problem with that.
Bug Report
What version of Kubernetes are you using? Client 1.22.6 Server 1.26.1
What version of TiDB Operator are you using? TiDB Operator 1.6.0
What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods? Nfs-client (custom storage class)
What's the status of the TiDB cluster pods?
What did you do? Deploy the tidb-operator with advanced statefulset enabled :
Deploy a basic tidb cluster with pd as ms :
What did you expect to see? I expected my TiDB pods to start straight away, or at least to understand why they take so long to start and why my pods can't communicate.
What did you see instead? I've seen my TiDB pods launch two days later or never launch at all. I've seen my pods never communicate with each other.
Hello, I may have done a few things wrong with my deployment, please don't hesitate to correct me. I'm currently trying to deploy TiDB on kubernetes with the tidb-operator but I'm encountering several problems and I can't solve them. So I'm turning to you for help. I'm encountering two problems:
The first is that the TiDB pods don't launch when the TiDB cluster starts up, so I have no way of interacting with them. To get them to launch correctly, I have to wait two full days for the installation to work. This doesn't seem optimal to me, nor does it match the quickstart in your documentation. I really don't understand why TiDB pods react the way they do to a basic configuration.
Secondly, I have a problem with the internal configuration of the pods, which can't communicate with each other at all, and this is a real problem for debugging. My PD pods can't interact with my TiKV pods. I've tried debugging with this link https://docs.pingcap.com/tidb-in-kubernetes/stable/network-issues but can't find a solution, except that my pods return "connection refused" when I try to curl their internal IP addresses.
In the following example, I'm also trying to deploy PD as a microservice (perhaps my deployment isn't right), but I've also tried without setting PD as a microservice and that didn't work either.
For operator deployment, I use the chart you provided. I also work offline, so I have to download my images and charts.
In the next section you will find the logs of my various components :
Advanced-StateFulSet logs : advanced-statefulset-controller.txt
Controller-Manager logs : tidb-operator.txt
Discovery logs : discovery.txt
PD logs (for the 3 pods) : pd(1).txt pd(2).txt pd.txt
Tso logs : tso(1).txt tso.txt
Scheduling logs : scheduling.txt
TiKV logs (for the 3 pods): tikv(1).txt tikv(2).txt tikv.txt
TiFlash logs: tiflash(1).txt tiflash(2).txt tiflash.txt
TiProxy logs : tiproxy(1).txt tiproxy(2).txt tiproxy.txt
Here is my manifest for my deployment :
For tidb-operator : values-tidb-operator.txt
For my cluster : basic-deploy-tidb-cluster.txt