yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com

postgres uses 9100 on the server_broadcast_addresses #16146

Open christiancadieux opened 1 year ago

christiancadieux commented 1 year ago

Jira Link: DB-5582

Description

ysqlsh cannot connect

postgres uses the IP of the server_broadcast_addresses with the default port of the rpc_bind_addresses (9100).

I used the charts to create a 3-node cluster. I changed all the references to port 9100 in the services and overrides, but port 9100 is still used - it seems to be hard-coded somewhere:

# master
--rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
--server_broadcast_addresses=10.27.49.5:7100 \
...
# tserver
 --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
 --server_broadcast_addresses=10.27.49.5:7200 \

ERROR:

I0217 04:33:07.746546   507 pg_client.cc:125] Using TServer host_port: 10.27.49.5:9100
I0217 04:33:07.747642   512 tcp_stream.cc:322] { local: 192.168.106.50:44844 remote: 10.27.49.5:9100 }:  Recv failed: Network error (yb/util/net/socket.cc:540): recvmsg error: Connection refused (system error 111)

chart that I used:

Services:

ddorian commented 1 year ago

Hi @christiancadieux

You can change the port with the rpc_bind_addresses flag: https://docs.yugabyte.com/preview/reference/configuration/yb-tserver/#rpc-bind-addresses
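
For example, something along these lines (the host and port values here are only illustrative):

# illustrative: bind the tserver RPC endpoint to an explicit, non-default port
--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200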

christiancadieux commented 1 year ago

I have this:

# master
--rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
--server_broadcast_addresses=10.27.49.5:7100 \
...
# tserver
 --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
 --server_broadcast_addresses=10.27.49.5:7200 \

and --use_private_ip=region

the tserver is at:

yb-tserver-0                     2/2     Running   0            8h    192.168.106.50

but the error is

I0217 04:33:07.746546 507 pg_client.cc:125] Using TServer host_port: 10.27.49.5:9100
I0217 04:33:07.747642 512 tcp_stream.cc:322] { local: 192.168.106.50:44844 remote: 10.27.49.5:9100 }: Recv failed: Network error (yb/util/net/socket.cc:540): recvmsg error: Connection refused (system error 111)

which suggests that it's trying to use the external IP (10.27.49.5:9100), so it's using the IP of the server_broadcast_addresses and the default port of the rpc_bind_addresses (I guess).

It should not use the external IP since it's in the same region - or what am I missing?

ddorian commented 1 year ago

Use 7100 for yb-master & 9100 for yb-tserver. Those are the defaults, at least: https://docs.yugabyte.com/preview/reference/configuration/default-ports/#internode-rpc-communication

christiancadieux commented 1 year ago

I know the default is 9100, but I am not allowed to use ports above 9000 in my company; that's why I changed it in the overrides:

  - name: "yb-masters"
    label: "yb-master"
    skipHealthChecks: false
    memory_limit_to_ram_ratio: 0.85
    ports:
      http-ui: "7000"
      tcp-rpc-port: "7100"

  - name: "yb-tservers"
    label: "yb-tserver"
    skipHealthChecks: false
    ports:
      tcp-rpc-port: "7200"
      tcp-yql-port: "7300"
      ...

Is there an override for this rpc_bind_addresses port? My rpc_bind_addresses is

--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \

but it's calling 10.27.49.5:9100, so it does not seem to be using the rpc_bind_addresses; it's using the server_broadcast_addresses with the default port of the rpc_bind_addresses. It can use $(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:9100 (that's fine, that's internal to the cluster), but it cannot use 10.27.49.5:9100 (that's a port on an external IP, which is not allowed and does not exist - and why is it trying to use an external port to communicate internally with a pod?). This error occurred when I tried to run ysqlsh from inside the pod.

christiancadieux commented 1 year ago

tried different scenarios: --use_node_hostname_for_local_tserver=false

UI error

Error retrieving leader master URL: http://10.60.7.13:7000/?raw Error: Network error (yb/util/curl_util.cc:57): curl error: Couldn't connect to server.

change port

$ egrep 'use_node|rpc_bind' zone*
zone1.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
zone1.yaml: --use_node_hostname_for_local_tserver=true \
zone1.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
zone2.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
zone2.yaml: --use_node_hostname_for_local_tserver=true \
zone2.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7201 \
zone3.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
zone3.yaml: --use_node_hostname_for_local_tserver=true \
zone3.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \

UI still in error - ysqlsh hangs:

ysqlsh -h yb-tserver-0
ysqlsh: FATAL: Network error: Connect timeout Connection (0x00000000025cf8d8) client 192.168.107.29:43864 => 10.27.49.5:7200, passed: 15.000s, timeout: 15.000s: kConnectFailed

tservers are active:

Tablet Server UUID | RPC Host/Port | Heartbeat delay | Status | Reads/s | Writes/s | Uptime | SST total size | SST uncomp size | SST #files | Memory | Broadcast Host/Port
517ce0ead1a14e0aa58fce08286671be | yb-tserver-0.yb-tservers.rdei-yb-zone3.svc.cluster.local:7200 | 0.94s | ALIVE | 0.00 | 0.00 | 26 | 0 B | 0 B | 0 | 29.36 MB | 10.60.7.13:7200
e03e4da53c9d4c3fa58d23dc41954ed4 | yb-tserver-0.yb-tservers.rdei-yb-zone1.svc.cluster.local:7200 | 0.93s | ALIVE | 0.00 | 0.00 | 26 | 0 B | 0 B | 0 | 29.36 MB | 10.27.49.5:7200
de2147fefed94249ac49dbff13ec1f86 | yb-tserver-0.yb-tservers.rdei-yb-zone2.svc.cluster.local:7201 | 0.06s | ALIVE | 0.00 | 0.00 | 25 | 0 B | 0 B | 0 | 14.68 MB | 10.27.49.5:7201

ddorian commented 1 year ago

Please check the logs and upload them here. 9100 is for inter-node RPC connections; 5433 is for the YSQL layer.
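
For example, a client connection from inside the pod should go to the YSQL port, something like this (the host name here is only an illustration):

# connect to the YSQL layer on 5433, not the internode RPC port 9100
ysqlsh -h yb-tserver-0.yb-tservers.$(NAMESPACE).svc.cluster.local -p 5433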

christiancadieux commented 1 year ago

the tserver is configured with:
              --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
              --server_broadcast_addresses=10.27.49.5:7200 \
              --webserver_interface=0.0.0.0 \
              --enable_ysql=true \
              --pgsql_proxy_bind_address=0.0.0.0:5433 \

but the postgres error in the tserver is:

I0217 04:33:07.746546   507 pg_client.cc:125] Using TServer host_port: 10.27.49.5:9100
I0217 04:33:07.747642   512 tcp_stream.cc:322] { local: 192.168.106.50:44844 remote: 10.27.49.5:9100 }: 

it's trying to use the external IP of the tserver pod (10.27.49.5) from inside that same tserver, on port 9100. 2 problems with that:

christiancadieux commented 1 year ago

I was able to fix postgres by changing

FROM:
--server_broadcast_addresses=10.27.49.5:7200 \ 

TO:
--server_broadcast_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 

then I can run ysqlsh and get the prompt: tables can be created, but the CLI hangs since tablets are not accessible.

seems to be in the same category of problems as, for example, https://github.com/yugabyte/yugabyte-db/issues/11603

christiancadieux commented 1 year ago

dump_master_state

I0219 16:00:27.148634 31 reactor.cc:467] Master_R000: DEBUG: Closing idle connection: Connection (0x000055b7eca9e018) server 192.168.106.17:33919 => 192.168.106.17:7100 - it has been idle for 65.0015s
I0219 16:00:43.148783 33 reactor.cc:467] Master_R002: DEBUG: Closing idle connection: Connection (0x000055b7eb7d1578) server 192.168.106.17:38375 => 192.168.106.17:7100 - it has been idle for 65.0995s
I0219 16:00:57.712122 149 master_cluster_service.cc:202] Follower Master 96ed2a36eb184f539659eb890fc49234
I0219 16:00:57.712208 149 master_cluster_service.cc:203] Dumping current state of master.
Namespaces:
Tables:
Master options : yb-master-0.yb-masters.rdei-yb-zone1.svc.cluster.local:7100, yb-master-0.yb-masters.rdei-yb-zone2.svc.cluster.local:7100, 10.60.7.13:7100
Current raft config: current_term: 1 leader_uuid: "1fe370c0712249e3a5cf54200875dc8a"
config {
  opid_index: -1
  peers {
    permanent_uuid: "96ed2a36eb184f539659eb890fc49234"
    member_type: VOTER
    last_known_private_addr { host: "yb-master-0.yb-masters.rdei-yb-zone1.svc.cluster.local" port: 7100 }
    last_known_broadcast_addr { host: "10.27.49.5" port: 7100 }
    cloud_info { placement_cloud: "rdei" placement_region: "region1" placement_zone: "zone1" }
  }
  peers {
    permanent_uuid: "f5c684c236da421ba5b04b81964ced73"
    member_type: VOTER
    last_known_private_addr { host: "yb-master-0.yb-masters.rdei-yb-zone2.svc.cluster.local" port: 7100 }
    last_known_broadcast_addr { host: "10.27.49.5" port: 7101 }
    cloud_info { placement_cloud: "rdei" placement_region: "region1" placement_zone: "zone2" }
  }
  peers {
    permanent_uuid: "1fe370c0712249e3a5cf54200875dc8a"
    member_type: VOTER
    last_known_private_addr { host: "yb-master-0.yb-masters.rdei-yb-zone3.svc.cluster.local" port: 7100 }
    last_known_broadcast_addr { host: "10.60.7.13" port: 7100 }
    cloud_info { placement_cloud: "rdei" placement_region: "region3" placement_zone: "zone3" }
  }
}

christiancadieux commented 1 year ago

with this config, I see this in the postgres log:

--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
--server_broadcast_addresses=10.27.49.5:7200 \
--webserver_interface=0.0.0.0 \
--enable_ysql=true \
--pgsql_proxy_bind_address=0.0.0.0:5433 \
--cql_proxy_bind_address=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local

[root@yb-tserver-0 logs]# tail postgresql-2023-02-19_050735.log

I0219 16:04:51.773411 20695 pg_client.cc:135] Session id 0: Session id acquired. Postgres backend pid: 20695
2023-02-19 16:04:51.773 UTC [20695] FATAL: Network error: recvmsg error: Connection refused
I0219 16:04:51.773923 20699 poller.cc:66] Poll stopped: Service unavailable (yb/rpc/scheduler.cc:80): Scheduler is shutting down (system error 108)
I0219 16:04:57.816169 20711 mem_tracker.cc:247] Overriding FLAGS_mem_tracker_tcmalloc_gc_release_bytes to 5242880
I0219 16:04:57.817169 20711 thread_pool.cc:167] Starting thread pool { name: pggate_ybclient max_workers: 1024 }
I0219 16:04:57.817795 20711 pg_client.cc:128] Using TServer host_port: 10.27.49.5:9100
I0219 16:04:57.818861 20716 tcp_stream.cc:322] { local: 192.168.104.213:35210 remote: 10.27.49.5:9100 }: Recv failed: Network error (yb/util/net/socket.cc:540): recvmsg error: Connection refused (system error 111)
I0219 16:04:57.819073 20711 pg_client.cc:135] Session id 0: Session id acquired. Postgres backend pid: 20711
2023-02-19 16:04:57.819 UTC [20711] FATAL: Network error: recvmsg error: Connection refused
I0219 16:04:57.819725 20713 poller.cc:66] Poll stopped: Service unavailable (yb/rpc/scheduler.cc:80): Scheduler is shutting down (system error 108)


#### configuration

version: 1
replication_info {
  live_replicas {
    num_replicas: 3
    placement_blocks {
      cloud_info {
        placement_cloud: "rdei"
        placement_region: "region1"
        placement_zone: "zone2"
      }
      min_num_replicas: 1
    }
    placement_blocks {
      cloud_info {
        placement_cloud: "rdei"
        placement_region: "region3"
        placement_zone: "zone3"
      }
      min_num_replicas: 1
    }
    placement_blocks {
      cloud_info {
        placement_cloud: "rdei"
        placement_region: "region1"
        placement_zone: "zone1"
      }
      min_num_replicas: 1
    }
  }
}
cluster_uuid: "e47afac2-111f-43d8-ac74-9c36ed945592"

christiancadieux commented 1 year ago

I also noticed that if I remove the '--server_broadcast_addresses=..' from the yb-tserver on a node in one of the 2 clusters, then I can exec into the pod and run ysqlsh (ysqlsh does not hang anymore), create a table, and insert a record.

in this case, the postgres log shows:

Using TServer host_port: yb-tserver-0.yb-tservers.rdei-yb-zone1.svc.cluster.local:9100       <<< WORKS

instead of the previous:

I0217 04:33:07.746546   507 pg_client.cc:125] Using TServer host_port: 10.27.49.5:9100   <<< DOES NOT
I0217 04:33:07.747642   512 tcp_stream.cc:322] { local: 192.168.106.50:44844 remote: 10.27.49.5:9100 }:

but if I try to do the same thing on the other cluster, then I can also run ysqlsh and list tables, but accessing data hangs.

errors

adding proxies should have worked, but I did get these errors:

christiancadieux commented 1 year ago

so to summarize,

zone2.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
zone2.yaml: --server_broadcast_addresses=10.27.49.5:7101 \
zone2.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
zone2.yaml: --server_broadcast_addresses=10.27.49.5:7201 \
zone2.yaml: --cql_proxy_bind_address=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local

zone3.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-masters.$(NAMESPACE).svc.cluster.local \
zone3.yaml: --server_broadcast_addresses=10.60.7.13:7100 \
zone3.yaml: --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local \
zone3.yaml: --server_broadcast_addresses=10.60.7.13:7200 \
zone3.yaml: --cql_proxy_bind_address=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local

- postgres cannot be accessed (from the CLI in the pod or otherwise) - seems to be a configuration issue in the YB code specific to postgres. I tried it different ways: with proxies, without proxies, without pgsql_proxy_bind_address, with pgsql_proxy_bind_address=headless-pod-address ...:

zone1.yaml: --pgsql_proxy_bind_address=0.0.0.0:5433 \
zone2.yaml: --pgsql_proxy_bind_address=0.0.0.0:5433 \
zone3.yaml: --pgsql_proxy_bind_address=0.0.0.0:5433 \

christiancadieux commented 1 year ago

I got it to work with extra headless services (without selector) instead of proxies, and these values:

my 2 main problems were:

TSERVER

--tserver_master_addrs=yb-master-0.yb-masters.rdei-yb-zone1.svc.cluster.local:7100,yb-master-0.yb-masters.rdei-yb-zone2.svc.cluster.local:7101,10.60.7.13:7100 \
--rpc_max_message_size=2000000000 \
--metric_node_name=$(HOSTNAME) \
--memory_limit_hard_bytes=3649044480 \
--stderrthreshold=0 \
--num_cpus=0 \
--undefok=num_cpus,enable_ysql \
--use_node_hostname_for_local_tserver=true \
--leader_failure_max_missed_heartbeat_periods="10" \
--placement_cloud="rdei" \
--placement_region="region1" \
--placement_zone="zone1" \
--use_private_ip="region" \
--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
--server_broadcast_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
--webserver_interface=0.0.0.0 \
--enable_ysql=true \
--pgsql_proxy_bind_address=0.0.0.0:5433 \
--cql_proxy_bind_address=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local


[multi.tar.gz](https://github.com/yugabyte/yugabyte-db/files/10809512/multi.tar.gz)
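
For reference, the "headless service without selector" trick I mean is roughly the following shape: in each cluster, create the namespace of the remote zone with a selector-less headless Service plus a manual Endpoints object, so that the remote pod's DNS name resolves to its external IP. All names, namespaces, ports and IPs below are only illustrative (borrowed from my zone2 example):

# illustrative sketch: created in cluster1 so that
# yb-tserver-0.yb-tservers.rdei-yb-zone2.svc.cluster.local resolves to the
# remote tserver's external IP
apiVersion: v1
kind: Service
metadata:
  name: yb-tservers
  namespace: rdei-yb-zone2
spec:
  clusterIP: None              # headless, and no selector
  ports:
    - name: tcp-rpc-port
      port: 7201
---
apiVersion: v1
kind: Endpoints
metadata:
  name: yb-tservers            # must match the Service name
  namespace: rdei-yb-zone2
subsets:
  - addresses:
      - ip: 10.27.49.5         # external IP of the remote tserver pod
        hostname: yb-tserver-0 # creates the per-pod DNS record
    ports:
      - name: tcp-rpc-port
        port: 7201
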
ddorian commented 1 year ago

@christiancadieux can we close this as completed?

christiancadieux commented 1 year ago

my solution involves creating 9 namespaces (instead of 3) and 12 extra services, so it's not a very good solution. I was hoping that my explanations would help understand the problem - but it's probably way too much detail. So here is a summary of the summaries: when I enter this in the tserver sts:

    --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
    --server_broadcast_addresses=10.27.49.5:7200 \    << external IP
    --pgsql_proxy_bind_address=0.0.0.0:5433 \

cassandra works, but postgres hangs on ysqlsh. And when I enter this instead:

    --rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
    --server_broadcast_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
    --pgsql_proxy_bind_address=0.0.0.0:5433 \

then ysqlsh does not hang anymore, but the data is not shared with nodes in other clusters, so ysqlsh on a different node can display tables (\d) but not the content of tables.

I don't know the code, but it looks like the address that is needed to connect to postgres locally (in this case $(HOSTNAME).yb-tservers.$(NAMESPACE)) cannot be used by other nodes to connect to this node; other nodes need to use '10.27.49.5:7200' (the external IP for that pod). That's why when I create extra headless services/namespaces on the other nodes to mimic the remote services, it all works (but it's a lot of overhead).

maybe a way to fix this (for postgres) would look like this:

     --server_broadcast_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \ 
     --server_broadcast_addresses_remote_access=10.27.49.5:7200   << share with other nodes

christiancadieux commented 1 year ago

I have a good example where it fails. I configured the tserver sts like this. Remember that I installed 3 namespaces in 3 kube clusters and named the namespaces differently: in cluster1, the namespace is rdei-yb-zone1, in cluster2, the namespace is rdei-yb-zone2, etc.

Here is one of them:

--rpc_bind_addresses=$(HOSTNAME).yb-tservers.$(NAMESPACE).svc.cluster.local:7200 \
--server_broadcast_addresses=10.27.49.5:7200 \     << that's the external IP
--webserver_interface=0.0.0.0 \
--enable_ysql=true \
--pgsql_proxy_bind_address=0.0.0.0:5433 \

10.27.49.5 is the external IP of this sts, but the log of that same tserver (tserver1) shows this:

W0301 16:06:05.145804   321 leader_election.cc:277] T 095974730df34c278093b95abec799a4 P b5ef4117647549e0ba2a4ab707324150 [CANDIDATE]: Term 7 pre-election: RPC error from VoteRequest() call to peer d2dcbe2021c046fab41957aefce0ca9d: Network error (yb/util/net/dns_resolver.cc:65): Resolve failed yb-tserver-0.yb-tservers.rdei-yb-zone2.svc.cluster.local: Host not found (authoritative)
W0301 16:06:05.145889   254 leader_election.cc:277] T 095974730df34c278093b95abec799a4 P b5ef4117647549e0ba2a4ab707324150 [CANDIDATE]: Term 7 pre-election: RPC error from VoteRequest() call to peer dbd4d6904a35429f87da2e787289e35e: Network error (yb/util/net/dns_resolver.cc:65): Resolve failed yb-tserver-0.yb-tservers.rdei-yb-zone3.svc.cluster.local: Host not found (authoritative)

as you can see, tserver2 and tserver3 shared their internal addresses with this tserver (tserver1), so now this tserver is trying to reach yb-tserver-0.yb-tservers.rdei-yb-zone2.svc.cluster.local, which of course does not exist in this cluster, since that's the internal address of tserver2.
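
For what it's worth, this is easy to confirm from inside the tserver1 pod, assuming a DNS lookup tool is available in the image - an illustrative check:

# fails from cluster1, because the rdei-yb-zone2 namespace/service only exists in cluster2
nslookup yb-tserver-0.yb-tservers.rdei-yb-zone2.svc.cluster.local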

that's I think the best example of this bug.