Open Anlet opened 3 years ago
Printed error log:
Command: java -Dregistry.integration.home=/usr/local/software/registry-integration -Dspring.config.location=/usr/local/software/registry-integration/conf/application.properties -Duser.home=/usr/local/software/registry-integration -server -Xms512m -Xmx512m -Xmn256m -Xss256k -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/usr/local/software/registry-integration/logs/registry-integration-gc.log -verbose:gc -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/usr/local/software/registry-integration/logs -XX:ErrorFile=/usr/local/software/registry-integration/logs/registry-integration-hs_err_pid%p.log -XX:-OmitStackTraceInFastThrow -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=4 -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -jar /usr/local/software/registry-integration/registry-integration.jar --logging.config=/usr/local/software/registry-integration/conf/logback-spring.xml
Sofa-Middleware-Log SLF4J : Actual binding is of type [ com.alipay.remoting Logback ]
[2021-04-01 17:49:44,219][INFO][main][MetaServerBootstrap] - the configuration items are as follows: com.alipay.sofa.registry.server.meta.bootstrap.MetaServerConfigBean@7eac9008[
sessionServerPort=9610
dataServerPort=9611
metaServerPort=9612
httpServerPort=9615
schedulerHeartbeatTimeout=3
schedulerHeartbeatFirstDelay=3
schedulerHeartbeatExpBackOffBound=10
schedulerGetDataChangeTimeout=5
schedulerGetDataChangeFirstDelay=5
schedulerGetDataChangeExpBackOffBound=5
schedulerConnectMetaServerTimeout=3
schedulerConnectMetaServerFirstDelay=3
schedulerConnectMetaServerExpBackOffBound=10
schedulerCheckNodeListChangePushTimeout=3
schedulerCheckNodeListChangePushFirstDelay=1
schedulerCheckNodeListChangePushExpBackOffBound=10
dataNodeExchangeTimeout=3000
sessionNodeExchangeTimeout=3000
metaNodeExchangeTimeout=3000
dataCenterChangeNotifyTaskRetryTimes=3
dataNodeChangePushTaskRetryTimes=1
getDataCenterChangeListTaskRetryTimes=3
receiveStatusConfirmNotifyTaskRetryTimes=3
sessionNodeChangePushTaskRetryTimes=3
enableMetrics=true
decisionMode=
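After digging through the source code and a lot of searching online, I added a NIC binding (JAVA_OPTS="$JAVA_OPTS -Dnetwork_interface_binding=eth0") to the startup configuration of the registry (SOFARegistry). It finally started normally, and the checks on all three ports passed. Delighted, I filled in the remote registry address and started my local project, but accessing the URL returned an error: no service was obtained. The log (common-error.log) showed that my requests kept going to the server's internal network address. The local project's configuration file contained no mistakes, which was puzzling. After asking around, I learned that the registry has to be on the same network segment as the client. In other words, deploying the registry (SOFARegistry) on a remote server and pointing a local project at that remote registry address does not currently work (v5.4.2).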
Good question, thanks to @Anlet
Let me explain the issue so that others can follow it:
Q1: The registry fails to start due to an address issue
A: When a single machine has multiple NICs, SOFARegistry lets the user choose which NIC's IP address is used as the main address.
So setting a system property, either on the command line with java -Dnetwork_interface_binding=eth0
or from Java code with System.setProperty(),
is an effective way to deal with this problem.
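For readers who want to check which address a given interface name would yield before starting the registry, here is a minimal sketch using only the JDK. It is not SOFARegistry's own selection logic; the class name NicBindingSketch and the helper ipv4AddressesOf are hypothetical, and only the property name network_interface_binding comes from this thread. If the property is set from code, it presumably has to be set before the registry bootstrap reads it.

```java
import java.io.UncheckedIOException;
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of SOFARegistry.
public class NicBindingSketch {

    /** Returns the non-loopback IPv4 addresses of the named interface (empty if none or unknown). */
    static List<String> ipv4AddressesOf(String nicName) {
        List<String> result = new ArrayList<>();
        try {
            NetworkInterface nic = NetworkInterface.getByName(nicName);
            if (nic == null) {
                return result; // no interface with that name on this machine
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address && !addr.isLoopbackAddress()) {
                    result.add(addr.getHostAddress());
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Property name taken from this thread; "eth0" is only an example value.
        System.setProperty("network_interface_binding", "eth0");
        String nic = System.getProperty("network_interface_binding");
        System.out.println(nic + " -> " + ipv4AddressesOf(nic));
    }
}
```

An empty list for the chosen name is a hint that the name is misspelled or that the interface carries no usable IPv4 address.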
Q2: The registry client fails to connect to the registry center
A: SOFARegistry is designed as shown below:
The problem is in step 2: the session server reports a loopback address to the client while the client sits outside the machine (say, in a cloud environment, the client is on machine ECS-1 while SOFARegistry runs in standalone mode on machine ECS-2 and uses 127.0.0.1 as the session's address), so the client cannot reach it.
So, for question 2, we'd like to provide a mechanism that lets the registry client receive a well-defined address (through a -D parameter or a config file, either way is OK) as the registry center's address. In the case above, @Anlet could declare, say, 10.0.0.1:9622,10.0.0.2:9622 as the registry session's address list.
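To make the proposal concrete, here is a rough sketch of how a client could read such an address list from a -D parameter. This is an illustration only, not an existing SOFARegistry client API: the property name registry.session.addresses and the class SessionAddressListSketch are hypothetical, and the comma-separated host:port format is assumed from the example above.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed feature, not an existing client API.
public class SessionAddressListSketch {

    // Hypothetical property name, used for illustration only.
    static final String PROPERTY = "registry.session.addresses";

    /** Parses "host1:port1,host2:port2" into unresolved socket addresses. */
    static List<InetSocketAddress> parse(String raw) {
        List<InetSocketAddress> out = new ArrayList<>();
        if (raw == null || raw.trim().isEmpty()) {
            return out;
        }
        for (String entry : raw.split(",")) {
            String item = entry.trim();
            if (item.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = item.lastIndexOf(':');
            if (colon <= 0 || colon == item.length() - 1) {
                throw new IllegalArgumentException("expected host:port but got '" + item + "'");
            }
            String host = item.substring(0, colon);
            int port = Integer.parseInt(item.substring(colon + 1));
            // createUnresolved avoids a DNS lookup at parse time.
            out.add(InetSocketAddress.createUnresolved(host, port));
        }
        return out;
    }

    public static void main(String[] args) {
        // Falls back to the example list from this thread when the property is unset.
        String raw = System.getProperty(PROPERTY, "10.0.0.1:9622,10.0.0.2:9622");
        for (InetSocketAddress address : parse(raw)) {
            System.out.println(address.getHostString() + " port " + address.getPort());
        }
    }
}
```

With this shape, a client started with -Dregistry.session.addresses=10.0.0.1:9622,10.0.0.2:9622 would use the declared addresses instead of whatever address the session server reports about itself.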
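@Anlet I replied above so that others can also follow how the problem unfolded. The solution we currently propose is that the client can specify the address of the session server (that is, the entry point of the registry) through a -D parameter or a config file.
Would you be interested in working on this feature together? @Anlet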
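Following the method above (setting the NIC, which was the only change I made), the startup log shows no errors, but health/check still reports a problem (current version: version_5.4.5):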
Last login: Sun Oct 31 14:44:29 2021 from 10.0.2.2
[vagrant@localhost ~]$ netstat -anp|grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::9610 :::* LISTEN 5953/java
tcp6 0 0 :::9611 :::* LISTEN 5953/java
tcp6 0 0 :::9612 :::* LISTEN 5953/java
tcp6 0 0 :::9615 :::* LISTEN 5953/java
tcp6 0 0 127.0.0.1:34864 127.0.0.1:9615 ESTABLISHED 5953/java
tcp6 0 0 127.0.0.1:9615 127.0.0.1:34864 ESTABLISHED 5953/java
unix 2 [ ] STREAM CONNECTED 33475 5953/java
unix 2 [ ] STREAM CONNECTED 33479 5953/java
[vagrant@localhost ~]$ curl http://localhost:9610/health/check
curl: (56) Recv failure: Connection reset by peer
[vagrant@localhost ~]$ curl http://localhost:9611/health/check
curl: (56) Recv failure: Connection reset by peer
[vagrant@localhost ~]$ curl http://localhost:9612/health/check
curl: (56) Recv failure: Connection reset by peer
[vagrant@localhost ~]$ curl http://localhost:9615/health/check
{"success":false,"message":"MetaServerBoot sessionRegisterServer:true, dataRegisterServerStart:true, otherMetaRegisterServerStart:true, httpServerStart:true, raftServerStart:false, raftClientStart:true, raftManagerStart:false, raftStatus:false"}[vagrant@localhost ~]$
The NIC information is as follows:
[vagrant@localhost ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:4d:77:d3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
valid_lft 81337sec preferred_lft 81337sec
inet6 fe80::5054:ff:fe4d:77d3/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:a9:97:49 brd ff:ff:ff:ff:ff:ff
inet 192.168.33.10/24 brd 192.168.33.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fea9:9749/64 scope link
valid_lft forever preferred_lft forever
The change to the startup script is:
# set net
JAVA_OPTS="$JAVA_OPTS -Dnetwork_interface_binding=eth1"
This is because the network is configured through Vagrant:
# Create a private network, which allows host-only access to the machine
# using a specific IP.
config.vm.network "private_network", ip: "192.168.33.10"
I don't understand what
curl: (56) Recv failure: Connection reset by peer
and
[vagrant@localhost ~]$ curl http://localhost:9615/health/check
{"success":false,"message":"MetaServerBoot sessionRegisterServer:true, dataRegisterServerStart:true, otherMetaRegisterServerStart:true, httpServerStart:true, raftServerStart:false, raftClientStart:true, raftManagerStart:false, raftStatus:false"}
mean, or why the health check returns false.
@NickNYU could you please take a look?
Startup log:
[vagrant@localhost ~]$ curl http://localhost:9615/health/check {"success":false,"message":"MetaServerBoot sessionRegisterServer:true, dataRegisterServerStart:true, otherMetaRegisterServerStart:true, httpServerStart:true, raftServerStart:false, raftClientStart:true, raftManagerStart:false, raftStatus:false"}
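As a reading aid for responses like the one above, here is a small sketch that lists which flags in the message field are false. It assumes the message has the shape seen in this thread (a component name followed by comma-separated key:value pairs); the class HealthMessageSketch is hypothetical and not part of SOFARegistry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reading aid, not part of SOFARegistry.
public class HealthMessageSketch {

    /** Returns the keys whose value is exactly "false" in a "Name k1:v1, k2:v2" message. */
    static List<String> falseFlags(String message) {
        List<String> failed = new ArrayList<>();
        // Drop the leading component name (e.g. "MetaServerBoot") if present.
        int firstSpace = message.indexOf(' ');
        String flags = firstSpace < 0 ? message : message.substring(firstSpace + 1);
        for (String pair : flags.split(",")) {
            String[] keyValue = pair.trim().split(":", 2);
            if (keyValue.length == 2 && "false".equals(keyValue[1].trim())) {
                failed.add(keyValue[0].trim());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Message copied from the health check output in this thread.
        String message = "MetaServerBoot sessionRegisterServer:true, dataRegisterServerStart:true, "
                + "otherMetaRegisterServerStart:true, httpServerStart:true, raftServerStart:false, "
                + "raftClientStart:true, raftManagerStart:false, raftStatus:false";
        System.out.println(falseFlags(message));
        // prints [raftServerStart, raftManagerStart, raftStatus]
    }
}
```

For the output above, the false flags are all Raft-related, which suggests the Raft server side of the meta role did not come up. The sketch only reports boolean flags; a non-boolean value such as status:INITIAL is not flagged.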
Describe the bug
When the registry is deployed on the server side with SOFARegistry v5.4.2, port 9622 opens successfully but the health check fails.
Expected behavior
Check the health check endpoint of the meta role:
$ curl http://localhost:9615/health/check
{"success":true,"message":"... raftStatus:Leader"}
Check the health check endpoint of the data role:
$ curl http://localhost:9622/health/check
{"success":true,"message":"... status:WORKING"}
Check the health check endpoint of the session role:
$ curl http://localhost:9603/health/check
{"success":true,"message":"..."}
Actual behavior
Check the health check endpoint of the meta role:
$ curl http://localhost:9615/health/check
{"success":true,"message":"... raftStatus:Leader"}
Check the health check endpoint of the data role:
$ curl http://localhost:9622/health/check
{"success":false,"message":"DataServerBoot severForSession:true, severForDataSync:true, httpServer:true, schedulerStarted:true, status:INITIAL"}
Check the health check endpoint of the session role:
$ curl http://localhost:9603/health/check
curl: (7) Failed to connect to localhost port 9603: Connection refused
Steps to reproduce
Minimal yet complete reproducer code (or GitHub URL to code)
Environment
SOFARegistry version: v5.4.2
JVM version (e.g. java -version):
OS version (e.g. uname -a):
Maven version:
IDE version: