dianping / cat

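CAT is a foundational component for server-side projects. It provides clients for Java, C/C++, Node.js, Python, Go, and other languages, and is deeply integrated into Meituan-Dianping's infrastructure middleware (MVC framework, RPC framework, database framework, cache framework, message queue, configuration system, and so on). It gives every Meituan-Dianping business line rich performance metrics, health status, and real-time alerting.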
Apache License 2.0

Client routing configuration does not balance load according to the configured weights #2226

Open tjzheng1002 opened 2 years ago

tjzheng1002 commented 2 years ago

I assigned weights for the project bigger-api-service, expecting most of its messages to be stored on 192.168.91.58. In practice, messages are stored only on the first machine, 192.168.91.57.

Client log:

[06-14 11:39:21.270] [INFO] [TcpSocketSender] router config changed :192.168.91.57:2280;192.168.91.58:2280;
[06-14 11:39:21.270] [INFO] [TcpSocketSender] start connect server/192.168.91.57:2280
[06-14 11:39:21.277] [INFO] [TcpSocketSender] Connected to CAT server at /192.168.91.57:2280
[06-14 11:39:21.277] [INFO] [TcpSocketSender] success when init CAT server, new active holderactive future :/192.168.91.57:2280 index:0 ip:192.168.91.57 server config:192.168.91.57:2280;192.168.91.58:2280;
[06-14 11:39:21.277] [INFO] [TcpSocketSender] close channel /192.168.91.58:2280
[06-14 11:39:21.278] [INFO] [TcpSocketSender] switch active channel to active future :/192.168.91.57:2280 index:0 ip:192.168.91.57 server config:192.168.91.57:2280;192.168.91.58:2280;

Client routing configuration:

<?xml version="1.0" encoding="utf-8"?>
<router-config backup-server="192.168.91.58" backup-server-port="2280">
   <default-server id="192.168.91.57" weight="1.0" port="2280" enable="true"/>
   <default-server id="192.168.91.58" weight="1.0" port="2280" enable="true"/>
   <network-policy id="default" title="default" block="false" server-group="default_group">
   </network-policy>
   <server-group id="default_group" title="default-group">
      <group-server id="192.168.91.57"/>
      <group-server id="192.168.91.58"/>
   </server-group>
   <domain id="cat">
      <group id="default">
         <server id="192.168.91.57" port="2280" weight="1.0"/>
         <server id="192.168.91.58" port="2280" weight="1.0"/>
      </group>
   </domain>
   <domain id="bigger-api-service">
      <group id="default">
         <server id="192.168.91.57" port="2280" weight="1.0"/>
         <server id="192.168.91.58" port="2280" weight="4.0"/>
      </group>
   </domain>
</router-config>
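
To make the expectation concrete, here is a minimal sketch of the traffic split the weights above would imply if they were applied proportionally. It only does the arithmetic on the config; it is not how CAT reads this file. The function name `expected_shares` is made up for illustration, and the config is trimmed to the one domain in question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the router config above: only the domain under discussion.
ROUTER_CONFIG = """<router-config backup-server="192.168.91.58" backup-server-port="2280">
   <domain id="bigger-api-service">
      <group id="default">
         <server id="192.168.91.57" port="2280" weight="1.0"/>
         <server id="192.168.91.58" port="2280" weight="4.0"/>
      </group>
   </domain>
</router-config>"""


def expected_shares(config_xml, domain, group="default"):
    """Return {server id: fraction of traffic} if weights were applied proportionally.

    Hypothetical helper for illustration only, not part of CAT.
    """
    root = ET.fromstring(config_xml)
    servers = root.findall(f"./domain[@id='{domain}']/group[@id='{group}']/server")
    weights = {s.get("id"): float(s.get("weight")) for s in servers}
    total = sum(weights.values())
    return {server_id: weight / total for server_id, weight in weights.items()}


shares = expected_shares(ROUTER_CONFIG, "bigger-api-service")
```

With weights 1.0 and 4.0 the proportional split is 20% for 192.168.91.57 and 80% for 192.168.91.58, which is the behaviour the report expected and did not observe.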
qmwu2000 commented 2 years ago

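CAT's default routing policy is static: order determines priority. The client tries the routing addresses one after another until a connection succeeds.

As long as the first server is available, the client keeps using it.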
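
The behaviour described here can be sketched roughly as follows. This is an illustration of ordered failover, not the actual `TcpSocketSender` code; `parse_routers`, `pick_active`, and the `is_reachable` callback are hypothetical names standing in for the real connection attempt.

```python
def parse_routers(routers):
    """Split a 'host:port;host:port;' string into a list of (host, port) tuples."""
    servers = []
    for entry in routers.split(";"):
        if entry:  # skip the empty piece after the trailing ';'
            host, port = entry.rsplit(":", 1)
            servers.append((host, int(port)))
    return servers


def pick_active(servers, is_reachable):
    """Return (index, server) for the first reachable server, or None.

    Weights play no part: position in the list is the only priority.
    """
    for index, server in enumerate(servers):
        if is_reachable(server):
            return index, server
    return None


servers = parse_routers("192.168.91.57:2280;192.168.91.58:2280;")
active = pick_active(servers, lambda server: True)  # pretend both servers are up
```

When both servers are reachable the result is index 0, 192.168.91.57, which matches the `index:0` in the client log. The second server is only chosen if the first one cannot be reached.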


tjzheng1002 commented 2 years ago

Hello, how can I change the default routing policy so that the weight configuration takes effect? @qmwu2000


qmwu2000 commented 2 years ago

The method com.dianping.cat.system.page.router#buildRouterInfo needs to be adjusted. It can generate different server endpoints based on domain & ip.
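
One way such an adjustment could look is sketched below. It is an assumption about a possible approach, not the existing `buildRouterInfo` logic: the server hashes the client ip to a stable point on the weight range and puts the matching server first, so clients of one domain spread over the servers roughly in proportion to the weights while each client keeps a stable route. The name `build_routers` and the use of MD5 as the hash are illustrative choices.

```python
import hashlib


def build_routers(servers, client_ip):
    """Build a 'host:port;' routers string whose first entry is weight-selected.

    servers: list of (host, port, weight) tuples for one domain.
    Hypothetical sketch, not CAT's buildRouterInfo.
    """
    total = sum(weight for _, _, weight in servers)
    # Map the client ip to a stable point in [0, total].
    digest = hashlib.md5(client_ip.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64 * total

    primary = len(servers) - 1  # fallback if rounding pushes point to total
    cumulative = 0.0
    for index, (_, _, weight) in enumerate(servers):
        cumulative += weight
        if point < cumulative:
            primary = index
            break

    # Chosen server first, the rest kept in order as failover targets.
    ordered = [servers[primary]] + servers[:primary] + servers[primary + 1:]
    return "".join(f"{host}:{port};" for host, port, _ in ordered)


servers = [("192.168.91.57", 2280, 1.0), ("192.168.91.58", 2280, 4.0)]
routers = build_routers(servers, "192.168.166.50")
```

Because the client simply connects to the first address it receives, reordering the list on the server is enough to shift load, and the client code stays unchanged.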


tjzheng1002 commented 2 years ago

Hello, will modifying buildRouterInfo change the routing result that is returned? For example, the response I captured from the router?domain=bigger-api-service&ip=192.168.166.50&op=xml endpoint has a routers property. Should the weight information be put into that property, so that the CAT client's TcpSocketSender can then balance load according to the returned weights? @qmwu2000

Response from router?domain=bigger-api-service&ip=192.168.166.50&op=xml:

<?xml version="1.0" encoding="utf-8"?>
<property-config>
   <property id="startTransactionTypes" value="Cache.;Squirrel."/>
   <property id="matchTransactionTypes" value="SQL"/>
   <property id="block" value="false"/>
   <property id="routers" value="192.168.91.57:2280;192.168.91.58:2280;"/>
   <property id="sample" value="1.0"/>
</property-config>

tjzheng1002 commented 2 years ago

As I understand it, you mean the load balancing should be done on the server side, and the client will automatically refresh the server addresses it reports to. @qmwu2000
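
For reference, the router response quoted in this exchange can be read like this. The snippet is only a sketch for inspecting the captured XML; `read_properties` is a hypothetical helper, not a CAT client API.

```python
import xml.etree.ElementTree as ET

# The response captured from the router endpoint, as quoted in the thread.
RESPONSE = b"""<?xml version="1.0" encoding="utf-8"?>
<property-config>
   <property id="startTransactionTypes" value="Cache.;Squirrel."/>
   <property id="matchTransactionTypes" value="SQL"/>
   <property id="block" value="false"/>
   <property id="routers" value="192.168.91.57:2280;192.168.91.58:2280;"/>
   <property id="sample" value="1.0"/>
</property-config>"""


def read_properties(xml_bytes):
    """Return the <property> elements as an {id: value} dict."""
    root = ET.fromstring(xml_bytes)
    return {prop.get("id"): prop.get("value") for prop in root.findall("property")}


props = read_properties(RESPONSE)
# The routers value is an ordered list of host:port entries and nothing else.
routers = [entry for entry in props["routers"].split(";") if entry]
```

The routers property carries only an ordered address list with no weights, so any weighting has to be expressed through the order the server chooses.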

qmwu2000 commented 2 years ago

Yes. It makes more sense to implement the policy on the server side and keep the client as simple as possible.


shouhualin commented 2 years ago


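It is not fixed. The server computes a hash of the domain, takes the remainder, and returns two addresses to the client. In other words, the same domain always gets the same route, which to some extent avoids one domain's data being sent to several servers.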
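
A rough sketch of the hash-and-remainder selection described here is given below. The thread does not say which hash function CAT uses or what the remainder is taken against, so CRC32 and the server count are stand-in assumptions, and `routers_for_domain` is a made-up name.

```python
import zlib


def routers_for_domain(domain, servers, count=2):
    """Pick `count` servers for a domain, starting at hash(domain) % len(servers).

    CRC32 is a stand-in: the thread does not say which hash CAT uses.
    """
    start = zlib.crc32(domain.encode("utf-8")) % len(servers)
    return [servers[(start + offset) % len(servers)] for offset in range(count)]


servers = ["192.168.91.57:2280", "192.168.91.58:2280"]
first = routers_for_domain("bigger-api-service", servers)
second = routers_for_domain("bigger-api-service", servers)
```

Repeated calls for the same domain return the same ordered pair, so one domain's data stays on one primary server. How evenly load is spread then depends on how domains and their traffic volumes happen to hash.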

tjzheng1002 commented 2 years ago

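@shouhualin This leaves message consumption and storage load unbalanced across machines. It shows up directly as uneven CPU, memory, disk, and bandwidth usage on each CAT server, and that is what we see in production.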
