linkedin / rest.li

Rest.li is a REST+JSON framework for building robust, scalable service architectures using dynamic discovery and simple asynchronous APIs.
rest.li
Other
2.51k stars 546 forks source link

Dynamically switch jmx/sensor names based on dual read mode and source type #926

Closed bohhyang closed 1 year ago

bohhyang commented 1 year ago

Background

Currently Jmx/sensors (load balancer, load balancer state, file stores, clusterInfo, serviceProperties, LoadBalancerStrategy, etc) are having the same name in ZK and xDS read flow. The current code intended to give D2ClientJmxManager.java a different prefix in xDS flow ("-xDS"), which had a bug in container and wasn't working. Even if worked, it would still break the jmx/sensors because depending on the dynamically changing dual read mode value, the primary source (ZK in OLD_LB_ONLY and DUAL_READ modes; xDS in NEW_LB_ONLY mode) should register the jmx/sensor names same as today, so that users can still monitor the same metrics. And, the secondary source should register different names to avoid conflicting with the primary.

Changes

  1. added watchers to dual read mode and dynamically register jmx/sensor names for primary/secondary sources under the new dual read mode.
  2. Include the source type name ("-xDS", "-ZK") in the secondary source's jmx/sensor names.
  3. bumped major version. Will create a container PR to include this change to container. Since the previous 29.43.xxx versions will have broken Jmx/sensor names if switched to observer-only read mode, I'll deprecate those versions once this is released.

Test Done

  1. added unit tests for add/remove watchers and registering jmx/sensor names.
  2. QEI deployment with Toki and switch dual read modes with lix: Under dual read or zk-only modes, the ZK source is primary, and xDS source registered the names with "-xDS": Screenshot 2023-08-02 at 5 24 03 PM

Under observer-only mode, app log shows the primary names are unregistered and re-registered (done by xDS source), and the "-ZK" names are registered (by ZK source): Screenshot 2023-08-06 at 4 58 19 PM , and the jmx/sensor names also show the same: Screenshot 2023-08-06 at 5 25 08 PM

Note: the "-xDS" names are left there (not unregistered), since they won't hurt during the dual read testing and will be cleaned when the app adopts xDS load balancer (by config, so dual read load balancer is not used at all) and redeploys.

bohhyang commented 1 year ago

discussed with @PapaCharlie over slack, it's better to separate out the watchers for different property types into separate maps or member vars, so it's less error-prone for future devs to add watchers for new types of properties. I did so and also moved the watcher management logic to a separate manager class for cleanlyness.