linkedin / rest.li

Rest.li is a REST+JSON framework for building robust, scalable service architectures using dynamic discovery and simple asynchronous APIs.

Add null guard and timeout process for INDIS response #975

Closed brycezhongqing closed 8 months ago

brycezhongqing commented 9 months ago

Context

This PR focuses on three points:

  1. Add a log to TogglingPublisher when it switches from one publisher to another.
  2. Add null guards for INDIS and ZooKeeper responses.
  3. Add timeout handling for INDIS xDS responses, and handle the corner case of resource removal.

Note: When the client receives a resource removal from the observer, it means INDIS has no data for that resource, so the client can read the service/cluster/URI properties directly from the cache.
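
The fallback described in the note, combined with the timeout from point 3, could look roughly like the sketch below. It is only an illustration under assumptions: `CachedFallbackReader`, its in-memory map, and the use of a `null` result to model a "removed" response are all hypothetical and are not taken from XdsClientImpl or any other rest.li class.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch; none of these names come from rest.li. */
final class CachedFallbackReader {
  // Last known good value per resource name (stand-in for the client-side cache).
  private final Map<String, String> cache = new ConcurrentHashMap<>();
  private final long timeoutMillis;

  CachedFallbackReader(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /**
   * Waits up to timeoutMillis for a response. A response that completes with
   * null models a "resource removed" message (an assumption of this sketch).
   * On removal, timeout, or failure, the cached value is returned if present.
   */
  Optional<String> read(String resourceName, CompletableFuture<String> response) {
    String fresh = null;
    try {
      fresh = response.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // No response in time: fall through to the cache.
    } catch (ExecutionException e) {
      // The request failed: fall through to the cache.
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // Preserve the interrupt status.
    }
    if (fresh != null) {
      cache.put(resourceName, fresh); // Null guard passed: refresh the cache.
      return Optional.of(fresh);
    }
    // Removed, timed out, or failed: serve the cached value, or empty if none.
    return Optional.ofNullable(cache.get(resourceName));
  }
}
```

In this sketch a resource that was never cached (for example a non-existent cluster) yields an empty result, which loosely corresponds to publishing null to the event bus in the logs below.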

Test

[✅] unit test [✅] regression test in toki

Step 1: Unzip services.zip into the following folder (clear the folder first):

/System/Volumes/Data/export/content/data/toki-war/i001/qei-ltx1/indis/file_store

Step 2: Build and deploy:

mint build && mint build-cfg -f qei-ltx1 && mint deploy -w toki-war -f qei-ltx1 

Step 3: As a regression check, confirm that the following key log lines appear in the log:

Subscribe NODE resource /d2/clusters/NonExistentCluster
Sending NODE request for resources: [/d2/clusters/NonExistentCluster]
Initializing NODE /d2/clusters/NonExistentCluster to empty data.
Received response that NODE /d2/clusters/NonExistentCluster was removed
Failed to parse D2 cluster properties from xDS update. Cluster name: NonExistentCluster, Publishing null to event bus

TogglingPublisher: activating publisher INDIS store, deactivating publisher Unknown store
Subscribe NODE resource /d2/services/dataVaultAclChanges
Sending NODE request for resources: [/d2/services/dataVaultAclChanges]
xDS WarmUp fetching service data for service: dataVaultAclChanges
xDS WarmUp completed warming up service adCampaignsV2 in 2ms, completed 238/238
xDS WarmUp completed warming up 238 services in 27735ms
xDS WarmUp completed
New load balancer successfully started

For symlink
2024/03/15 17:01:06.268 WARN [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItYSZGZmeiRnnSXw==] Received response that NODE /d2/uris/$EntitlementsBackendMaster was removed
2024/03/15 17:01:06.268 INFO [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItYSZGZmeiRnnSXw==] Initializing NODE /d2/uris/$EntitlementsBackendMaster to empty data.
2024/03/15 17:01:06.268 INFO [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItbDVpbBWVSv/UvA==] Subscribe NODE resource /d2/clusters/EntitlementsBackend-ei-ltx1
2024/03/15 17:01:06.268 INFO [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItbDVpbBWVSv/UvA==] Sending NODE request for resources: [/d2/clusters/EntitlementsBackend-ei-ltx1]
2024/03/15 17:01:06.269 INFO [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItcaazlQ6RvJMAxg==] Subscribe D2_URI_MAP resource /d2/uris/EntitlementsBackend-ei-ltx1
2024/03/15 17:01:06.269 INFO [XdsClientImpl] [Indis xDS client executor-4-1] [toki-war] [AAYTu9ItcaazlQ6RvJMAxg==] Sending D2_URI_MAP request for resources: [/d2/uris/EntitlementsBackend-ei-ltx1]
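
The Step 3 check can also be automated. The following is a minimal sketch, assuming the log is a plain-text file passed as the first argument; `KeyLogCheck` and its `missing` helper are hypothetical names, and the expected snippets are copied from the log lines listed above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for the regression check; not part of rest.li. */
final class KeyLogCheck {
  /** Returns the expected snippets that appear in none of the log lines. */
  static List<String> missing(List<String> logLines, List<String> expected) {
    List<String> notFound = new ArrayList<>();
    for (String snippet : expected) {
      boolean found = false;
      for (String line : logLines) {
        if (line.contains(snippet)) {
          found = true;
          break;
        }
      }
      if (!found) {
        notFound.add(snippet);
      }
    }
    return notFound;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.err.println("usage: java KeyLogCheck <path-to-log>");
      return;
    }
    // A subset of the key log lines from the PR description.
    List<String> expected = Arrays.asList(
        "Subscribe NODE resource /d2/clusters/NonExistentCluster",
        "Received response that NODE /d2/clusters/NonExistentCluster was removed",
        "xDS WarmUp completed",
        "New load balancer successfully started");
    List<String> notFound = missing(Files.readAllLines(Paths.get(args[0])), expected);
    if (notFound.isEmpty()) {
      System.out.println("All key log lines found.");
    } else {
      System.out.println("Missing key log lines: " + notFound);
    }
  }
}
```

Matching on substrings rather than whole lines keeps the check independent of the timestamp, thread name, and request ID prefixes seen in the symlink log lines.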

toki-war.log

The related clusters could be retrieved:

image
bohhyang commented 8 months ago

Please rebase the master branch for latest changes in XdsClientImpl and XdsToD2PropertiesAdaptor.

bohhyang commented 8 months ago

Btw, could you please change these two dual-read error-level logs to warn level? They don't need to be error level, which creates exception tickets for the owners. https://jarvis.corp.linkedin.com/codesearch/result/?name=DualReadLoadBalancer.java&path=rest.li%2Fd2%2Fsrc%2Fmain%2Fjava%2Fcom%2Flinkedin%2Fd2%2Fbalancer%2Fdualread&reponame=linkedin%2Frest.li#165

bohhyang commented 8 months ago

Never mind, I'm changing this along with some other changes.

bohhyang commented 8 months ago

The client requests a non-existent cluster: /d2/clusters/NonExistentCluster. With your change, the log should show a message like "Received response that Cluster NonExistentCluster was removed", but it's not in the attached log file.

brycezhongqing commented 8 months ago

Received response that Cluster NonExistentCluster was removed

Hi @bohhyang, after rebasing toki, it works now. I've updated the log.