linkedin / rest.li

Rest.li is a REST+JSON framework for building robust, scalable service architectures using dynamic discovery and simple asynchronous APIs.

upgrade io.envoyproxy.controlplane module to 0.1.35 to fix compatibility issue #993

Closed brycezhongqing closed 7 months ago

brycezhongqing commented 7 months ago

Context

We found that some Samza MPs (sitespeed-samza-beam and samza-pem) exclude the io.envoyproxy.controlplane module, which causes INDIS xDS stream initialization to fail:

exclude group: 'io.envoyproxy.controlplane', module: 'api'

Analysis

The root cause is that the older io.envoyproxy.controlplane release shades an old version of opentelemetry-proto instead of declaring it as a dependency, and that shaded version is not compatible with the library the team uses now. The incompatibility makes the build fail because the ScopeMetrics class cannot be found:

> Task :samza-yarn-jobs:samza-pem-degradation-tracking:compileJava
/Users/sying/products/samza-pem/samza-yarn-jobs/samza-pem-degradation-tracking/src/main/java/com/linkedin/pem/functions/map/ProductAvailabilityPartialSessionScoreEventToOTelMetricsFn.java:226: error: cannot find symbol
        .addScopeMetrics(scopeMetrics)
        ^
  symbol:   method addScopeMetrics(io.opentelemetry.proto.metrics.v1.ScopeMetrics)
  location: class io.opentelemetry.proto.metrics.v1.ResourceMetrics.Builder

After investigation, we found that pegasus currently uses version 0.1.31 of io.envoyproxy.controlplane, while ScopeMetrics is only available starting with 0.1.35 (see the Javadoc).

Solution

Because the Samza MPs above do not use pegasus directly, the best practice is to upgrade the io.envoyproxy.controlplane version in pegasus and then bump the pegasus version in the container.
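
As a rough illustration of the version bump described above, a Gradle build could raise the resolved version with a dependency constraint. This is a minimal sketch only, not the actual pegasus build script: it assumes the `java` (or `java-library`) plugin is applied so that the `implementation` configuration exists, and it takes the module name `api` from the exclude rule quoted earlier.

```groovy
// Hypothetical build.gradle fragment; the real pegasus build may declare
// this dependency differently.
dependencies {
    constraints {
        // Require at least 0.1.35 whenever the module appears in the graph.
        implementation('io.envoyproxy.controlplane:api:0.1.35') {
            because 'older releases shade an opentelemetry-proto without ScopeMetrics'
        }
    }
}
```

A constraint only affects resolution when the module is actually present in the dependency graph, so a consumer that still excludes `io.envoyproxy.controlplane:api` would need to drop that exclusion as well.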

Local Test

https://lva1-app63607.corp.linkedin.com/s/x6fhh6h6o4amu/dependencies?dependencies=io.envoy&expandAll -> build succeeded.

INDIS success log:

2024-03-26 13:51:05.954 [Indis xDS client executor-4-1] []  XdsClientImpl [INFO] ADS stream started, connected to server: main.indis-registry-observer.ei-ltx1.atd.disco.linkedin.com:32123
2024-03-26 13:51:06.659 [Indis xDS client executor-4-1] []  XdsClientImpl [INFO] ADS stream ready, cancelled timeout task: true
2024-03-26 13:51:08.428 [Indis xDS client executor-13-1] []  XdsClientImpl [INFO] ADS stream started, connected to server: main.indis-registry-observer.ei-ltx1.atd.disco.linkedin.com:32123
2024-03-26 13:51:08.709 [Indis xDS client executor-13-1] []  XdsClientImpl [INFO] ADS stream ready, cancelled timeout task: true
2024-03-26 13:51:54.335 [Indis xDS client executor-22-1] []  XdsClientImpl [INFO] ADS stream started, connected to server: main.indis-registry-observer.ei-ltx1.atd.disco.linkedin.com:32123
2024-03-26 13:51:54.665 [Indis xDS client executor-22-1] []  XdsClientImpl [INFO] ADS stream ready, cancelled timeout task: true
2024-03-26 13:51:55.803 [Indis xDS client executor-31-1] []  XdsClientImpl [INFO] ADS stream started, connected to server: main.indis-registry-observer.ei-ltx1.atd.disco.linkedin.com:32123
2024-03-26 13:51:56.031 [Indis xDS client executor-31-1] []  XdsClientImpl [INFO] ADS stream ready, cancelled timeout task: true
2024-03-26 13:51:06.712 [Indis xDS client executor-4-1] []  XdsClientImpl [INFO] Successfully established stream with ADS server: ltx1-app6591.stg.linkedin.com

Samza job success log:

2024-03-26 14:01:32.524 [Samza StreamProcessor Container Thread-0] []  SubmitMetricsToAmfFn [INFO] Emit metrics to AMF
bohhyang commented 7 months ago

Now that the build passes, what about deployment and the INDIS read status after the fix? Do they all succeed?

brycezhongqing commented 7 months ago

Now that the build passes, what about deployment and the INDIS read status after the fix? Do they all succeed?

Yes. I updated the logs above: INDIS initialized successfully after the fix. I also checked with the service owner and got confirmation from the business side.