apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0
3.65k stars 598 forks

heron-tracker supports two APIs #3817

Closed: thinker0 closed this issue 2 years ago

thinker0 commented 2 years ago

Describe the bug

HealthManager needs the tracker API to stay backward compatible between the old version and the new version. The metrics timeline route changed between releases:

# version-0.20.4
@router.get("/metricstimeline", response_model=metricstimeline.MetricsTimeline)
# version-0.20.5
@router.get("/metrics/timeline", response_model=metricstimeline.MetricsTimeline)

To Reproduce

Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior

A topology on the lower version (0.20.4) should be able to call the tracker of the higher version (0.20.5). I run both versions, and currently the HealthManager dies when a lower-version topology calls the higher-version tracker.

Screenshots

[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_23_kafka-publisher_323   
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_15_kafka-publisher_315   
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_8_kafka-publisher_308   
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_25_kafka-publisher_325   
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_5_kafka-publisher_305  
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.sensors.TrackerMetricsProvider: Did not get any metrics from tracker for __stmgr__:__time_spent_back_pressure_by_compid/container_7_kafka-publisher_307   
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.common.network.SocketChannelHelper: Forcing to flush data to socket with best effort.  
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.common.network.HeronClient: To stop the HeronClient.  
[2022-04-12 21:43:52 +0900] [INFO] org.apache.heron.healthmgr.HealthManagerMetrics: SimpleMetricsManagerClient exits  
[2022-04-12 21:43:52 +0900] [STDERR] stderr: Exception in thread "main"   
[2022-04-12 21:43:52 +0900] [STDERR] stderr: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: json string can not be null or empty
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at org.apache.heron.healthmgr.HealthManager.main(HealthManager.java:242)
[2022-04-12 21:43:52 +0900] [STDERR] stderr: Caused by: java.lang.IllegalArgumentException: json string can not be null or empty
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:386)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.jayway.jsonpath.internal.JsonContext.parse(JsonContext.java:81)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.jayway.jsonpath.JsonPath.parse(JsonPath.java:596)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at org.apache.heron.healthmgr.sensors.TrackerMetricsProvider.parse(TrackerMetricsProvider.java:94)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at org.apache.heron.healthmgr.sensors.TrackerMetricsProvider.getMeasurements(TrackerMetricsProvider.java:81)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.microsoft.dhalion.api.MetricsProvider.getMeasurements(MetricsProvider.java:59)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at org.apache.heron.healthmgr.sensors.BackPressureSensor.fetch(BackPressureSensor.java:83)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.microsoft.dhalion.policy.HealthPolicyImpl.executeSensors(HealthPolicyImpl.java:115)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at com.microsoft.dhalion.policy.PoliciesExecutor.lambda$start$2(PoliciesExecutor.java:81)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[2022-04-12 21:43:52 +0900] [STDERR] stderr:    at java.base/java.lang.Thread.run(Thread.java:833)

Operating System

Additional context

This would be resolved if the tracker supported both API paths.
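
A minimal sketch of what serving both paths could look like, assuming the goal is simply to register the 0.20.4 path `/metricstimeline` and the 0.20.5 path `/metrics/timeline` against one handler. The `Router` class below is a hypothetical stdlib stand-in written for illustration, not the tracker's actual code, and the handler body is a placeholder. In a decorator-based framework whose `get()` decorator returns the handler unchanged, the same stacking pattern should apply to the real route.

```python
from typing import Callable, Dict


class Router:
    """Hypothetical stand-in for a web framework router (not heron-tracker code)."""

    def __init__(self) -> None:
        self.routes: Dict[str, Callable[..., dict]] = {}

    def get(self, path: str) -> Callable:
        def register(handler: Callable[..., dict]) -> Callable[..., dict]:
            self.routes[path] = handler
            # Return the handler unchanged so several decorators can be stacked.
            return handler
        return register

    def dispatch(self, path: str, **params) -> dict:
        handler = self.routes.get(path)
        if handler is None:
            return {"status": 404, "path": path}
        return handler(**params)


router = Router()


@router.get("/metricstimeline")    # path requested by version 0.20.4 clients
@router.get("/metrics/timeline")   # path requested by version 0.20.5 clients
def get_metrics_timeline(topology: str) -> dict:
    # Placeholder body (assumption): the real handler returns a MetricsTimeline.
    return {"status": 200, "topology": topology, "timeline": {}}
```

With both paths registered, `router.dispatch("/metricstimeline", topology="t")` and `router.dispatch("/metrics/timeline", topology="t")` reach the same function, so an older HealthManager and a newer one would receive the same response shape.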