apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0

Added to show the number of instances in the topology list UI. #3831

Status: Closed (thinker0 closed 2 years ago)

thinker0 commented 2 years ago

Added to show the number of instances in the topology list UI.

It would be nice to have this in the topology list. It helps to see at a glance the status of each topology and how many instances it is using.

nicknezis commented 2 years ago

I might have missed something, but got the following error when attempting a local run.

Installed locally: bazel run -- scripts/packages:heron-install.sh --user

I ran heron-tracker and heron-ui.

Ran a local topology: heron submit local ~/.heron/examples/heron-api-examples.jar org.apache.heron.examples.api.AckingTopology acking

Saw this error in the heron-tracker.

[2022-05-14 22:03:39 -0400] [INFO]: Adding new topology: acking, state_manager: local
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/statemgrs/src/python/filestatemanager.py", line 133, in monitor
    trigger_watches_based_on_files(
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/statemgrs/src/python/filestatemanager.py", line 112, in trigger_watches_based_on_files
    callback(proto_object)
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/tools/tracker/src/python/topology.py", line 626, in set_execution_state
    self._update(execution_state=execution_state)
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/tools/tracker/src/python/topology.py", line 609, in _update
    info = self._rebuild_info(t_state)
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/tools/tracker/src/python/topology.py", line 278, in _rebuild_info
    metadata=self._build_metadata(topology, physical_plan, execution_state, tracker_config),
  File "/Users/nnezis/.pex/unzipped_pexes/bca9046b34d3961d3cc495fabd3817f4243df382/heron/tools/tracker/src/python/topology.py", line 395, in _build_metadata
    "instances": len(physical_plan.instances),
AttributeError: 'NoneType' object has no attribute 'instances'

Is it possible that I'm not running the right binary of heron-tracker? I'll try to verify what I did, but wanted to post just in case before this got merged in.

nicknezis commented 2 years ago

I pulled the latest from the branch and it runs better. I see the instance count on the main heron page, but I don't see it when viewing the topology details.

image

thinker0 commented 2 years ago

@nicknezis Thanks

There seems to be a case where the object temporarily does not exist at the time of the update, so I added a check for whether the object exists.
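
As a rough illustration of that kind of guard (not the actual change in this PR), the sketch below returns a count of zero when the physical plan is still `None` instead of calling `len()` on it. The function `build_metadata` is a hypothetical stand-in for the tracker's `_build_metadata`, and `SimpleNamespace` stands in for the real physical plan object; falling back to zero is also an assumption.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def build_metadata(physical_plan: Optional[Any]) -> Dict[str, int]:
    # Hypothetical stand-in for the tracker's _build_metadata;
    # only the instance count is modeled here.
    if physical_plan is None:
        # The plan has not been published yet, so report zero
        # instead of raising AttributeError.
        return {"instances": 0}
    return {"instances": len(physical_plan.instances)}


# A missing plan no longer raises.
print(build_metadata(None))
# A stand-in plan with three instances.
print(build_metadata(SimpleNamespace(instances=["a", "b", "c"])))
```

Whether zero, `None`, or skipping the update entirely is the right behavior depends on how the UI treats the field.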

I ran the components as follows.

Tracker

bazel build heron/tools/tracker/src/python:heron-tracker && bazel-bin/heron/tools/tracker/src/python/heron-tracker --config-file ~/.heron/conf/heron_tracker.yaml --verbose

HeronUI

ln -sf ~/.heron/release.yaml /private/var/tmp/_bazel_thinker0/1db3221e319a1f6b3420bd0c6f4d8d08/execroot/org_apache_heron/bazel-out/darwin-fastbuild/bin/heron/tools/ui/src/release.yaml

bazel build heron/tools/ui/src/python:heron-ui --verbose_failures && ./bazel-bin/heron/tools/ui/src/python/heron-ui --tracker-url http://localhost:8888

nicknezis commented 2 years ago

When you click on one of those links to view the details of a specific job, do you see the instance count beneath the logical and physical plan diagrams?

thinker0 commented 2 years ago

yes.

nicknezis commented 2 years ago

> yes.

The value is blank in your image. I see the instance count at the top, but not in the middle of the image.

thinker0 commented 2 years ago

> The value is blank in your image. I see the instance count at the top, but not in the middle of the image.

That is because the tracker could not calculate all of the metrics. This is a problem for topologies with many instances.

thinker0 commented 2 years ago

It may still be pending because heron-tracker takes a long time to fetch the stats from the metrics manager of every instance and aggregate them.

nicknezis commented 2 years ago

Interesting. I wonder if we should update the page to use the variable that is used to present the instance count at the top of the page. I tested locally with the AckingTopology and the value was empty, but I did not wait long enough for the metrics gathering to complete. I'll take a look at the HTML to better understand the difference between the top instance count and the middle instance count.
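
One way to picture the idea is the sketch below, which is only an assumption about how the two values could be reconciled and not the code in heron-ui. The function name `displayed_instance_count`, the `metadata` mapping, and the `metrics_count` argument are all hypothetical. It shows the metrics-derived count when it is available and otherwise falls back to the count already known from the topology metadata.

```python
from typing import Mapping, Optional


def displayed_instance_count(
    metadata: Mapping[str, int], metrics_count: Optional[int]
) -> Optional[int]:
    # Hypothetical helper: prefer the count derived from metrics,
    # which may not be ready yet for large topologies.
    if metrics_count is not None:
        return metrics_count
    # Otherwise reuse the count from the topology metadata,
    # i.e. the value assumed to back the number at the top of the page.
    return metadata.get("instances")


# Metrics still pending: the metadata value is shown.
print(displayed_instance_count({"instances": 4}, None))
# Metrics available: the metrics value is shown.
print(displayed_instance_count({"instances": 4}, 5))
```

Reading both places from the metadata value alone would be simpler still and would avoid the two numbers ever disagreeing.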

nicknezis commented 2 years ago

This change fixed the missing instance count.