orcasound / orcanode-monitor

Web service for monitoring liveness of orcanode audio streaming
MIT License

Any way to get pulse of inference containers on Azure? #19

Open dthaler opened 1 month ago

dthaler commented 1 month ago

Any way to get pulse of inference containers on Azure (ask Patrick? Or Michelle on Github?)

dthaler commented 3 weeks ago

@pastorep @micya Scott suggested pinging you two. Any way to tell what nodes OrcaHello is actively monitoring? The status endpoint is not node specific, and the detections endpoint doesn't give any information on what is actively being monitored.

micya commented 3 weeks ago

> Any way to get pulse of inference containers on Azure (ask Patrick? Or Michelle on Github?)

The inference system runs on the AKS cluster inference-system-AKS in the resource group LiveSRKWNotificationSystem. The deployment instructions at https://github.com/orcasound/aifororcas-livesystem/tree/main/InferenceSystem#deploying-an-updated-docker-build-to-azure-kubernetes-service are still accurate. Each location has its own namespace.

> @pastorep @micya Scott suggested pinging you two. Any way to tell what nodes OrcaHello is actively monitoring? The status endpoint is not node specific, and the detections endpoint doesn't give any information on what is actively being monitored.

Not programmatically. Currently, we have a separate container image per location (refer to yaml files here). The code in all images is the same, but each image contains a different config file (config files here). I'm not sure which config goes to which image, but you could probably poke through the images in Azure Container Registry to figure it out.

Suggestions:

  1. Unify the Docker images into one and inject per-location config at runtime.
  2. Consider having each location deployment report its own location. This can be an endpoint that returns a string (either in the inference service or as a sidecar).
  3. Consider having each location deployment send a heartbeat. A monitoring solution could just subscribe to the heartbeats to figure out which location is active. The heartbeat might also just send the location string.
  4. You may also consider a more comprehensive Kubernetes cluster monitoring solution. But since our usage is fairly simple and we treat our cluster like cattle, not pets, I suggest skipping this in favor of the heartbeat system proposed above.
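
To make suggestions 2 and 3 more concrete, here is a rough Python sketch using only the standard library. It is not code from the inference system, and every name in it is hypothetical: the `ORCA_NODE_LOCATION` environment variable, the `/location` path, the port, and the `HeartbeatTracker` class are all assumptions made for illustration. The first half shows a deployment reporting its own location as a plain string; the second half shows how a monitor might decide which locations are active from the heartbeats it has received.

```python
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Optional

# Hypothetical variable name; assumes per-location config is injected at runtime.
LOCATION_ENV_VAR = "ORCA_NODE_LOCATION"


def get_location() -> str:
    """Return the location this deployment serves, or "unknown" if unset."""
    return os.environ.get(LOCATION_ENV_VAR, "unknown")


class LocationHandler(BaseHTTPRequestHandler):
    """Suggestion 2: an endpoint that returns the location as a string."""

    def do_GET(self):
        if self.path == "/location":  # hypothetical path
            body = get_location().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def serve(port: int = 8080) -> None:
    """Run the location endpoint until interrupted (blocks the caller)."""
    HTTPServer(("0.0.0.0", port), LocationHandler).serve_forever()


class HeartbeatTracker:
    """Suggestion 3, monitor side: remember when each location last reported."""

    def __init__(self, timeout_seconds: float = 120.0):
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record(self, location: str, now: Optional[float] = None) -> None:
        """Store the arrival time of a heartbeat carrying a location string."""
        self._last_seen[location] = time.time() if now is None else now

    def active_locations(self, now: Optional[float] = None) -> List[str]:
        """Return locations whose last heartbeat is within the timeout."""
        current = time.time() if now is None else now
        return sorted(
            loc
            for loc, seen in self._last_seen.items()
            if current - seen <= self.timeout_seconds
        )
```

In this sketch a sidecar or the inference service itself would call `serve()`, and a monitor would call `tracker.record(location)` each time a heartbeat arrives, then `tracker.active_locations()` to list what is currently being monitored. How heartbeats are transported (HTTP POST, a queue, or something else) is left open here, since the thread does not specify it.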