insight-platform / Savant

Python Computer Vision & Video Analytics Framework With Batteries Included
https://savant-ai.io
Apache License 2.0

Pipeline Idle monitor #782

Closed: bwsw closed 2 months ago

bwsw commented 3 months ago

Implement a stuck-state monitor that watches multiple buffer adapters through their Prometheus endpoints and restarts the designated images when a stuck state is detected.

watch:
  - buffer: buffer1:1111
    queue:
      action: stop
      length: 200
      ignore_after_restart: 60s
      restart: # restart when any entry matches
        - labels: [x, y, z] # entry matches when all labels are present
        - labels: c
    egress:
      action: restart
      restart_cooldown: 60s
      idle: 100s
      restart:
        - labels: x
        - labels: y
    ingress:
      action: restart
      restart_cooldown: 60s
      idle: 50s
      restart:
        - labels: z
  - buffer: buffer2:2222
    ...
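
The matching and idle rules in the config above could be evaluated roughly as in the following Python sketch. It is not the PipelineWatchdog implementation: the function names (`parse_duration`, `matches`, `is_idle`) are hypothetical, and it assumes that `restart_cooldown` suppresses a new trigger until that much time has passed since the last restart, that durations always use the `s` suffix, and that a `restart` list matches when any entry matches while an entry matches only when all of its labels are present.

```python
from typing import Iterable, Optional, Set


def parse_duration(text: str) -> float:
    """Convert a value such as '60s' into seconds (only the 's' suffix is handled)."""
    if not text.endswith("s"):
        raise ValueError(f"unsupported duration: {text!r}")
    return float(text[:-1])


def matches(container_labels: Set[str], restart_rules: Iterable[dict]) -> bool:
    """Return True if ANY rule matches; a rule matches when ALL its labels are present."""
    for rule in restart_rules:
        wanted = rule["labels"]
        if isinstance(wanted, str):  # 'labels: c' is shorthand for a one-item list
            wanted = [wanted]
        if set(wanted) <= container_labels:
            return True
    return False


def is_idle(
    now: float,
    last_activity: float,
    last_restart: Optional[float],
    idle: str,
    restart_cooldown: str,
) -> bool:
    """Return True when the flow has been silent longer than `idle`.

    Assumption: no trigger fires while the cooldown after the last restart
    is still running.
    """
    if last_restart is not None and now - last_restart < parse_duration(restart_cooldown):
        return False
    return now - last_activity > parse_duration(idle)


# The 'queue.restart' rules from the config above.
queue_restart = [{"labels": ["x", "y", "z"]}, {"labels": "c"}]

print(matches({"x", "y", "z", "extra"}, queue_restart))  # True: first entry fully present
print(matches({"x", "y"}, queue_restart))                # False: 'z' is missing, no 'c'
print(is_idle(1000.0, 890.0, None, "100s", "60s"))       # True: 110 s of silence > 100 s
```

Keeping the label matching separate from the timing check means the same `matches` helper can serve the `queue`, `egress`, and `ingress` sections.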

bwsw commented 2 months ago

Implement in https://github.com/insight-platform/PipelineWatchdog. The license is Apache 2.0.