feat: configurable queue monitors for event buffers and thread pools

Closes #439

Summary

See the updated changelog and kytos.conf.template for more specific information.
The default config should work well out of the gate for our current kytos-ng NApps and for a network at AmLight's scale (I simulated 300 EVCs plus a few link flaps).
Note that the main goal of the queue monitors is to detect high queue usage over a delta t in seconds (delta_secs), sampled once per second. The aim is not extremely granular, telemetry-like visibility, but to detect on a per-second scale when the queue size of an event buffer or the max workers of a thread pool needs to be increased, or when a NApp might be misbehaving and sending far too many events.
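The detection idea described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the class name QueueMonitorSketch and its sample() method are hypothetical, and the exact windowing and counting in kytos.core.queue_monitor may differ.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueueMonitorSketch:
    """Hypothetical stand-in for a queue monitor (not the kytos class)."""

    min_hits: int
    min_size: int
    delta_secs: int
    hits: deque = field(default_factory=deque)

    def sample(self, now: float, size: int) -> bool:
        """Record one per-second sample; return True when a warning is due."""
        if size >= self.min_size:
            self.hits.append((now, size))
        # Forget hits that fell out of the delta_secs window.
        while self.hits and now - self.hits[0][0] >= self.delta_secs:
            self.hits.popleft()
        if len(self.hits) >= self.min_hits:
            self.hits.clear()  # start counting again after warning
            return True
        return False


monitor = QueueMonitorSketch(min_hits=5, min_size=512, delta_secs=5)
# Five consecutive one-second samples above min_size trigger one warning.
warned = [monitor.sample(now=t, size=1000) for t in range(5)]
```

With these values, a queue that only occasionally crosses min_size never accumulates enough hits inside the window, so short spikes stay silent.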

Local Tests

I tested the default config with 300 EVCs and some link flaps, and no warnings showed up, as expected.
I also explored the three configs described below while injecting hundreds of concurrent events targeting a slow-ish handler, to simulate a case where the queue of a thread pool would keep increasing significantly.

Config a (default)

2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(msg_in, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(msg_out, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(raw, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(app, min_hits=5, min_size=1024, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_sb, min_hits=5, min_size=256, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_app, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:27:48,058 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_db, min_hits=5, min_size=256, delta_secs=5)...

... after injecting too many events ... 

2024-02-08 10:29:34,182 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 5, min/avg/max size: 4608/10336.6/12958, first at: 2024-02-08 13:29:29.802628+00:00, last at: 2024-02-08 13:29:34.182469+00:00, delta secs: 5, min_hits: 5, min_size_threshold: 512
2024-02-08 10:29:39,188 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 5, min/avg/max size: 9392/10410.0/11428, first at: 2024-02-08 13:29:35.184075+00:00, last at: 2024-02-08 13:29:39.188669+00:00, delta secs: 5, min_hits: 5, min_size_threshold: 512
2024-02-08 10:29:44,195 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 5, min/avg/max size: 6838/7859.6/8880, first at: 2024-02-08 13:29:40.189934+00:00, last at: 2024-02-08 13:29:44.195025+00:00, delta secs: 5, min_hits: 5, min_size_threshold: 512

Config b

thread_pool_queue_monitors =
  [
    {
      "min_hits": 5,
      "delta_secs": 10,
      "min_queue_full_percent": 100,
      "log_at_most_n": 3,
      "queues": ["sb", "app", "db"]
    }
  ]

2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(msg_in, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(msg_out, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(raw, min_hits=5, min_size=512, delta_secs=5)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(app, min_hits=5, min_size=1024, delta_secs=5)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_sb, min_hits=5, min_size=256, delta_secs=10)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_app, min_hits=5, min_size=512, delta_secs=10)...
2024-02-08 10:26:03,160 - INFO [kytos.core.controller] (MainThread) Starting QueueMonitor(threadpool_db, min_hits=5, min_size=256, delta_secs=10)...
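The start-up lines above show "min_queue_full_percent": 100 turning into min_size=256 for sb and db and min_size=512 for app, and the Config c logs further down report min_size_threshold: 1 for a percent of 0. One conversion that is consistent with those numbers is sketched here. The function name is hypothetical, and both the formula and the pool sizes (256 and 512 max workers) are assumptions inferred from the logs, not taken from the implementation.

```python
def min_size_threshold(max_workers: int, min_queue_full_percent: float) -> int:
    """Map a 'queue full' percentage to an absolute size threshold.

    Assumption: the threshold is a share of the pool's max workers,
    floored at 1 so that a percent of 0 still requires one queued item.
    """
    return max(1, int(max_workers * min_queue_full_percent / 100))


# Assumed pool sizes, chosen to match the min_size values in the logs.
thresholds = {
    name: min_size_threshold(workers, 100)
    for name, workers in {"sb": 256, "app": 512, "db": 256}.items()
}
```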

... after injecting too many events ... 

2024-02-08 10:26:40,911 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 5, min/avg/max size: 2560/10130.0/13470, first at: 2024-02-08 13:26:36.561493+00:00, last at: 2024-02-08 13:26:40.911422+00:00, delta secs: 10, min_hits: 5, min_size_threshold: 512
2024-02-08 10:26:40,911 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[0]/[5]: size: 2560, at: 2024-02-08 13:26:36.561493+00:00
2024-02-08 10:26:40,911 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[2]/[5]: size: 13470, at: 2024-02-08 13:26:38.909429+00:00
2024-02-08 10:26:40,911 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[4]/[5]: size: 12446, at: 2024-02-08 13:26:40.911422+00:00

2024-02-08 10:26:50,927 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 10, min/avg/max size: 7350/9643.2/11940, first at: 2024-02-08 13:26:41.912263+00:00, last at: 2024-02-08 13:26:50.926982+00:00, delta secs: 10, min_hits: 5, min_size_threshold: 512
2024-02-08 10:26:50,927 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[0]/[10]: size: 11940, at: 2024-02-08 13:26:41.912263+00:00
2024-02-08 10:26:50,927 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[4]/[10]: size: 9898, at: 2024-02-08 13:26:45.919824+00:00
2024-02-08 10:26:50,927 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, record[8]/[10]: size: 7856, at: 2024-02-08 13:26:49.925610+00:00
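With "log_at_most_n": 3, the warnings above print records 0, 2, 4 out of 5 and records 0, 4, 8 out of 10. An evenly spaced selection with a step of ceil(count / log_at_most_n) reproduces those indices. The sketch below is inferred from this output only; the function name is hypothetical and the real selection logic may be written differently.

```python
import math


def sampled_indices(count: int, log_at_most_n: int) -> list:
    """Pick at most log_at_most_n evenly spaced record indices."""
    if count <= 0 or log_at_most_n <= 0:
        # Assumption: 0 disables per-record lines, as in Config c.
        return []
    step = math.ceil(count / log_at_most_n)
    return list(range(0, count, step))


first_warning = sampled_indices(5, 3)
second_warning = sampled_indices(10, 3)
```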

Config c

This (temporary) config is useful when you just want to see, every second, whether at least 1 event is being queued. It gives you an idea of how busy the queues are, for instance in a local stress test, which can help you identify baseline usage and/or spiky queue loads in a particular case:

thread_pool_queue_monitors =
  [
    {
      "min_hits": 1,
      "delta_secs": 1,
      "min_queue_full_percent": 0,
      "log_at_most_n": 0,
      "queues": ["sb", "app", "db"]
    }
  ]


2024-02-08 10:34:10,705 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 1, min/avg/max size: 2488/2488.0/2488, first at: 2024-02-08 13:34:10.705708+00:00, last at: 2024-02-08 13:34:10.705708+00:00, delta secs: 1, min_hits: 1, min_size_threshold: 1
2024-02-08 10:34:11,707 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 1, min/avg/max size: 1976/1976.0/1976, first at: 2024-02-08 13:34:11.707078+00:00, last at: 2024-02-08 13:34:11.707078+00:00, delta secs: 1, min_hits: 1, min_size_threshold: 1
2024-02-08 10:34:12,710 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 1, min/avg/max size: 1464/1464.0/1464, first at: 2024-02-08 13:34:12.709970+00:00, last at: 2024-02-08 13:34:12.709970+00:00, delta secs: 1, min_hits: 1, min_size_threshold: 1
2024-02-08 10:34:13,711 - WARNING [kytos.core.queue_monitor] (MainThread) threadpool_app, counted: 1, min/avg/max size: 958/958.0/958, first at: 2024-02-08 13:34:13.711444+00:00, last at: 2024-02-08 13:34:13.711444+00:00, delta secs: 1, min_hits: 1, min_size_threshold: 1
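To turn per-second output like this into a baseline, the summary lines can be parsed with a small script. The following is only a sketch: the regular expression is written against the log format shown in this description, and parse_summary is a hypothetical helper, not part of kytos.

```python
import re

# Matches the summary part of a queue_monitor warning line.
SUMMARY_RE = re.compile(
    r"\) (?P<queue>\w+), counted: (?P<counted>\d+), "
    r"min/avg/max size: (?P<min>\d+)/(?P<avg>[\d.]+)/(?P<max>\d+)"
)


def parse_summary(line: str):
    """Return the queue name and size stats, or None for other lines."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "queue": match["queue"],
        "counted": int(match["counted"]),
        "min": int(match["min"]),
        "avg": float(match["avg"]),
        "max": int(match["max"]),
    }


line = (
    "2024-02-08 10:34:10,705 - WARNING [kytos.core.queue_monitor] (MainThread) "
    "threadpool_app, counted: 1, min/avg/max size: 2488/2488.0/2488, "
    "first at: 2024-02-08 13:34:10.705708+00:00"
)
stats = parse_summary(line)
```

Per-record lines (the ones containing record[i]/[n]) do not match the pattern and come back as None, so they can be filtered out when aggregating.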

End-to-End Tests

============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-7.2.0, pluggy-1.4.0
rootdir: /builds/amlight/kytos-end-to-end-tester/kytos-end-to-end-tests
plugins: rerunfailures-10.2, timeout-2.1.0, anyio-3.6.2
collected 257 items
tests/test_e2e_01_kytos_startup.py ..                                    [  0%]
tests/test_e2e_05_topology.py ....................                       [  8%]
tests/test_e2e_10_mef_eline.py ..........ss.....x.....x................  [ 24%]
tests/test_e2e_11_mef_eline.py ......                                    [ 26%]
tests/test_e2e_12_mef_eline.py .....Xx.                                  [ 29%]
tests/test_e2e_13_mef_eline.py ....Xs.s.....Xs.s.XXxX.xxxx..X........... [ 45%]
.                                                                        [ 45%]
tests/test_e2e_14_mef_eline.py x                                         [ 46%]
tests/test_e2e_15_mef_eline.py .....                                     [ 48%]
tests/test_e2e_16_mef_eline.py .                                         [ 48%]
tests/test_e2e_20_flow_manager.py .....................                  [ 56%]
tests/test_e2e_21_flow_manager.py ...                                    [ 57%]
tests/test_e2e_22_flow_manager.py ...............                        [ 63%]
tests/test_e2e_23_flow_manager.py ..............                         [ 69%]
tests/test_e2e_30_of_lldp.py ....                                        [ 70%]
tests/test_e2e_31_of_lldp.py ...                                         [ 71%]
tests/test_e2e_32_of_lldp.py ...                                         [ 73%]
tests/test_e2e_40_sdntrace.py ..............                             [ 78%]
tests/test_e2e_41_kytos_auth.py ........                                 [ 81%]
tests/test_e2e_42_sdntrace.py ..                                         [ 82%]
tests/test_e2e_50_maintenance.py ........................                [ 91%]
tests/test_e2e_60_of_multi_table.py .....                                [ 93%]
tests/test_e2e_70_kytos_stats.py ........                                [ 96%]
tests/test_e2e_80_pathfinder.py ss......                                 [100%]
=============================== warnings summary ===============================
------------------------------- start/stop times -------------------------------
= 233 passed, 8 skipped, 9 xfailed, 7 xpassed, 1143 warnings in 12325.92s (3:25:25) =

kytos-ng / kytos

feat: configurable queue monitors for event buffers and thread pools #450