Objective
Investigate and resolve intermittent high CPU usage in AppGate (including Gateway) and RelayMiner causing unresponsiveness and E2E test disruptions.
Origin Document
This issue has been observed during E2E testing, where AppGate and RelayMiner occasionally become unresponsive and consume 100% of their allocated CPU resources. This behavior frequently disrupts our E2E tests.
Related to #621
The pprof snapshots are included in the comment below.
Goals
Identify the root cause of the high CPU usage in AppGate and RelayMiner
Implement a solution to prevent or mitigate the unresponsiveness issue
Improve the stability and reliability of our E2E testing environment
Deliverables
[ ] Evaluate the pprof snapshots
[ ] If necessary, add more debug logging output and metrics to help catch the issue
[ ] Merge in a fix and monitor for this behavior in the future
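One possible shape for the extra debug output mentioned above is a small watchdog that periodically logs the goroutine count and dumps all stacks once a threshold is crossed. This is only a sketch: the names `goroutineSnapshot` and `watchGoroutines`, the interval, and the threshold are all hypothetical. Goroutine count is used as a rough proxy on the assumption that a leak or runaway loop may be involved, which has not been confirmed as the cause.

```go
package main

import (
	"bytes"
	"context"
	"log"
	"runtime"
	"runtime/pprof"
	"time"
)

// goroutineSnapshot (hypothetical helper) returns the current goroutine
// count and, when the count is at or above threshold, a text dump of all
// goroutine stacks. Below the threshold the dump is empty.
func goroutineSnapshot(threshold int) (int, string) {
	n := runtime.NumGoroutine()
	if n < threshold {
		return n, ""
	}
	var buf bytes.Buffer
	// debug=1 produces a human-readable listing grouped by identical stacks.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return n, ""
	}
	return n, buf.String()
}

// watchGoroutines logs the goroutine count every interval until ctx is done,
// and logs a stack dump whenever the threshold is reached.
func watchGoroutines(ctx context.Context, interval time.Duration, threshold int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			n, dump := goroutineSnapshot(threshold)
			log.Printf("goroutines=%d", n)
			if dump != "" {
				log.Printf("goroutine threshold %d reached, stacks:\n%s", threshold, dump)
			}
		}
	}
}

func main() {
	// Run the watchdog briefly for demonstration; a service would tie ctx
	// to its own lifecycle instead of a fixed timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()
	watchGoroutines(ctx, 100*time.Millisecond, 1000)
}
```

The same count could also be exported as a gauge through whatever metrics system the services already use, so that spikes can be lined up against E2E test failures.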
General deliverables
[ ] Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
[ ] Testing: Add new tests (unit and/or E2E) to the test suite.
[ ] Makefile: Add new targets to the Makefile to make the new functionality easier to use.
[ ] Documentation: Update architectural or development READMEs; use mermaid diagrams where appropriate.
Creator: @okdas Co-Owners: @red-0ne