Devolutions / devolutions-gateway

A blazing fast relay server adaptable to different protocols and desired levels of traffic inspection.
Apache License 2.0

fix(jetsocat,dgw): add backpressure in JMUX proxy #1008

Closed CBenoit closed 2 months ago

CBenoit commented 2 months ago

The memory consumption of the JMUX proxy was unbounded because we used an unbounded mpsc channel for message passing.
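To illustrate the idea, here is a minimal sketch of how a bounded channel applies backpressure. It is not the code from this patch: the standard library's `sync_channel` stands in for whatever channel the proxy actually uses, and `Message`, `accepted_without_consumer`, `relay_with_backpressure`, and the capacities are hypothetical names and values chosen for the example.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a JMUX message; the real type is not shown in this PR.
type Message = Vec<u8>;

/// Tries to enqueue `count` messages of `size` bytes into a channel bounded at
/// `capacity` while nothing is receiving. Returns how many messages were
/// accepted before the channel reported that it was full.
fn accepted_without_consumer(capacity: usize, count: usize, size: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected, but it is never read.
    let (tx, _rx) = sync_channel::<Message>(capacity);
    let mut accepted = 0;
    for _ in 0..count {
        match tx.try_send(vec![0u8; size]) {
            Ok(()) => accepted += 1,
            // A bounded channel refuses to buffer more than `capacity` messages,
            // so queued memory is capped at roughly `capacity * size` bytes.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Sends `count` messages through a bounded channel to a deliberately slow
/// consumer. The producer blocks in `send` whenever the channel is full, so it
/// is throttled to the consumer's pace. Returns the number of messages received.
fn relay_with_backpressure(capacity: usize, count: usize) -> usize {
    let (tx, rx) = sync_channel::<Message>(capacity);
    let producer = thread::spawn(move || {
        for _ in 0..count {
            // Blocks while the channel is full: this is the backpressure.
            if tx.send(vec![0u8; 1024]).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the consumer loop below.
    });

    let mut received = 0;
    for _msg in rx {
        thread::sleep(Duration::from_millis(1)); // simulate a slow peer
        received += 1;
    }
    producer.join().expect("producer thread panicked");
    received
}
```

With these definitions, `accepted_without_consumer(4, 100, 1024)` accepts only 4 messages, whereas an unbounded channel would have buffered all 100. `relay_with_backpressure(4, 50)` still delivers all 50 messages, only at the consumer's pace.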

Here is a jetsocat-bench.nu run against master:

2024-09-05T22:05:24.786|INF|=============
2024-09-05T22:05:24.786|INF|=== master ===
2024-09-05T22:05:24.786|INF|=============
2024-09-05T22:05:28.961|INF|==> Enabled delay of 50msec
2024-09-05T22:05:28.962|INF|==> 1 connection
╭────────────────────────────┬──────────────────╮
│ name                       │ master.delay.001 │
│ parallel_connections       │ 1                │
│ average_throughput_per_sec │ 27.1 MiB         │
│ server_node_peak_used_mem  │ 14.4 MiB         │
│ client_node_peak_used_mem  │ 23.6 MiB         │
╰────────────────────────────┴──────────────────╯
2024-09-05T22:10:35.622|INF|==> 2 connections
╭────────────────────────────┬──────────────────╮
│ name                       │ master.delay.002 │
│ parallel_connections       │ 2                │
│ average_throughput_per_sec │ 27.1 MiB         │
│ server_node_peak_used_mem  │ 13.9 MiB         │
│ client_node_peak_used_mem  │ 234.9 MiB        │
╰────────────────────────────┴──────────────────╯
2024-09-05T22:15:47.165|INF|==> 10 connections
╭────────────────────────────┬──────────────────╮
│ name                       │ master.delay.010 │
│ parallel_connections       │ 10               │
│ average_throughput_per_sec │ 27.3 MiB         │
│ server_node_peak_used_mem  │ 12.8 MiB         │
│ client_node_peak_used_mem  │ 928.2 MiB        │
╰────────────────────────────┴──────────────────╯
2024-09-05T22:21:19.878|INF|==> Disabled delay
2024-09-05T22:21:19.879|INF|==> 1 connection
╭────────────────────────────┬────────────────────╮
│ name                       │ master.nodelay.001 │
│ parallel_connections       │ 1                  │
│ average_throughput_per_sec │ 2.2 GiB            │
│ server_node_peak_used_mem  │ 1.1 GiB            │
│ client_node_peak_used_mem  │ 220.5 MiB          │
╰────────────────────────────┴────────────────────╯
2024-09-05T22:26:25.997|INF|==> 2 connections
╭────────────────────────────┬────────────────────╮
│ name                       │ master.nodelay.002 │
│ parallel_connections       │ 2                  │
│ average_throughput_per_sec │ 2.1 GiB            │
│ server_node_peak_used_mem  │ 43.3 MiB           │
│ client_node_peak_used_mem  │ 394.3 MiB          │
╰────────────────────────────┴────────────────────╯
2024-09-05T22:31:32.178|INF|==> 10 connections
╭────────────────────────────┬────────────────────╮
│ name                       │ master.nodelay.010 │
│ parallel_connections       │ 10                 │
│ average_throughput_per_sec │ 1.8 GiB            │
│ server_node_peak_used_mem  │ 26.9 MiB           │
│ client_node_peak_used_mem  │ 859.5 MiB          │
╰────────────────────────────┴────────────────────╯

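Notice how the peak memory usage can reach around 1 GiB: 928.2 MiB on the client node with 10 delayed connections, and 1.1 GiB on the server node with a single no-delay connection. In some degenerate cases, it could even cause an out-of-memory (OOM) condition.

Here is the same benchmark run against this patch: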

2024-09-05T23:38:42.712|INF|=============
2024-09-05T23:38:42.713|INF|=== perf/jmux-proxy-9 ===
2024-09-05T23:38:42.713|INF|=============
2024-09-05T23:38:48.059|INF|==> Enabled delay of 50msec
2024-09-05T23:38:48.060|INF|==> 1 connection
╭────────────────────────────┬─────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.delay.001 │
│ parallel_connections       │ 1                           │
│ average_throughput_per_sec │ 27.1 MiB                    │
│ server_node_peak_used_mem  │ 13.7 MiB                    │
│ client_node_peak_used_mem  │ 14.2 MiB                    │
╰────────────────────────────┴─────────────────────────────╯
2024-09-05T23:43:54.757|INF|==> 2 connections
╭────────────────────────────┬─────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.delay.002 │
│ parallel_connections       │ 2                           │
│ average_throughput_per_sec │ 27.3 MiB                    │
│ server_node_peak_used_mem  │ 15.3 MiB                    │
│ client_node_peak_used_mem  │ 16.2 MiB                    │
╰────────────────────────────┴─────────────────────────────╯
2024-09-05T23:49:01.750|INF|==> 10 connections
╭────────────────────────────┬─────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.delay.010 │
│ parallel_connections       │ 10                          │
│ average_throughput_per_sec │ 27.6 MiB                    │
│ server_node_peak_used_mem  │ 14.6 MiB                    │
│ client_node_peak_used_mem  │ 14.2 MiB                    │
╰────────────────────────────┴─────────────────────────────╯
2024-09-05T23:54:11.464|INF|==> Disabled delay
2024-09-05T23:54:11.464|INF|==> 1 connection
╭────────────────────────────┬───────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.nodelay.001 │
│ parallel_connections       │ 1                             │
│ average_throughput_per_sec │ 1.9 GiB                       │
│ server_node_peak_used_mem  │ 20.6 MiB                      │
│ client_node_peak_used_mem  │ 18.1 MiB                      │
╰────────────────────────────┴───────────────────────────────╯
2024-09-05T23:59:17.620|INF|==> 2 connections
╭────────────────────────────┬───────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.nodelay.002 │
│ parallel_connections       │ 2                             │
│ average_throughput_per_sec │ 2.0 GiB                       │
│ server_node_peak_used_mem  │ 18.1 MiB                      │
│ client_node_peak_used_mem  │ 18.2 MiB                      │
╰────────────────────────────┴───────────────────────────────╯
2024-09-06T00:04:23.752|INF|==> 10 connections
╭────────────────────────────┬───────────────────────────────╮
│ name                       │ perf-jmux-proxy-9.nodelay.010 │
│ parallel_connections       │ 10                            │
│ average_throughput_per_sec │ 1.5 GiB                       │
│ server_node_peak_used_mem  │ 16.3 MiB                      │
│ client_node_peak_used_mem  │ 20.8 MiB                      │
╰────────────────────────────┴───────────────────────────────╯

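The peak memory usage is now under control: it stays below 21 MiB on both nodes in every run. The throughput is lower in the no-delay cases (1.9, 2.0, and 1.5 GiB per second, down from 2.2, 2.1, and 1.8 GiB per second on master), a drop of roughly 5% to 17%. No-delay networks are not our main use case, however.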
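For reference, the relative throughput change in the no-delay runs can be worked out from the tables above. This is a small sketch, not part of `jetsocat-bench.nu`: the helper names `relative_change_percent` and `nodelay_report` are hypothetical, and the figures are copied from the rounded values in the benchmark output.

```rust
/// Relative change from `before` to `after`, in percent (negative means a drop).
fn relative_change_percent(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

/// (label, master GiB/s, patched GiB/s), copied from the no-delay tables above.
const NODELAY_RUNS: [(&str, f64, f64); 3] = [
    ("1 connection", 2.2, 1.9),
    ("2 connections", 2.1, 2.0),
    ("10 connections", 1.8, 1.5),
];

/// Formats one line per no-delay run with the throughput change in percent.
fn nodelay_report() -> Vec<String> {
    NODELAY_RUNS
        .iter()
        .map(|(label, before, after)| {
            format!("{}: {:.1}%", label, relative_change_percent(*before, *after))
        })
        .collect()
}
```

Calling `nodelay_report()` yields `1 connection: -13.6%`, `2 connections: -4.8%`, and `10 connections: -16.7%`. Since the inputs are rounded to one decimal place, these percentages are approximate.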