Open SsuiyueL opened 2 months ago
Maybe. Can you try increasing upstream_keepalive_pool_size to 1000 and setting tcp_keepalive in the peer options?
Thank you for your response! I seem to have discovered some issues:
Initially, my server was configured for short connections (with keepalive_timeout set to 0), and under those conditions, Pingora did not perform well. Later, I tested the server with long connections, and Pingora demonstrated its advantages. I also tested the configuration changes as you suggested. The detailed results are as follows:
Nginx test results are as follows:
Thread Stats Avg Stdev Max +/- Stdev
Latency 260.76ms 434.25ms 7.20s 84.93%
Req/Sec 3.07k 1.20k 7.16k 73.84%
909551 requests in 30.02s, 4.30GB read
Requests/sec: 30296.15
Transfer/sec: 146.75MB
cpu: 49%
The previous Pingora test results are as follows:
Thread Stats Avg Stdev Max +/- Stdev
Latency 98.75ms 190.03ms 3.43s 90.47%
Req/Sec 4.95k 1.34k 11.83k 74.45%
1475976 requests in 30.03s, 6.97GB read
Requests/sec: 49156.43
Transfer/sec: 237.83MB
cpu: 80%. In each test, memory still increases and is not released afterwards.
The improved Pingora test results are as follows:
Thread Stats Avg Stdev Max +/- Stdev
Latency 72.02ms 126.64ms 3.20s 88.49%
Req/Sec 5.15k 1.39k 11.51k 73.82%
1534099 requests in 30.10s, 7.25GB read
Requests/sec: 50968.27
Transfer/sec: 246.61MB
In summary, thanks for the response; it has resolved some of my issues. However, the memory increase and other problems still persist. I will continue to monitor this.
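To make the three wrk runs above easier to compare, here is a minimal sketch that computes relative changes and a rough "requests per second per CPU percentage point" figure. The function names are made up for illustration, the inputs are copied from the results above, and the CPU normalization assumes the quoted CPU percentages are comparable between the two runs.

```rust
/// Relative change from `baseline` to `value`, in percent.
fn percent_change(baseline: f64, value: f64) -> f64 {
    (value - baseline) / baseline * 100.0
}

/// Requests per second delivered per percentage point of total CPU usage.
fn rps_per_cpu_percent(requests_per_sec: f64, cpu_percent: f64) -> f64 {
    requests_per_sec / cpu_percent
}

/// Builds a short plain-text summary from the figures quoted above.
fn summarize() -> String {
    // Figures copied from the wrk output and CPU readings in this comment.
    let (nginx_rps, nginx_cpu) = (30296.15, 49.0);
    let (pingora_rps, pingora_cpu) = (49156.43, 80.0);
    let tuned_rps = 50968.27;
    let (pingora_latency_ms, tuned_latency_ms) = (98.75, 72.02);

    format!(
        "Pingora vs Nginx throughput: {:+.1}%\n\
         Tuned vs previous Pingora throughput: {:+.1}%\n\
         Tuned vs previous Pingora avg latency: {:+.1}%\n\
         Req/s per CPU point: Nginx {:.0}, Pingora {:.0}",
        percent_change(nginx_rps, pingora_rps),
        percent_change(pingora_rps, tuned_rps),
        percent_change(pingora_latency_ms, tuned_latency_ms),
        rps_per_cpu_percent(nginx_rps, nginx_cpu),
        rps_per_cpu_percent(pingora_rps, pingora_cpu),
    )
}
```

Printing `summarize()` from `main` gives roughly +62% throughput for Pingora over Nginx, about +3.7% throughput and -27% average latency for the tuned configuration, and about 618 versus 614 requests per second per CPU point. On these numbers the higher CPU reading seems to track the higher throughput rather than lower efficiency, though single 30-second runs are thin evidence.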
Hey! I've been trying to debug ever-increasing memory utilization in our Pingora proxy service (an HTTP proxy with TLS and h2). The behavior matches what is described in this issue and in https://github.com/cloudflare/pingora/issues/447, which suggests other Pingora users are seeing similar problems.
I can easily reproduce the issue with k6 load tests. At the start of the test, memory utilization increases quickly, and hours after the test it remains high. It keeps growing until the service goes OOM or until we restart it. In the image below you can see the load test running for 5 minutes at ~20:00. This is on an AWS ECS Fargate service with 0.5 vCPU and 1 GB of memory.
First I checked whether we had introduced any memory leaks in our own code, but if we have, I haven't been able to find them. I've tried valgrind memcheck with leak detection, as well as valgrind massif for heap profiling.
Then I tried to figure out if there was some connection pool in Pingora that was ever-growing. The service is behind an AWS network load balancer, and we can see in its metrics that the downstream connections are not held open, so I don't believe that is the cause. I tried to disable the upstream connection pool as instructed here: https://github.com/cloudflare/pingora/blob/main/docs/user_guide/pooling.md, but the default size for that pool is 128, so it doesn't make sense that it would be ever-growing and enough to drive the service OOM. And after disabling it and re-running the test, it did not resolve the issue of ever-growing memory.
To summarize, I realize this is most likely an error on our end, since I know you run Pingora in production yourselves, and I assume you don't have this problem. However, perhaps you have seen this behavior before? Do you have any recommendations for what config I might tweak to resolve it? Any advice is highly appreciated, but I fully understand if you don't have time to help me with this. I'll tag you for visibility @drcaramelsyrup, apologies in advance!
If you have time to take a look, here is our setup code:
Try using tikv-jemallocator. It helped me reduce memory usage growth in cases involving a large number of new upstream connections. I think this improvement is related to reduced memory fragmentation.
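One rough way to tell a real leak from memory that the allocator is simply holding on to is to count live heap bytes inside the process and compare that number with the RSS reported by the OS. The sketch below uses only the standard library: `CountingAlloc` and `live_bytes` are hypothetical names, not part of Pingora or tikv-jemallocator. It uses the same `#[global_allocator]` hook through which tikv-jemallocator is installed, so only one of the two can be the global allocator in a given binary.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and tracks how many bytes are currently live.
struct CountingAlloc;

/// Bytes handed out by `alloc` and not yet returned through `dealloc`.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // The default `realloc` and `alloc_zeroed` go through the two methods
    // above, so the counter stays consistent for them as well.
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Current number of live heap bytes as seen by the allocator wrapper.
fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}
```

If `live_bytes()` returns to its pre-test level after a load test while RSS stays high, the growth is more likely allocator retention or fragmentation than a leak in application code, which would be consistent with a different allocator helping. If `live_bytes()` itself keeps climbing, something is still holding the allocations.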
Hello, I encountered some issues while conducting performance testing. I reviewed previous issues, but they did not resolve my problem. Could you please help me with a detailed explanation? I would greatly appreciate it.
I have implemented a simple HTTP proxy using Nginx (OpenResty and Nginx-Rust) and Pingora. Below is the code I have implemented based on the example [modify_response]:
config:
My testing was conducted on an Ubuntu system with 8 cores and 16 GB of memory. Nginx started 8 worker processes.
1. Using wrk for testing:
wrk -t10 -c1000 -d30s http://172.24.1.2:6191
The result of Nginx:
The total CPU usage is around 50%, and the memory usage of each worker is negligible.
The result of Pingora:
The total CPU usage is around 70%, and the memory usage increases by 0.3% after each test (0->0.9->1.2).
Q1: In terms of throughput, Nginx performs slightly better than Pingora, while Pingora shows slightly lower latency than Nginx. (Isn't that a bit strange?) Overall, the differences between the two are not significant. Does this align with your expectations?
Q2: In terms of CPU usage, the overhead of Pingora is significantly greater than that of Nginx. Is this in line with your expectations? Regarding memory, I’ve noticed that memory usage increases after each test and does not recover. Could this indicate a memory leak?
2. Using ab for testing:
ab -n 10000 -c 100 http://172.24.1.2:6191/
When I perform testing with ab, Pingora times out:
The packet capture analysis is as follows:
It can be seen that a GET request was sent at the beginning, but Pingora did not return a response.
Nginx can be tested normally using the same command, and the packet capture shows that it responded properly.
ab is using HTTP/1.0, but after verification, this is not the cause of the problem.
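To reproduce the hang without ab, a small client can send a similar HTTP/1.0 request and enforce a read timeout. This is only a sketch: `build_http10_get` and `fetch_http10` are hypothetical helpers, and the headers approximate an ab request without keep-alive rather than copying it byte for byte.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Builds a minimal HTTP/1.0 GET request without a keep-alive header.
fn build_http10_get(host: &str, path: &str) -> String {
    format!("GET {path} HTTP/1.0\r\nHost: {host}\r\nAccept: */*\r\n\r\n")
}

/// Sends the request and collects everything the server writes until it
/// closes the connection. A read that stalls for longer than `timeout`
/// makes the call return an error instead of blocking forever.
fn fetch_http10(
    addr: &str,
    host: &str,
    path: &str,
    timeout: Duration,
) -> std::io::Result<Vec<u8>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.write_all(build_http10_get(host, path).as_bytes())?;

    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    Ok(response)
}
```

Calling `fetch_http10("172.24.1.2:6191", "172.24.1.2:6191", "/", Duration::from_secs(5))` against both proxies should show whether a single request already stalls on Pingora or whether the timeout only appears under ab's concurrency.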
Additionally, I also used Siege for testing, and the results were similar to those obtained with wrk.
3. Summary
Pingora is a remarkable project, and I’m very interested in its potential improvements over Nginx. However, I would like to know:
Am I missing any configurations, or how can I improve it to enhance performance and reduce CPU and memory usage?
Is it unfair to compare Pingora with Nginx in this simple scenario? In other words, is Pingora's advantage more apparent in more complex scenarios? (If so, I will use Pingora in more complex scenarios.)
I really appreciate your support.