Closed agerchev closed 1 year ago
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.
Author: | agerchev |
---|---|
Assignees: | - |
Labels: | `area-System.Net.Http`, `tenet-performance`, `untriaged` |
Milestone: | - |
The threadpool count seems suspicious. Can you share the source code for the performance tool?
I have dumped the call stacks from the test with https and .net 7
Here is the tool.
class Program
{
private static ConcurrentDictionary<HttpStatusCode, MetricdData> _metricData = new ConcurrentDictionary<HttpStatusCode, MetricdData>();
/// <summary>
///
/// </summary>
/// <param name="numberOfUsers">Брой на потребителите</param>
/// <param name="numberOfIterations">Брой на итерациите, които всеки потребител прави</param>
/// <param name="url">Адрес</param>
/// <param name="defaultRequestVersion">версия на Http протокола по подразбиране - 1.0/1.1/2.0/3.0/</param>
/// <param name="header">Хедър, който се добавя при всяка заявка. Могат да се добавят няколко.</param>
static void Main(int numberOfUsers = 1, int numberOfIterations = 1, string url = null, string defaultRequestVersion = null, string[] header = null)
{
try
{
Program p = new Program();
p.RunPerformanceTest(numberOfUsers, numberOfIterations, url, string.IsNullOrEmpty(defaultRequestVersion) ? null : Version.Parse(defaultRequestVersion), header).Wait();
}
catch (Exception ex)
{
var exCurrent = ex;
do
{
Console.WriteLine(exCurrent.Message);
}
while ((exCurrent = exCurrent.InnerException) != null);
}
}
public async Task RunPerformanceTest(int numberOfUsers, int numberOfIterations, string url, Version defaultRequestVersion, string[] header)
{
List<Task> pendingWork = new List<Task>();
SocketsHttpHandler handler = new SocketsHttpHandler();
handler.PooledConnectionLifetime = TimeSpan.FromMinutes(30);
handler.PooledConnectionIdleTimeout = TimeSpan.FromMinutes(30);
handler.EnableMultipleHttp2Connections = true;
handler.MaxConnectionsPerServer = 1024;
handler.SslOptions.RemoteCertificateValidationCallback = (sender, certificate, chain, sslPolicyErrors) => { return true; };
HttpClient httpClient = new HttpClient(handler);
if (defaultRequestVersion != null)
httpClient.DefaultRequestVersion = defaultRequestVersion;
NameValueHeaderValue[] headers = new NameValueHeaderValue[header?.Length ?? 0];
for (int i = 0; i < header?.Length; i++)
{
headers[i] = NameValueHeaderValue.Parse(header[i]);
}
Stopwatch stopwatch = Stopwatch.StartNew();
for (int users = 0; users < numberOfUsers; users++)
{
pendingWork.Add(Task.Run(async () =>
{
var buffer = new byte[32 * 1024];
for (var iter = 0; iter < numberOfIterations; iter++)
{
var stopwatchLocal = Stopwatch.StartNew();
using (var httprequest = new HttpRequestMessage(HttpMethod.Get, new Uri(url)) { Version = httpClient.DefaultRequestVersion, VersionPolicy = httpClient.DefaultVersionPolicy })
{
foreach (var header in headers)
httprequest.Headers.Add(header.Name, header.Value);
using (var response = await httpClient.SendAsync(httprequest, HttpCompletionOption.ResponseHeadersRead))
{
using (var stream = await response.Content.ReadAsStreamAsync())
{
while ((await stream.ReadAsync(buffer, 0, buffer.Length)) > 0) ;
}
var metricData = _metricData.GetOrAdd(response.StatusCode, (statusCode) => { return new MetricdData(); });
Interlocked.Increment(ref metricData.NumberOfRequests);
Interlocked.Add(ref metricData.AvgLatency, stopwatchLocal.ElapsedMilliseconds);
}
}
}
}));
}
var monitoringTask = StartMonitoring();
await Task.WhenAll(pendingWork.ToArray());
stopwatch.Stop();
Console.WriteLine("{0} iterations Completed in {1}. {2}r/s", numberOfIterations * numberOfUsers, stopwatch.Elapsed, (numberOfIterations * numberOfUsers) / stopwatch.Elapsed.TotalSeconds);
}
public async Task StartMonitoring()
{
StringBuilder report = new StringBuilder();
while (true)
{
try
{
report.Clear();
foreach (var metric in _metricData.OrderBy((kvp) => kvp.Key))
{
var numberOfRequests = Interlocked.Exchange(ref metric.Value.NumberOfRequests, 0);
var currentAvgLatency = Interlocked.Exchange(ref metric.Value.AvgLatency, 0);
var tmp = numberOfRequests != 0 ? (double)currentAvgLatency / (double)numberOfRequests : 0;
report.Append($"{(int)metric.Key} - {numberOfRequests.ToString().PadLeft(6, ' ')} r/s, {tmp.ToString("F01").PadLeft(7, ' ')} ms | ");
}
report.AppendLine();
Console.Write(report.ToString());
}
catch(Exception ex)
{
Console.WriteLine(ex.Message);
}
await Task.Delay(1000);
}
}
}
public class MetricdData
{
public MetricdData()
{
NumberOfRequests = 0;
AvgLatency = 0;
}
public int NumberOfRequests;
public long AvgLatency;
}
can you re-run the test with DOTNET_SYSTEM_NET_SECURITY_DISABLETLSRESUME=1
?
And yes, for steady load you would see most time in encrypt/decrypt e.g. the read & write functions.
cc: @kouvel for thread pool insight.
cc: @kouvel for thread pool insight.
As far as I know not much has changed in the thread pool in .NET 7 on Linux that could affect perf in this sort of way.
@wfurt DOTNET_SYSTEM_NET_SECURITY_DISABLETLSRESUME=1 does not change anything.
I made the test on another machine, with RHEL 9 and everything is fine. The threadPool stabilizes at 17-20 threads and runs steady as on the .net 6 version. It uses around 20 percent less CPU than the .net 6 version (almost the same as http version of the test) which is great.
As i pointed previously the machine, where i am observing the problem is CentOs 7 with OpenSSL 1.0.2k-fips 26 Jan 2017
The new one is RHEL 9 with OpenSSL 3.0.1 14 Dec 2021 (Library: OpenSSL 3.0.1 14 Dec 2021)
I agree that encrypt/decrypt are time consuming ops, but they do not saturate the CPU which is strange. The threads count increases slowly, most of the stacks sit in these methods which i suppose call almost directly openSSL and the CPU is not saturated. And this changes with .net 7 on CentOS 7 OpenSSL 1.0.2k-fips 26 Jan 2017 It looks like some sort of contention. Also, most of the thread stacks on the RHEL 9 machine are not in encrypt/decrypt.
TLS resume was the biggest Linux change in SslStream
so DOTNET_SYSTEM_NET_SECURITY_DISABLETLSRESUME
was worth of shot. OpenSSL 1.0 is long out of support https://www.openssl.org/policies/releasestrat.html but I don't really see any changes in SslStream or PAL that would be impacted by the version. The Encrypt/Decrypt functions are pretty simple.
You can probably get debug info packages for OpenSSL and lldb would show the internals. cc: @tmds for more thoughts.
Tagging subscribers to this area: @dotnet/ncl, @vcsjones See info in area-owners.md if you want to be subscribed.
Author: | agerchev |
---|---|
Assignees: | - |
Labels: | `area-System.Net.Security`, `tenet-performance`, `needs-further-triage` |
Milestone: | - |
the code fragment does not build because of missing dependencies and it also seems to depend on external service. Can you craft something that can be just standalone repro @agerchev? Or check if you can demonstrate it with https://github.com/dotnet/performance tests?
@wfurt yes, i use System.CommandLine.DragonFruit, i forgot to mention, sorry.
I have made some tests, unfortunately i cannot simulate the whole setup with the external service in a standalone env.
Our setup is this: client (httpperformanceTool) -> haproxy (rhel9 tcp mode load balancer) -> 3 instances of nginx ingres controller (OpenShift) Everything is deployed on virtual machines (vmware) on premise. If I move the lb (haproxy) in another network or if i use only one ingress controller the problem does not appear. May be connected with our network environment somehow.
I have made a couple of screenshots on gdb and i hope this can be of a help. There are lots of threads with this call stack.
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
(gdb) bt
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f96b52b3e1b in _L_lock_812 () from /lib64/libpthread.so.0
#2 0x00007f96b52b3ce8 in __GI___pthread_mutex_lock (mutex=0x7f5514019918) at ../nptl/pthread_mutex_lock.c:79
#3 0x00007f55118e5713 in int_thread_get (create=0) at err.c:452
#4 0x00007f55118e5c1d in int_thread_get_item (d=0x7f5461ff7460) at err.c:495
#5 0x00007f55118e65b0 in ERR_get_state () at err.c:1026
#6 0x00007f55118e67f5 in ERR_clear_error () at err.c:743
#7 0x00007f553006eb40 in CryptoNative_SslWrite () from /home/atanas/httpPerformanceTool7/libSystem.Security.Cryptography.Native.OpenSsl.so
As you can see the thread count is over 280.
I don't think we would be able to for it out without good repro. Furthermore, CentOS 7 will no longer be supported. OpenSSL 1.0 is way too old. On the other hand 3.0 has it's own new issues: https://github.com/openssl/openssl/issues/20286
To make tis actionable I think we would need repro or perf test with Benchmark.NET on os supported by 8.0. I don't think we have resources to dig into otherwise.
Hello,
I have a tool that i use to stress our web applications over http/https. It uses httpClient with SocketsHttpHandler. When I migrated it to .net 7, I saw some performance degradation in https connections.
Description
The stress tests over http are almost the same. The .net 7 version uses fewer CPU which is great :)
But the tests over https are very different. As you can see, in the .net 6 version i can achieve 28102 r/s avg, but with the newer version i can barely achieve 17028 r/s avg. The two versions (.net 6, .net 7) starts with around 220 threads (ThreadPool Thread Count) then drops. The .net 6 version drops to around 17 and saturates more CPU till the end. The new version incresses the threads regularly after the initial drop and in the end of the test it uses 169 threads with litle CPU saturation.
Configuration
All tests are done from the same virtual machine: CentOS 7 Linux It has 12 CPUs
Regression?
This is a regression from version 6
Data
Analysis
I have dumped the managed stacks (dotnet-stack) for .net 7 and most of the stacks end here:
System.Net.Security!Interop+Ssl.SslWrite(class Microsoft.Win32.SafeHandles.SafeSslHandle,unsigned int8&,int32,value class SslErrorCode&)
System.Net.Security!Interop+Ssl.SslRead(class Microsoft.Win32.SafeHandles.SafeSslHandle,unsigned int8&,int32,value class SslErrorCode&)
System.Net.Security!Interop+Ssl.BioWrite(class Microsoft.Win32.SafeHandles.SafeBioHandle,unsigned int8&,int32)
System.Net.Security!Interop+Crypto.BioRead(class Microsoft.Win32.SafeHandles.SafeBioHandle,unsigned int8[],int32)