dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.91k stars 4.63k forks source link

Lock contention in SslStream.Encrypt/Decrypt #69042

Open davidfowl opened 2 years ago

davidfowl commented 2 years ago

Description

Continuing looking at various profiles that include HTTPS and not just looking at more than just CPU profiles since these benchmarks are IO bound. I'm reporting what I see in the profiles.

Configuration

Regression?

Nope

Data

image

Analysis

I haven't dug deeply into this yet but I'm a little confused why the handshake lock has any contention beyond the handshake (https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L146 and https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L788).

Profile attached: application.05-02-19-35-44.zip

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/ncl, @vcsjones See info in area-owners.md if you want to be subscribed.

Issue Details
### Description Continuing looking at various profiles that include HTTPS and not just looking at more than just CPU profiles since these benchmarks are IO bound. I'm reporting what I see in the profiles. ### Configuration - .NET Core 7.0.0-preview.5.22251.5+5f7f7e8 - Linux x64 ### Regression? Nope ### Data image ### Analysis I haven't dug deeply into this yet but I'm a little confused why the handshake lock has any contention beyond the handshake (https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L146 and https://github.com/dotnet/runtime/blob/6ca8c9bc0c4a5fc1082c690b6768ab3be8761b11/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L788)
Author: davidfowl
Assignees: -
Labels: `area-System.Net.Security`, `tenet-performance`, `untriaged`
Milestone: -
wfurt commented 2 years ago

It is not safe to encrypt and decrypt at the same time. The lock used to be in OpenSSL PAL but got moved and shared with handshake because of TLS 1.3 windows behavior.

wfurt commented 2 years ago

Here is the original PR https://github.com/dotnet/corefx/pull/37736

davidfowl commented 2 years ago

Hmm those end to end benchmarks are suspicious... AFAIK we don't have PlaintextNonPipelined that runs with https.

wfurt commented 2 years ago

There still may be some opportunity for improvements. For example, this also covers reading from BIO and copying data. AFAIK we only need to prevent SSL state being modified from two threads. On Windows the bug deal is Tls13. It uses the handshake code path after the handshakes is finished and concurrent writes fail with some weird error. So we combined the locks to one (end left it perhaps badly named)

antonfirsov commented 2 years ago

Triage: we should investigate this in 7.0

wfurt commented 1 year ago

https://www.openssl.org/docs/faq.html#PROG1

Is OpenSSL thread-safe?

Yes but with some limitations; for example, an SSL connection cannot be used concurrently by multiple threads. This is true for most OpenSSL objects.
For version 1.1.0 and later, there is nothing further you need do.
For earlier versions than 1.1.0, it is necessary for your application to set up the thread callback functions. To do this, your application must call CRYPTO_set_locking_callback(3) and one of the CRYPTO_THREADID_set… API’s. See the OpenSSL threads manpage for details and “note on multi-threading” in the INSTALL file in the source distribution.

The "handshake" name comes from Windows. It can actually go through the handshake logic from `Decrypt' during Tls13 negotiation and more rarely during renegotiation.

stephentoub commented 1 year ago

We should:

wfurt commented 1 year ago

Avoid locking entirely in situations where we don't need it, e.g. do we need it on Windows for TLS versions other than TLS 1.3?

I'm aware of renegotiation. That should be easy to check and it should be disabled in most cases. I will need to follow up with Schannel team to see if there are any other cases.

Would it make sense to push this down to the pal? Certainly for OpenSSL we can do that in C using their mechanisms or low level primitives just around invocation of the encrypt/decrypt calls.

wfurt commented 1 year ago

I did some measurements and we can probably get ~ 10% on single connection

tested on Server 2022

Method Job Toolchain Mean Error StdDev Median Min Max Ratio RatioSD Allocated Alloc Ratio
ConcurrentReadWrite Job-NPKZNC \test\corerun.exe 17.94 us 0.650 us 0.722 us 18.09 us 16.55 us 19.16 us 1.00 0.00 - NA
ConcurrentReadWrite Job-AYBFGA \main\corerun.exe 19.11 us 0.862 us 0.922 us 18.93 us 17.61 us 21.15 us 1.07 0.06 - NA
ConcurrentReadWriteLargeBuffer Job-NPKZNC \test\corerun.exe 21.61 us 1.012 us 1.083 us 21.88 us 19.17 us 23.80 us 1.00 0.00 - NA
ConcurrentReadWriteLargeBuffer Job-AYBFGA \main\corerun.exe 18.87 us 0.655 us 0.701 us 18.90 us 17.46 us 20.42 us 0.88 0.06 - NA
ParallelConcurrentReadWriteLargeBuffer Job-NPKZNC \test\corerun.exe 38.82 us 3.383 us 3.760 us 39.45 us 26.62 us 42.66 us 1.00 0.00 - NA
ParallelConcurrentReadWriteLargeBuffer Job-AYBFGA \main\corerun.exe 54.14 us 19.036 us 21.922 us 42.31 us 32.48 us 108.34 us 1.42 0.60 - NA

I added new test so see what the difference would be with multiple SslStreams. (one per CPU Core) But I cannot get stable numbers for some reason.