dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License

RemoteCertificateValidationCallback not being called when authority isn't trusted #34869

Closed saltxwater closed 3 years ago

saltxwater commented 3 years ago

Describe the bug

Using Kestrel with HTTPS, I have configured SslServerAuthenticationOptions with a RemoteCertificateValidationCallback, but it isn't called (on either Windows or Linux Docker) when the client supplies a certificate signed by an untrusted root. An exception is thrown and the request never reaches my middleware.

An authentication exception is thrown from SslStream: Authentication failed, see inner exception. Inner Exception: Win32Exception: The certificate chain was issued by an authority that is not trusted.

I expected that in this case the validation callback would be called so that I could then check the client certificate manually.

The use case is a Consul Connect Native app: certificates are signed by Consul, but Consul's CA root may change over time (listeners will be in place to detect changes). Regardless, I did also try adding the certificate manually to my Windows trust store, but that didn't seem to have any effect... I must have been doing something wrong!

Do you know of any way I can get around this? If I can't have a simple callback to manually verify the client certificate, then I'd at least like to be able to add trusted root certificates locally at runtime.

To Reproduce

I don't have a repro project to hand but it should be straightforward to put together. I will update this post when it is ready

Exceptions (if any)

Authentication failed, see inner exception. Inner Exception: Win32Exception: The certificate chain was issued by an authority that is not trusted.

System.Net.Security.dll!System.Net.Security.SslStream.ForceAuthenticationAsync(System.Net.Security.AsyncReadWriteAdapter adapter, bool receiveFirst, byte[] reAuthenticationData, bool isApm) Line 393 C#
... ...
[Completed] Microsoft.AspNetCore.Server.Kestrel.Core.dll!Microsoft.AspNetCore.Server.Kestrel.Core.Internal.DuplexPipeStream.ReadAsyncInternal(System.Memory<byte> destination, System.Threading.CancellationToken cancellationToken) Line 151
... ...
[Async] Microsoft.AspNetCore.Server.Kestrel.Core.dll!Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionMiddleware.OnConnectionAsync(Microsoft.AspNetCore.Connections.ConnectionContext context) Line 162 C#

Further technical details

Runtime Environment:
  OS Name: Windows
  OS Version: 10.0.18363
  OS Platform: Windows
  RID: win10-x64
  Base Path: C:\Program Files\dotnet\sdk\5.0.301\

Host (useful for support):
  Version: 5.0.7
  Commit: 556582d964

.NET SDKs installed:
  3.1.101 [C:\Program Files\dotnet\sdk]
  5.0.301 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.All 2.1.28 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.28 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.16 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.28 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.16 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 3.1.16 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 5.0.7 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

blowdart commented 3 years ago

Without a repro or your config code, this is hard to diagnose. Alternatively

saltxwater commented 3 years ago

I understand that the lack of information makes it difficult to diagnose; hopefully this will help:

WebHost.CreateDefaultBuilder()
    .ConfigureServices(ConfigureServices)
    .Configure(Configure)
    .ConfigureKestrel(options =>
    {
        options.ConfigureHttpsDefaults(ops => // I don't think these defaults are used when UseHttps is used with a callback, but I've added them anyway just in case
        {
            ops.ClientCertificateValidation = (certificate2, chain, arg3) => true; // Not called
            ops.ClientCertificateMode = ClientCertificateMode.AllowCertificate;
            ops.OnAuthenticate = OnAuthenticate;
        });
        options.Listen(IPAddress.Any, 8080, listenOptions =>
        {
            listenOptions.Protocols = HttpProtocols.Http1AndHttp2;
            listenOptions.UseHttps(async (SslStream stream,
                SslClientHelloInfo clientHelloInfo, object? state, CancellationToken cancellationToken) =>
            {
                var ops = new SslServerAuthenticationOptions
                {
                    ClientCertificateRequired = true
                };
                ops.ServerCertificate = await MyConsulConnectManager.GetLeafCertificate(); // Server certificate loaded dynamically, provided by consul connect
                ops.RemoteCertificateValidationCallback = (sender, certificate, chain, errors) =>
                {
                    Console.WriteLine("EXPECT TO HIT HERE FOR CUSTOM VALIDATION"); // Not called
                    return true;
                };
                ops.CertificateRevocationCheckMode = X509RevocationMode.NoCheck;
                return ops;
            }, null);
        });
    })
    .Build();

I'm using JWT authentication middleware. I tried adding some debug middleware right after app.UseRouting() (before the auth middleware), but it isn't called. I think that because the handshake is failing, nothing gets through.

I'm currently looking through the source for Kestrel HttpsConnectionMiddleware but cannot see where the RemoteCertificateValidation callback is added... https://github.com/dotnet/aspnetcore/blob/8b30d862de6c9146f466061d51aa3f1414ee2337/src/Servers/Kestrel/Core/src/Middleware/HttpsConnectionMiddleware.cs#L320

SslStream seems to deal with it, so potentially the middleware is failing to add it? https://github.com/dotnet/runtime/blob/12c4e4cd63e4d8daf2ad4268876a69f8cda606a7/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.cs#L411

Will keep investigating

saltxwater commented 3 years ago

Update:

If I switch to a different "UseHttps" overload (the one taking HttpsConnectionAdapterOptions) and use

ops.ServerCertificateSelector = (context, s) => MyConsulConnectManager.GetLeafCertificate().GetAwaiter().GetResult(); // yuck
ops.ClientCertificateValidation = (certificate2, chain, arg3) =>
{
    Console.WriteLine("Want to validate here!");
    return true;
};

I can see the remote validation callback has been added to SslStream (wrapped in a callback within HttpsConnectionMiddleware), but the exception is still thrown and the callback isn't called.

So perhaps it's the particular usage of ServerOptionsSelectionCallback that doesn't get the remote validation callback added to SslStream?

However, even with the callback present on SslStream, it doesn't work:

I'm now looking at the point the exception is thrown: https://github.com/dotnet/runtime/blob/12c4e4cd63e4d8daf2ad4268876a69f8cda606a7/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs#L414

I don't think the validation callbacks are called until CompleteHandshake (line 434 above). So perhaps I need to move this issue over to dotnet runtime?

blowdart commented 3 years ago

What about if you use the built-in client certificate authentication, rather than roll your own?

saltxwater commented 3 years ago

I believe this problem occurs before any authentication takes place: it happens during connection setup. I get the impression that Windows refuses mutual TLS when the client certificate isn't signed by a trusted root.

I've got around the problem for now by manually adding any and all CA roots and intermediates to the appropriate X509Store as they get loaded from Consul. It's not a nice solution, as it means that running my service makes changes to the user's certificate store, which is something I wanted to avoid! But it does work. At runtime, I add the certificates and subsequent connections succeed (and proceed to the remote validation callback and the rest of the middleware).
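For readers who need the same workaround, here is a minimal sketch of adding a root and its intermediates to the certificate stores at runtime. The comment above does not say which stores were used, so the choice of CurrentUser\Root and CurrentUser\CertificateAuthority is an assumption, and `ConsulTrustStore` and `Trust` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper: the store names and locations below are assumptions,
// not something confirmed in this thread.
static class ConsulTrustStore
{
    public static void Trust(X509Certificate2 root, IEnumerable<X509Certificate2> intermediates)
    {
        // Trusted roots. On Windows, adding to CurrentUser\Root may trigger a
        // confirmation prompt, so this can be unsuitable for unattended services.
        using (var rootStore = new X509Store(StoreName.Root, StoreLocation.CurrentUser))
        {
            rootStore.Open(OpenFlags.ReadWrite);
            rootStore.Add(root);
        }

        // Intermediate CAs live in the CertificateAuthority store.
        using (var caStore = new X509Store(StoreName.CertificateAuthority, StoreLocation.CurrentUser))
        {
            caStore.Open(OpenFlags.ReadWrite);
            foreach (var intermediate in intermediates)
            {
                caStore.Add(intermediate);
            }
        }
    }
}
```

When the CA rotates, the old certificates stay in the store unless they are removed explicitly with `X509Store.Remove`, which is part of why this approach modifies more machine state than one would like.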

halter73 commented 3 years ago

> So perhaps it's the particular usage of ServerOptionsSelectionCallback that doesn't get the remote validation callback added to sslstream?

I noticed this when adding the new UseHttps overload that takes the SslServerAuthenticationOptions-returning callback and filed https://github.com/dotnet/runtime/issues/40402. We take the SslServerAuthenticationOptions returned by the UseHttps callback and return it directly without modification to the callback we pass to SslStream.AuthenticateAsServerAsync.

@wfurt Any ideas? I'm not sure what you mean by "negation" in:

> I did more testing and the documentation is misleading IMHO. I did testing with 3.1 and the negation fails unless you provide override. With `ClientCertificateRequired` and client not providing certificate, negations may succeed. I'll try to update docs to make that more clear.

The missing callback was fixed as part of #40110 (since that was pending and touching relevant area)

Should I transfer this issue? At the very least, there seems to be a lack of parity between passing a RemoteCertificateValidationCallback via the SslStream constructor and returning it from a ServerOptionsSelectionCallback.

halter73 commented 3 years ago

@saltxwater I tried creating a simple repro based on the description provided, but I'm not seeing the same issue. See https://github.com/halter73/Test34869/blob/main/Program.cs. This outputs the following which is what I would expect:

C:\dev\halter73\Test34869 main ≡ ❯ dotnet run
Building...
warn: Microsoft.AspNetCore.Server.Kestrel[0]
      Overriding address(es) 'https://localhost:5001, http://localhost:5000'. Binding to endpoints defined in UseKestrel() instead.
info: Microsoft.Hosting.Lifetime[0]
      Now listening on: https://localhost:5001
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Development
info: Microsoft.Hosting.Lifetime[0]
      Content root path: C:\dev\halter73\Test34869
EXPECT TO HIT HERE FOR CUSTOM VALIDATION
Response: Hello World!

Can you modify my repro to demonstrate the error? I wonder if it has something to do with the certs.

wfurt commented 3 years ago

As far as the flow goes, we tell the channel to do manual validation, so we should not get to the place you pointed at because of certificate validity, @saltxwater. I'm wondering if this could fail for some other reason, such as the ciphers or protocol version. It may be helpful to get a packet capture.

saltxwater commented 3 years ago

@halter73 Thank you for the repro attempt. I ran it as provided and agree that it works. I then swapped the hosted (and client) certificates for ones provided to me by Consul, and again it works (even after removing the Consul CA from my trusted Windows certificates).

The only way I've found to reproduce it is by having the Consul ingress gateway (Envoy) running in our cluster send the request. When Envoy tries to connect to your demo Program, I get the same Win32 failure and the callback isn't called.

I then added the Consul CA to my certificate store and the connection from Envoy succeeded. I've visually compared the HttpContext between the local HttpClient connection and the successful remote Envoy connection. Both connections seem to be pretty much the same.

But they DO differ in KeyExchangeStrength. Locally it is 384, from envoy it is 255.

I don't know how or why this would affect the server listener behavior but I guess it does suggest that envoy is behaving differently.
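The comparison above was done by inspecting HttpContext visually. As a rough sketch of how the same handshake details could be logged per request, the middleware below reads Kestrel's `ITlsHandshakeFeature`. `TlsDiagnostics` and `UseTlsHandshakeLogging` are hypothetical names, and logging to the console is only for illustration.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Connections.Features;

// Hypothetical diagnostic middleware; it only runs for connections whose
// TLS handshake has already succeeded.
static class TlsDiagnostics
{
    public static IApplicationBuilder UseTlsHandshakeLogging(this IApplicationBuilder app) =>
        app.Use(async (context, next) =>
        {
            // The feature is absent on plain HTTP connections.
            var tls = context.Features.Get<ITlsHandshakeFeature>();
            if (tls != null)
            {
                Console.WriteLine(
                    $"protocol={tls.Protocol} " +
                    $"cipher={tls.CipherAlgorithm}/{tls.CipherStrength} " +
                    $"hash={tls.HashAlgorithm}/{tls.HashStrength} " +
                    $"kex={tls.KeyExchangeAlgorithm}/{tls.KeyExchangeStrength}");
            }

            await next();
        });
}
```

Calling `app.UseTlsHandshakeLogging()` early in `Configure` would print one line per request, which makes it easier to compare two clients side by side than stepping through the debugger.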

I'll try to get envoy running locally (without consul connect) and see if I can provide a reproducible configuration.

@wfurt I just tried installing the SharpPcap package but found the examples a bit overwhelming. Do you know of an example or another package that I could simply put in front of ASP.NET Core to capture all packets as it forwards them?

wfurt commented 3 years ago

Use https://www.wireshark.org/#download and set the filter to either "port 5001" or "host client". Reproduce the error; the capture may provide some explanation. This seems to be specific to a particular client or a particular certificate.

saltxwater commented 3 years ago

@wfurt Thank you for suggesting Wireshark. I've managed to intercept the packets, and they revealed a very frustrating possibility that I had neglected to consider. Bottom line: it's not the server that's failing the TLS handshake, it's the client (Envoy)!

Right after the server sends its Server Hello, the client replies with Alert (Level: Fatal, Description: Unknown CA).

Given that I was getting success when adding the intermediate and CA to my certificate store, I was led to believe it was a hosting problem (and it still sort of is). It turns out that SslStream doesn't send the full certificate chain when the intermediate isn't in the certificate store. (I guess, how could it? The X509Certificate2 I'm using for the server only contains the leaf certificate.) I'm quite surprised that Envoy isn't accepting the server certificate, given that Envoy should have access to the intermediate and CA it loads from Consul, so I may also chase this up with HashiCorp.

Seems after all this is a duplicate issue: https://github.com/dotnet/aspnetcore/issues/10971

I don't know what the outcome of that issue was - it would be extremely useful if it could be followed up and there was a way to provide the full chain for the server certificate, without needing to modify my certificate store.

@halter73 Thank you also for your help with trying to diagnose this issue.

saltxwater commented 3 years ago

Following this I've found SslServerAuthenticationOptions.ServerCertificateContext which you can specify instead of a ServerCertificate, and it includes the option to provide a full chain... Running it once throws a cryptographic exception (access is denied) while it attempts to look in the Stores. Running it again works, and subsequent client queries (from envoy) are working, but I realise that SslStream (or some other subsystem) has gone ahead and added those certificates to the Store - something I was trying to avoid.

I've kind of given up with this and will just manage these certificates in the X509Store myself.

wfurt commented 3 years ago

There is no way for .NET to explicitly pass a certificate chain to Schannel, so ServerCertificateContext will populate the intermediate certs store if needed to make sure the certificate chain is sent. BTW, if the certificate chain is the root cause, you would see that in Wireshark as well, when the server sends ServerHello and presents its own certificate. The RFC says it should send the certificate chain minus one, i.e. without the root.
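For reference, a minimal sketch of the ServerCertificateContext option discussed above, assuming the leaf certificate (with its private key) and the intermediates have already been loaded from Consul. `ServerTls` and `BuildOptions` are hypothetical names, and the accept-everything validation callback is a placeholder only.

```csharp
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

// Hypothetical helper showing how a full chain can be attached to the
// server certificate (available from .NET 5).
static class ServerTls
{
    public static SslServerAuthenticationOptions BuildOptions(
        X509Certificate2 leaf,
        X509Certificate2Collection intermediates)
    {
        return new SslServerAuthenticationOptions
        {
            // offline: true avoids downloading missing certificates from the network
            // while the chain is being built.
            ServerCertificateContext =
                SslStreamCertificateContext.Create(leaf, intermediates, offline: true),
            ClientCertificateRequired = true,
            CertificateRevocationCheckMode = X509RevocationMode.NoCheck,
            // Placeholder: real code must validate the client certificate here.
            RemoteCertificateValidationCallback = (sender, certificate, chain, errors) => true,
        };
    }
}
```

The returned options can be handed back from the `UseHttps` callback shown earlier in the thread. As noted above, on Windows the intermediates may still end up in the certificate store, because Schannel reads the chain from there.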

ghost commented 3 years ago

This issue has been resolved and has not had any activity for 1 day. It will be closed for housekeeping purposes.

See our Issue Management Policies for more information.