dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License

Support certificate auto-rotation in Kestrel #32351

Closed aelij closed 1 year ago

aelij commented 3 years ago

Is your feature request related to a problem? Please describe.

In Kubernetes, certificates are mounted as secret volumes, which can be configured to update automatically when the cert is rotated (e.g. from Key Vault). To achieve auto-rotation in Kestrel today, we need to hook up ServerCertificateSelector and listen to file changes (e.g. using IFileProvider.Watch).

Describe the solution you'd like

Add a HttpsConnectionAdapterOptions.ServerCertificatePath property that would watch for file changes.

blowdart commented 3 years ago

This is a feature we're considering, above and beyond Kubernetes. We're not at the design stage yet, but it's on the radar.

ghost commented 3 years ago

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to set better expectations regarding different types of issues, you can read our Triage Process.

davidfowl commented 3 years ago

@blowdart this is something I think we should support natively. We did work in .NET 5 to make Kestrel respect configuration reload but that doesn't work well for things in configuration that change without configuration itself changing.

This is going to affect YARP as well @Tratcher.

@aelij Are you using certmgr?

aelij commented 3 years ago

@davidfowl No, we're planning on using the new Secrets Store CSI Driver once it's out of preview, which natively supports auto-rotation.

ghost commented 3 years ago

Thanks for contacting us.

We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue during our next planning meeting(s). If we later determine that the issue has no community involvement, or that it's a very rare and low-impact issue, we will close it so that the team can focus on more important and high-impact issues. To learn more about what to expect next and how this issue will be handled, you can read more about our triage process here.

adityamandaleeka commented 3 years ago

@blowdart Removing this from 6. Please move it back if you think it should get done.

ghost commented 3 years ago

We've moved this issue to the Backlog milestone. This means that it is not going to be worked on for the coming release. We will reassess the backlog following the current release and consider this item at that time. To learn more about our issue management process and to set better expectations regarding different types of issues, you can read our Triage Process.

amcasey commented 1 year ago

Certs can be configured either in code or via IConfiguration (e.g. in appsettings.json). There's already a mechanism for opting into reloading when the IConfiguration changes and it would be pretty straightforward to model a change to the certificate file as a change to the configuration. Code-based configuration is a little trickier, since it doesn't presently have a notion of "change".

@aelij @craigktreasure Is IConfiguration-based configuration sufficient for your use cases? I can imagine that you might be getting a dynamic path at runtime, making this impossible.

aelij commented 1 year ago

IConfiguration should be enough, provided it can support multiple certs. Just to be sure we're on the same page, the cert's file path would not change in our case.

amcasey commented 1 year ago

@aelij Tell me more about the multiple certs? I assume you mean different certs for different endpoints?

Yes, we're talking about the cert file changing in-place without a corresponding change to the path or (e.g.) appsettings.json.

aelij commented 1 year ago

> I assume you mean different certs for different endpoints?

Yes

craigktreasure commented 1 year ago

In my case, it was a single cert and the path was specified on disk via IConfiguration through an environment variable.

amcasey commented 1 year ago

Excellent. If the IConfiguration-based approach works for everyone, we should be able to do this without introducing new API surface area for consumers to reason about. I'll follow up on that approach and report back.

Tratcher commented 1 year ago

What about detecting changes to certificates in the cert store?

davidfowl commented 1 year ago

Maybe it would also be ideal to have an explicit gesture to allow users to poll an external source and force reload?

amcasey commented 1 year ago

> What about detecting changes to certificates in the cert store?

All the requests I've seen have been for file watching, so I was prioritizing that. Is monitoring the cert store interesting?

amcasey commented 1 year ago

> Maybe it would also be ideal to have an explicit gesture to allow users to poll an external source and force reload?

Do you have a use case in mind? What sort of resource would the server be polling? If we were to go that route, how configurable would it be? Would you specify a polling interval and an error handling behavior?

davidfowl commented 1 year ago

> All the requests I've seen have been for file watching, so I was prioritizing that. Is monitoring the cert store interesting?

Avoiding platform-specific code in Kestrel would be best. If this is trivial to do (which I assume it is not 😄), maybe we can look into it.

> Do you have a use case in mind? What sort of resource would the server be polling? If we were to go that route, how configurable would it be? Would you specify a polling interval and an error handling behavior?

Key vault, database, anything that isn't disk storage. Maybe we should focus this effort solely on IConfiguration and the people can use the callback for anything else.

amcasey commented 1 year ago

From the threads I've seen, people use external tools to manage certificate rotation and those tools basically give you a path to an always-current cert - hence the focus on file watching.

davidfowl commented 1 year ago

Sounds good to me.

pinkfloydx33 commented 1 year ago

We are using letsencrypt and AppService certificates in K8s and just hit a case where my YARP gateway went down because it didn't load the new certificates that were automatically mounted in. It would be great if this were supported natively.

@aelij how exactly does it work with ServerCertificateSelector and IFileProvider? Right now we are relying on the path of the certificate and key being specified in Kestrel:Endpoints:Https:Sni:XYZ:Certificate with multiple certificates. Does that mean we'd not be able to specify it there anymore?

aelij commented 1 year ago

@pinkfloydx33 We haven't implemented it; it was just an idea. IIRC you can use the string parameter in the selector's delegate to get the request's host name (@amcasey the docs for ServerCertificateSelector could make this clearer), and from there resolve the cert. IFileProvider could be used to cache the last updated cert file as an X509Certificate2 object. You'll need some config to map the host names to the cert files (possibly even still use Sni:XYZ:Certificate).
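
The idea described above (a selector keyed by host name, backed by a cache of the most recently loaded certificate) can be sketched independently of Kestrel. What follows is a minimal Python sketch, not Kestrel code: `CertCache`, `host_to_path`, and `get` are hypothetical names, and raw file bytes stand in for an X509Certificate2 object. It assumes a stale entry can be detected by comparing the inode, size, and modification time that `os.stat` reports for the file the path currently resolves to.

```python
import os


class CertCache:
    """Hypothetical host-name -> certificate cache that reloads changed files."""

    def __init__(self, host_to_path):
        self._host_to_path = dict(host_to_path)
        self._entries = {}  # host -> (fingerprint, certificate bytes)

    @staticmethod
    def _fingerprint(path):
        # os.stat follows symlinks, so this describes the file the path
        # currently resolves to, not the link entry itself.
        st = os.stat(path)
        return (st.st_ino, st.st_size, st.st_mtime_ns)

    def get(self, host):
        """Return the current certificate bytes for host, reloading if stale."""
        path = self._host_to_path[host]
        fingerprint = self._fingerprint(path)
        cached = self._entries.get(host)
        if cached is None or cached[0] != fingerprint:
            with open(path, "rb") as f:
                cached = (fingerprint, f.read())
            self._entries[host] = cached
        return cached[1]
```

In this sketch the check runs on every lookup, which trades a `stat` call per connection for not needing a file watcher at all. A real selector would probably want to bound that cost and handle a file that is missing or half-written at the moment of the lookup.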

craigktreasure commented 1 year ago

@amcasey I'm concerned about the mention in #49979 of not supporting symlinks. That's likely a limitation of the file watcher, which is why that solution never worked for me when I tried something similar. The use case for my scenario was using certificates retrieved using the Secret Store CSI Driver in AKS. Is that being solved elsewhere? Have you tested that particular scenario?

aelij commented 1 year ago

+1. That's our primary use case as well. According to the driver's docs:

> Secrets Store CSI Driver uses atomic writer to write the secret files. This is the same writer used by Kubernetes to write secret, configmap and downward API volumes. Atomic writer relies on symlinks to update the content of the file. The secret file is bind mounted into the container and is a symlink to the actual secret file in a timestamped directory.

pinkfloydx33 commented 1 year ago

> This is the same writer used by Kubernetes to write secret,

Which would also seem to preclude mounting certificates stored in Secret objects, for example via cert-manager/letsencrypt, or AppService certificates synced from Key Vault (with external-secrets or akv2k8s).

This remains our primary use-case as well. At the moment we're using reloader to do a rolling restart of our Yarp Gateway when it detects an updated certificate secret so it's not technically blocking, but we'd really prefer to not do that.

amcasey commented 1 year ago

@craigktreasure @aelij To confirm, the configuration file will contain the path to a symlink and neither the configuration file nor the symlink will change, only the symlink target?

craigktreasure commented 1 year ago

@amcasey In my case, I set the ASPNETCORE_Kestrel__Certificates__Default__Path environment variable to the location of the certificate mounted into the container, which Kestrel picks up and uses during application startup. The rest of the details are as @aelij mentioned in terms of how the certificate is presented in the container. ~Yes, the symlink target is what gets updated.~
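
For readers unfamiliar with how that variable name relates to the Kestrel:Certificates:Default:Path configuration key: the .NET environment-variable configuration provider strips the prefix and treats a double underscore as the `:` section separator. The Python sketch below only illustrates that naming rule; `env_to_config_key` is a hypothetical helper, not part of any .NET API.

```python
def env_to_config_key(name, prefix="ASPNETCORE_"):
    """Illustrate how a prefixed environment variable name maps to a config key."""
    # Strip the host prefix if present (the match is case-insensitive here).
    if name.upper().startswith(prefix.upper()):
        name = name[len(prefix):]
    # A double underscore stands in for the ':' hierarchy separator.
    return name.replace("__", ":")


if __name__ == "__main__":
    print(env_to_config_key("ASPNETCORE_Kestrel__Certificates__Default__Path"))
```

So the variable above supplies the same value that a `Kestrel` > `Certificates` > `Default` > `Path` entry in appsettings.json would.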

aelij commented 1 year ago

@amcasey Just to make sure I understand:

symlink change = symlink points to a different file
symlink target change = file pointed to by symlink changes content

If I understand the K8s docs correctly, the former is what's happening. But I suggest testing it out to make sure.

amcasey commented 1 year ago

@aelij My k8s experience is pretty limited and I'd rather not block this feature on my ramping up, since we're getting close to the 8.0 deadline (sorry, I know I dropped this for a couple months - there was a security thing). Is there any chance I could provide you and/or @craigktreasure with private binaries you could validate in your actual scenario(s)?

I'm hoping to have a PR up with symlink support today or tomorrow.

aelij commented 1 year ago

I can try. What's the deadline?

(BTW can you verify the symlink terminology above?)

amcasey commented 1 year ago

Sorry, yes, I did forget about the terminology. Yes, that's what I had in mind (though, strictly speaking, I believe there are other FS operations that can cause the symlink FS entry to be updated besides giving it a new target). And, to be explicit, I'm not planning to drop support for changes to non-symlinks - I just wanted to make sure I understood the scenario.

The deadline's a bit hand-wavey at the moment, but I would not expect it to stretch into next week.

amcasey commented 1 year ago

Oh, I suddenly see how ambiguous my question was.

Suppose a.pfx is a symlink to b/c.pfx. Two possible ways of updating the certificate used by kestrel are (a) to change a.pfx to point to d/e.pfx and (b) to leave a.pfx unchanged and replace the contents of b/c.pfx in place.

When I said "symlink target change", I had approach (b) in mind. The code I merged on Friday will not detect that as a configuration change and trigger a reload.

However, I believe I understand from @aelij's comment that k8s (et al) is actually using approach (a), which I would expect to be handled (i.e. trigger a reload) because I believe the file system will update the mtime when that happens.

So, possibly, things are already in a good state. My change from last week should already be in nightly builds, if someone is in a position to validate with them.

Because of the deadline and the time zone differences, I'm going to try to prepare an additional change that also handles approach (b) without waiting for confirmation that we're only interested in (a) right now. However, I'd be reluctant to ship it if the previous change sufficed because it will add a lot of complexity and file watching is brittle enough as it is.
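
The difference between approaches (a) and (b) can be reproduced outside Kestrel. The Python sketch below is an illustration only and says nothing about how Kestrel's watcher is implemented: it uses the a.pfx, b/c.pfx, and d/e.pfx names from the comment above, while `link_entry`, `retarget`, `rewrite_target`, and `demo` are hypothetical helpers. It records what an observer sees when it inspects the link entry itself (without following it) after each kind of update, assuming a POSIX file system where symlinks can be created.

```python
import os
import tempfile


def link_entry(path):
    """Describe the symlink entry itself, without following it."""
    st = os.lstat(path)
    return (os.readlink(path), st.st_ino, st.st_mtime_ns)


def retarget(link, new_target):
    """Approach (a): atomically repoint the link at a different file."""
    tmp = link + ".tmp"
    os.symlink(new_target, tmp)
    os.replace(tmp, link)  # rename swaps the link entry in one step


def rewrite_target(link, data):
    """Approach (b): keep the link, overwrite the file it points to."""
    with open(os.path.realpath(link), "wb") as f:
        f.write(data)


def read(path):
    with open(path, "rb") as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as root:
        for sub, name, data in (("b", "c.pfx", b"v1"), ("d", "e.pfx", b"v3")):
            os.mkdir(os.path.join(root, sub))
            with open(os.path.join(root, sub, name), "wb") as f:
                f.write(data)
        link = os.path.join(root, "a.pfx")
        os.symlink(os.path.join("b", "c.pfx"), link)

        start = link_entry(link)
        rewrite_target(link, b"v2")                  # approach (b)
        after_b, content_b = link_entry(link), read(link)
        retarget(link, os.path.join("d", "e.pfx"))   # approach (a)
        after_a, content_a = link_entry(link), read(link)
        return start, after_b, content_b, after_a, content_a
```

In this sketch the link entry is identical before and after approach (b) even though the bytes read through it differ, whereas approach (a) replaces the link entry and so changes what `lstat` and `readlink` report. An observer that only looks at the link entry therefore needs some other signal to notice approach (b).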

pinkfloydx33 commented 1 year ago

When you mount a Secret/Certificate/ConfigMap, etc. as a volume, Kubernetes:

I just mounted a file tls.key with some content to the /keys directory of a pod. Here's the output of ls -lia and stat on each of the files/folders in question

/keys # ls -lia
total 4
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:15 .
4907185 drwxr-xr-x    1 root     root          4096 Aug 14 19:14 ..
      6 drwxr-xr-x    2 root     root            60 Aug 14 19:15 ..2023_08_14_19_15_53.1642799742
      8 lrwxrwxrwx    1 root     root            32 Aug 14 19:15 ..data -> ..2023_08_14_19_15_53.1642799742
      5 lrwxrwxrwx    1 root     root            14 Aug 14 19:14 tls.key -> ..data/tls.key
/keys # cd ..data
/keys/..data # ls -lia
total 8
      6 drwxr-xr-x    2 root     root            60 Aug 14 19:15 .
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:15 ..
      7 -rw-r--r--    1 root     root          5587 Aug 14 19:15 tls.key
/keys/..data # cd ..
/keys # cd ..2023_08_14_19_15_53.1642799742/
/keys/..2023_08_14_19_15_53.1642799742 # ls -lia
total 8
      6 drwxr-xr-x    2 root     root            60 Aug 14 19:15 .
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:15 ..
      7 -rw-r--r--    1 root     root          5587 Aug 14 19:15 tls.key

/keys # stat tls.key
  File: 'tls.key' -> '..data/tls.key'
  Size: 14              Blocks: 0          IO Block: 4096   symbolic link
Device: 100075h/1048693d        Inode: 5           Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-08-14 19:15:53.747921459 +0000
Modify: 2023-08-14 19:14:46.526843670 +0000
Change: 2023-08-14 19:14:46.526843670 +0000
/keys # stat ..data
  File: '..data' -> '..2023_08_14_19_15_53.1642799742'
  Size: 32              Blocks: 0          IO Block: 4096   symbolic link
Device: 100075h/1048693d        Inode: 8           Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-08-14 19:16:54.745026564 +0000
Modify: 2023-08-14 19:15:53.747921459 +0000
Change: 2023-08-14 19:15:53.747921459 +0000

When I update the Secret/ConfigMap, after a few moments:

Here's the updated file/directory stats:

/keys # ls -lia
total 4
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:25 .
4907185 drwxr-xr-x    1 root     root          4096 Aug 14 19:14 ..
      9 drwxr-xr-x    2 root     root            60 Aug 14 19:25 ..2023_08_14_19_25_58.3230534830
     11 lrwxrwxrwx    1 root     root            32 Aug 14 19:25 ..data -> ..2023_08_14_19_25_58.3230534830
      5 lrwxrwxrwx    1 root     root            14 Aug 14 19:14 tls.key -> ..data/tls.key
/keys # cd ..data
/keys/..data # ls -lia
total 4
      9 drwxr-xr-x    2 root     root            60 Aug 14 19:25 .
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:25 ..
     10 -rw-r--r--    1 root     root          1679 Aug 14 19:25 tls.key
/keys/..data # cd ../..2023_08_14_19_25_58.3230534830/
/keys/..2023_08_14_19_25_58.3230534830 # ls -lia
total 4
      9 drwxr-xr-x    2 root     root            60 Aug 14 19:25 .
      1 drwxrwxrwt    3 root     root           100 Aug 14 19:25 ..
     10 -rw-r--r--    1 root     root          1679 Aug 14 19:25 tls.key
/keys/..2023_08_14_19_25_58.3230534830 # cd ..
/keys # stat tls.key
  File: 'tls.key' -> '..data/tls.key'
  Size: 14              Blocks: 0          IO Block: 4096   symbolic link
Device: 100075h/1048693d        Inode: 5           Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-08-14 19:15:53.747921459 +0000
Modify: 2023-08-14 19:14:46.526843670 +0000
Change: 2023-08-14 19:14:46.526843670 +0000
/keys # stat ..data
  File: '..data' -> '..2023_08_14_19_25_58.3230534830'
  Size: 32              Blocks: 0          IO Block: 4096   symbolic link
Device: 100075h/1048693d        Inode: 11          Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-08-14 19:27:14.799341316 +0000
Modify: 2023-08-14 19:25:58.698086108 +0000
Change: 2023-08-14 19:25:58.698086108 +0000

You'll notice that the file's timestamp is still 19:14 but the one in the ..data/ folder is now 19:25.

This indirection is how Kubernetes performs atomic file mounts/updates. For what it's worth, this works with normal configuration reloads via appsettings files. The changes are detected just fine--at least if you use DOTNET_USE_POLLING_FILE_WATCHER. I'm not sure what happens without it.

Hope that helps

amcasey commented 1 year ago

That helps a lot, @pinkfloydx33, thanks! So, if I'm reading your response correctly, that's my approach (b) (with the additional wrinkle that the target of the first symlink is itself a symlink).

@aelij in your original request, which tls.key would you have pointed your HttpsConnectionAdapterOptions.ServerCertificatePath at? The one in the root, I assume? (My change from last week would probably work if you put ..data/tls.key in your config, but I assume that's bad practice.)

pinkfloydx33 commented 1 year ago

The ..data folder is an implementation detail. Nobody would ever try and reference the file via that path. You're expected to use the file at its requested mount point, which in my example above is /keys/tls.key and not /keys/..data/tls.key

amcasey commented 1 year ago

Got it. I still can't easily set up k8s, but I should be able to replicate the symlink graph and do some validation that way. Thanks again!
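
One way to replicate the symlink graph from the listing above without a cluster is sketched below in Python. It is an approximation built from the `ls` output, not a copy of the Kubernetes atomic writer: `publish` and `observe` are hypothetical helpers, the `..data_tmp` name is an assumption, and the sketch never removes the old timestamped directory.

```python
import os
import tempfile


def publish(volume, dir_name, files):
    """Write files into a fresh hidden directory, then swap ..data to it."""
    os.mkdir(os.path.join(volume, dir_name))
    for name, data in files.items():
        with open(os.path.join(volume, dir_name, name), "wb") as f:
            f.write(data)
    tmp_link = os.path.join(volume, "..data_tmp")  # assumed temporary name
    os.symlink(dir_name, tmp_link)
    os.replace(tmp_link, os.path.join(volume, "..data"))  # atomic swap
    for name in files:
        user_link = os.path.join(volume, name)
        if not os.path.lexists(user_link):
            # User-visible entry, e.g. tls.key -> ..data/tls.key
            os.symlink(os.path.join("..data", name), user_link)


def observe(volume, name):
    """Report the link entry, the fully resolved path, and the current bytes."""
    path = os.path.join(volume, name)
    st = os.lstat(path)
    with open(path, "rb") as f:
        data = f.read()
    return {
        "link_entry": (st.st_ino, st.st_mtime_ns),
        "resolved": os.path.realpath(path),
        "data": data,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as vol:
        publish(vol, "..2023_08_14_19_15_53.1642799742", {"tls.key": b"old"})
        print(observe(vol, "tls.key"))
        publish(vol, "..2023_08_14_19_25_58.3230534830", {"tls.key": b"new"})
        print(observe(vol, "tls.key"))
```

Between the two `observe` calls the `tls.key` link entry keeps its inode and modification time, matching the unchanged 19:14 timestamp in the `stat` output, while the resolved path and the bytes read through it change.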

davidfowl commented 1 year ago

Anything here should be done at the FileProvider level, not in kestrel.

cc @ericstj

amcasey commented 1 year ago

The fact that FileProvider doesn't expose link information is a bit frustrating.

Are we likely to rev FileProvider in time for 8.0? The feature has no public surface area, so we could easily move the implementation in the future.

amcasey commented 1 year ago

As a bonus, if we put the implementation in FileProvider it should also work for config files (which, AFAIK, aren't presently considered updated if the linked-to file changes).

davidfowl commented 1 year ago

I think following symlinks could be a top-level setting that applies to all resolved files on the PhysicalFileProvider. It might be possible to build a POC of this today using File.ResolveLinkTarget
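
As a rough illustration of what "following symlinks" could mean for a polling watcher, here is a minimal Python sketch. It is not PhysicalFileProvider code and does not use File.ResolveLinkTarget: `os.path.realpath` is the Python stand-in, and it resolves links in every path component, including a symlinked directory in the middle of the path. `fingerprint` and `PollingWatcher` are hypothetical names, and the sketch assumes the watched path always resolves to an existing file.

```python
import os


def fingerprint(path):
    """Identify what path currently resolves to, following every link."""
    resolved = os.path.realpath(path)
    st = os.stat(resolved)
    return (resolved, st.st_size, st.st_mtime_ns)


class PollingWatcher:
    """Hypothetical watcher that compares fingerprints between polls."""

    def __init__(self, path):
        self._path = path
        self._last = fingerprint(path)

    def poll(self):
        """Return True if the resolved file changed since the previous poll."""
        current = fingerprint(self._path)
        changed = current != self._last
        self._last = current
        return changed
```

A caller would invoke `poll()` on a timer and reload the certificate when it returns True. Including the resolved path in the fingerprint means a retargeted link is noticed even if the new file happens to have the same size and timestamp as the old one.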

ericstj commented 1 year ago

cc @dotnet/area-extensions-filesystem

> The changes are detected just fine--at least if you use DOTNET_USE_POLLING_FILE_WATCHER.

@Jozkee added this support and IIRC we need polling to handle watching symlinks since the OS APIs to detect changes don't work at all with symlinks. I thought we made DOTNET_USE_POLLING_FILE_WATCHER the default in K8s. I'm not sure what the state of this issue is - but if you're finding a bug here or have a feature request for FileProvider it might be good to restate that. We're too late to add features for 8.0 but we can fix important bugs.

davidfowl commented 1 year ago

So it probably works already 😄

amcasey commented 1 year ago

This is as far as I got today: https://github.com/dotnet/aspnetcore/pull/50074

It may be unnecessary if k8s is using polling and it may or may not handle changes to the ..data directory (for which I don't think we presently receive events). I still need to do some validation of the layout described above and, as @davidfowl pointed out, a bunch of this code might fit better in FileProvider.

@ericstj I think there were basically two requests here:

  1. Expose symlink information from, e.g., IFileInfo.
  2. Add a flag to IFileProvider.Watch to let the caller specify whether or not they'd like to watch resolved link targets as well.

I agree that neither is likely to happen in 8.0, especially given that we can implement it in kestrel (which may, itself, be redundant).

@aelij @craigktreasure I assume that, to validate private binaries, you'd want them to be built on 7.0?

aelij commented 1 year ago

I can build a 8.0 test container.

davidfowl commented 1 year ago

I'm confused as to why we're doing anything at all if this works with DOTNET_USE_POLLING_FILE_WATCHER (since that's the default in containers anyways or at least what we recommend to see file changes)?

craigktreasure commented 1 year ago

Not sure when DOTNET_USE_POLLING_FILE_WATCHER was made the default in containers, but that could have affected previous attempts I made when trying to use the file watcher. At this point, we just need to test it out. Situation has changed for me, so unfortunately I won't be able to help test it out.

aelij commented 1 year ago

I have set up an AKS cluster with Key Vault CSI driver. I can confirm that the solution isn't working using binaries from the official build for https://github.com/dotnet/installer/commit/1c0b692a4e0ae5e7ccf6495dc09b1a10f159a69f (verified the binaries contained the CertificatePathWatcher class).

I tested using openssl (openssl s_client -connect localhost:5001). I can see the PEM file is updated (cat /certs/cert1.crt) but Kestrel still returns the old cert after rotation.

Kestrel configuration used:

{
  "Kestrel": {
    "Endpoints": {
      "Https": {
        "Url": "https://*:5001",
        "Certificate": {
          "Path": "/certs/cert1.crt",
          "KeyPath": "/certs/cert1.key"
        }
      }
    }
  }
}

Trace logs (after rotation was verified):

dbug: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[4]
      Created directory watcher for '/certs'.
dbug: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[5]
      Created file watcher for '/certs/cert1.crt'.
trce: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[12]
      Added observer to file watcher for '/certs/cert1.crt'.
trce: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[14]
      File '/certs/cert1.crt' now has 1 observers.
trce: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[15]
      Directory '/certs' now has watchers on 1 files.
info: Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionMiddleware[9]
      Certificate with thumbprint 893F2F7620DC51DE9A50CD34335C9206B48A85F6 lacks the subjectAlternativeName (SAN) extension and may not be accepted by browsers.
info: Microsoft.Hosting.Lifetime[14]
      Now listening on: https://[::]:5001
info: Microsoft.Hosting.Lifetime[0]
      Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
      Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0]
      Content root path: /app

aelij commented 1 year ago

I created a repo with an automated deployment script if someone else wants to try it out: https://github.com/aelij/kestrel-aks-kv-rotation

craigktreasure commented 1 year ago

Found this, which at that time indicated that DOTNET_USE_POLLING_FILE_WATCHER would not be set by default. @aelij, can you try again setting that variable explicitly?

aelij commented 1 year ago

Interesting: it's still not working with DOTNET_USE_POLLING_FILE_WATCHER; however, there's a new log line that hints it could be easy to fix:

trce: Microsoft.AspNetCore.Server.Kestrel.Core.Internal.CertificatePathWatcher[17]
      Ignored redundant event for '/certs/cert1.crt'.