envoyproxy / gateway

Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
https://gateway.envoyproxy.io
Apache License 2.0

feature: supports OCSP stapling for client tls connections #3826

Open zhaohuabing opened 3 months ago

zhaohuabing commented 3 months ago

Description:

Support OCSP stapling for client traffic.

Online Certificate Status Protocol (OCSP) is a standard for checking the revocation status of X.509 digital certificates. With OCSP stapling, the server side takes on the responsibility of fetching the OCSP response from the CA at regular intervals. The server then “staples” this response to the TLS handshake, sending it to the client along with the certificate. This means the client can retrieve the status of the certificate from the staple, without needing to contact the CA directly.

OCSP stapling can be incorporated into the EG ClientTrafficPolicy. Rotation of the OCSP staple should be offloaded to a third party (a user-installed application).

[optional Relevant Links:]

Envoy proxy docs on OCSP stapling: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/security/ssl

Original discussion from Slack: https://envoyproxy.slack.com/archives/C03E6NHLESV/p1720644444477899

travisghansen commented 3 months ago

https://github.com/cert-manager/cert-manager/issues/5785

zhaohuabing commented 3 months ago

cert-manager/cert-manager#5785

Nice. cert-manager can be used to get the OCSP staple and store it in a secret. EG can then refer to this secret directly.

travisghansen commented 3 months ago

cert-manager itself doesn’t do this directly; that is what that ticket discusses.

guydc commented 3 months ago

IMHO, this is too much of an "online" thing to delegate to a third party controller to do asynchronously. In a perfect scenario, envoy can handle this directly, as some other LBs do:

The design for OCSP stapling did not rule out future support for this in Envoy: https://docs.google.com/document/d/14Ji0Vq7Xbe9LXM6IsWQo8mgEnOB8Bo6TH75sSmFbCEE/edit

Envoy will not directly make OCSP requests to CAs. The implementation proposed here must not preclude a future enhancement that allows Envoy to transmit OCSP requests to a CA, cache the response locally, and use it for OCSP stapling.

I think that if EG could manage the OCSP staple lifecycle end-to-end, it would provide great value for users and close a gap in the current Envoy implementation. Of course, as a first phase, we can also support an option to use user-provided staples from a k8s secret, as requested here.

travisghansen commented 3 months ago

I like the above comment. It would be pretty easy to detect the presence of the ocsp key in the relevant secret and then trickle down logic from there. For what it’s worth, we use the nginx implementation now and have found some basic issues with it:

EDIT: we use the ingress-nginx implementation, which is not the native module; it’s a Lua-based plugin built into the controller. I suspect they went that route due to limitations in the linked module.

zhaohuabing commented 3 months ago

@guydc @travisghansen Thanks for your thoughtful input. Based on the discussion, we can begin by supporting the loading of OCSP staples from a Kubernetes secret.
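
A rough sketch of what that first phase could look like when reading a Secret: check for an optional staple entry next to the usual `tls.crt` and `tls.key`. The key name `tls.ocsp-staple`, the `tlsMaterial` type, and the `fromSecretData` helper are assumptions made up for this illustration, not an agreed API. The `map[string][]byte` argument mirrors the shape of a Kubernetes Secret's `data` field so the example runs without any Kubernetes dependency.

```go
package main

import (
	"errors"
	"fmt"
)

// ocspStapleKey is a hypothetical Secret data key, chosen for illustration.
const ocspStapleKey = "tls.ocsp-staple"

// tlsMaterial holds what a proxy needs for one server certificate.
type tlsMaterial struct {
	CertChain  []byte
	PrivateKey []byte
	OCSPStaple []byte // nil when the Secret carries no staple
}

// fromSecretData extracts the certificate, key, and optional OCSP staple.
// A missing or empty staple is not an error: stapling is simply skipped.
func fromSecretData(data map[string][]byte) (tlsMaterial, error) {
	crt, key := data["tls.crt"], data["tls.key"]
	if len(crt) == 0 || len(key) == 0 {
		return tlsMaterial{}, errors.New("secret is missing tls.crt or tls.key")
	}
	m := tlsMaterial{CertChain: crt, PrivateKey: key}
	if staple, ok := data[ocspStapleKey]; ok && len(staple) > 0 {
		m.OCSPStaple = staple
	}
	return m, nil
}

func main() {
	// Placeholder values stand in for real PEM and DER content.
	data := map[string][]byte{
		"tls.crt":     []byte("cert-pem"),
		"tls.key":     []byte("key-pem"),
		ocspStapleKey: []byte("ocsp-der"),
	}
	m, err := fromSecretData(data)
	fmt.Println("staple present:", m.OCSPStaple != nil, "error:", err)
}
```

Treating the staple as optional keeps existing Secrets working unchanged, which matches the idea of detecting the key and deriving the rest of the behavior from its presence.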

zhaohuabing commented 3 months ago

I like the above comment. It would be pretty easy to detect the presence of the ocsp key in the relevant secret and then trickle down logic from there. For what it’s worth, we use the nginx implementation now and have found some basic issues with it:

  • the fetching is async and only happens ‘on-demand’, which means there will always be some requests served without stapled OCSP data
  • the refresh interval is hard-coded to 7 days, I believe, which means it doesn’t respect the validity period set in the response (to be fair, 7 days is the most common, but it isn’t required, and it leaves those who want a shorter validity period in an awkward position)
  • there is no shared cache among instances because the cache sits in memory, so you end up with inconsistent responses and extra load on OCSP responders (as each instance must make its own request)
  • the logic requires an OCSP responder, which technically isn’t required to produce an OCSP response (this may seem a little crazy, but consider a self-signed issuer created by cert-manager…there is no responder URL, but OCSP data could be provided nonetheless)

EDIT: we use the ingress-nginx implementation, which is not the native module; it’s a Lua-based plugin built into the controller. I suspect they went that route due to limitations in the linked module.

Those limitations should not be a problem for EG, as it reconciles secret changes in real time.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days.