envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

RFC: Enable TLS external session cache #14553

Open LuyaoZhong opened 3 years ago

LuyaoZhong commented 3 years ago

Title: Enable TLS external session cache

Description:

TLS provides several session resumption mechanisms that allow an abbreviated handshake. Envoy supports the "session ticket" and "session ID" mechanisms when the TLS version is older than 1.3. For "session ID", Envoy stores the session in memory on the server side (the internal session cache). TLS also supports an external session cache, which would be an enhancement for Envoy's TLS session management: if a TLS session is not found in internal storage, or lookups in internal storage have been disabled, the server tries the external storage if one is available. With external cache support, Envoy could be stateless and still leverage session resumption, sessions could be retained for longer, and a session could be shared between multiple Envoy proxies.
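The lookup order described above can be modelled in a few lines. This is only a conceptual sketch, not Envoy code: the `SessionCache` class, its method names, and the plain dictionary standing in for a shared external store are all hypothetical.

```python
from typing import Dict, Optional


class SessionCache:
    """Conceptual two-tier session-ID cache: internal first, then external."""

    def __init__(self, external: Optional[Dict[bytes, bytes]] = None,
                 internal_enabled: bool = True) -> None:
        self.internal: Dict[bytes, bytes] = {}
        self.external = external  # stand-in for a store shared by several proxies
        self.internal_enabled = internal_enabled

    def store(self, session_id: bytes, session: bytes) -> None:
        """Save a new session in every tier that is enabled."""
        if self.internal_enabled:
            self.internal[session_id] = session
        if self.external is not None:
            self.external[session_id] = session

    def lookup(self, session_id: bytes) -> Optional[bytes]:
        """Return the session, falling back to the external store on a miss."""
        if self.internal_enabled and session_id in self.internal:
            return self.internal[session_id]
        if self.external is not None:
            return self.external.get(session_id)
        return None


# Two proxies share one external store, so a session created on proxy_a
# can be resumed on proxy_b even though proxy_b never saw it locally.
shared: Dict[bytes, bytes] = {}
proxy_a = SessionCache(external=shared)
proxy_b = SessionCache(external=shared)
proxy_a.store(b"id-1", b"serialized-session")
resumed = proxy_b.lookup(b"id-1")
missing = proxy_b.lookup(b"id-2")
```

Here `resumed` holds the session stored by `proxy_a`, while `missing` is `None`, in which case a real server would fall back to a full handshake.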

I have an initial idea about the design and implementation. The API I have in mind is roughly as follows:

extensions.transport_sockets.tls.v3.DownstreamTlsContext
{
  "common_tls_context": "{...}",
  "require_client_certificate": "{...}",
  "session_ticket_keys": "{...}",
  "session_ticket_keys_sds_secret_config": "{...}",
  "disable_stateless_session_resumption": "...",
  "session_timeout": "{...}",
  "ocsp_staple_policy": "...",
  "external_session_cache": "" // introduce this new field extensions.transport_sockets.tls.v3.TlsExternalSessionCache
}

extensions.transport_sockets.tls.v3.TlsExternalSessionCache
{
  "session_storage_type": "redis", // redis is one of the conventional choices for external cache
  "session_storage_cluster": "redis_cluster"
}

[optional Relevant Links:]

Session Caching

mattklein123 commented 3 years ago

cc @ggreenway @PiotrSikora

I think it would be interesting to look at external session caches, but I would definitely want this to be pluggable from the start.

ggreenway commented 3 years ago

Makes sense to me. +1 on making it pluggable. One option is just a simple gRPC service, and adapters to a backing store (redis, etc) are the responsibility of the user.
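To illustrate what "pluggable" could mean here, the following is a minimal sketch in which the service logic depends only on a small backend interface, so a Redis adapter or any other store could be swapped in by the user. All names (`SessionBackend`, `InMemoryBackend`, `SessionCacheService`) are hypothetical, the gRPC transport is omitted, and timestamps are passed in explicitly to keep the example deterministic.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class SessionBackend(ABC):
    """Interface that a user-supplied adapter (Redis, etc.) would implement."""

    @abstractmethod
    def put(self, session_id: bytes, session: bytes, expires_at: float) -> None:
        ...

    @abstractmethod
    def get(self, session_id: bytes, now: float) -> Optional[bytes]:
        ...


class InMemoryBackend(SessionBackend):
    """Trivial adapter used here in place of a real backing store."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, Tuple[bytes, float]] = {}

    def put(self, session_id: bytes, session: bytes, expires_at: float) -> None:
        self._entries[session_id] = (session, expires_at)

    def get(self, session_id: bytes, now: float) -> Optional[bytes]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        session, expires_at = entry
        if now >= expires_at:
            # Expired sessions are dropped and reported as a miss.
            del self._entries[session_id]
            return None
        return session


class SessionCacheService:
    """Logic a gRPC handler might delegate to; it only knows the interface."""

    def __init__(self, backend: SessionBackend) -> None:
        self._backend = backend

    def store(self, session_id: bytes, session: bytes, now: float, ttl: float) -> None:
        self._backend.put(session_id, session, now + ttl)

    def fetch(self, session_id: bytes, now: float) -> Optional[bytes]:
        return self._backend.get(session_id, now)


service = SessionCacheService(InMemoryBackend())
service.store(b"id-1", b"session", now=0.0, ttl=300.0)
fresh = service.fetch(b"id-1", now=299.0)
expired = service.fetch(b"id-1", now=300.0)
```

The point of the split is that the proxy side only needs to agree on the service API, while expiry policy and storage details stay in the adapter.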

LuyaoZhong commented 3 years ago

@ggreenway @mattklein123 Thanks for your comments. Could you give me some guidance or references? I am new to the Envoy project, and it would be great if I could contribute this feature to the community.

Enabling an external session cache requires setting some callback functions on the SSL context, so I think the main code in the TLS transport socket extension has to be modified. So do you mean only making the external session storage as a gRPC service and it provides API to connect to concrete backing store?

mattklein123 commented 3 years ago

Sorry for the long delay in response.

So do you mean only making the external session storage as a gRPC service and it provides API to connect to concrete backing store?

Yes, exactly. I think if we develop a gRPC API that can be optionally called from the TLS code, this will allow arbitrary backend implementations. We have many examples of APIs at this point that are "side-calls" of this nature. Take a look at:

1) https://github.com/envoyproxy/envoy/blob/main/api/envoy/service/ratelimit/v3/rls.proto and how it's used in the various rate limit filters (both network and HTTP)
2) https://github.com/envoyproxy/envoy/blob/main/api/envoy/service/metrics/v3/metrics_service.proto and how it's used in the metrics service stat sink
3) https://github.com/envoyproxy/envoy/blob/main/api/envoy/service/ext_proc/v3alpha/external_processor.proto and how it's used in the new ext_proc HTTP filter which allows a gRPC API/backend to implement some of the HTTP filter semantics.

^ should hopefully be enough to get you started but feel free to ping back if you have any questions and we can help more. Also, for this feature I would recommend doing a small gdoc with a proposed design that we can discuss. Thank you!

LuyaoZhong commented 3 years ago

@mattklein123 Thank you for providing these references. I'll draft a gdoc when I get time. It might take a while since I'll be on vacation for one or two weeks; I'll post an update when I'm back. Thanks. :)

LuyaoZhong commented 3 years ago

Sorry for the long silence; I was tied up with some urgent personal work. @mattklein123 @ggreenway Do we have a template for design docs?

mattklein123 commented 3 years ago

We don't have a fixed template right now. I would just type up something in whatever format you want and we can go from there. I would just cover standard design doc stuff (problem statement, goals, non-goals, high level design, etc.)

LuyaoZhong commented 3 years ago

@mattklein123 Hi, I have drafted a minimal design doc; reviews are welcome. Design Doc

ggreenway commented 3 years ago

I think it would be good to discuss if/how this would be applicable to TLS 1.3, and how this compares to session tickets (ie when would you use a session cache instead of session tickets).

LuyaoZhong commented 3 years ago

@ggreenway @mattklein123 I added a background section to the design doc to answer your questions. Also, I would like to ask where I should start my PoC. I have figured out how to add a config API. For the gRPC service, however, I added my protobuf file, but no C++ headers are generated when I build Envoy. I am also not clear on how to write a gRPC client based on it. I have written a standalone gRPC client before, but I don't know how to make it work with Envoy. Do you have any guide docs? Thanks in advance.

If possible, I would like to add more detail to the design doc, such as interface definitions, which might help me with the coding. :)

LuyaoZhong commented 3 years ago

Hi @ggreenway @mattklein123, I'd like to start with a PoC and polish my design doc at the same time. I have almost finished the API part. Currently I am stuck at generating pb.h files from my proto file (the current design requires me to implement an Envoy built-in gRPC client). Is there any tool in Envoy to help with that?

ggreenway commented 3 years ago

proto compilation is done by the build system. For an example, look at the ratelimit filter (source/extensions/filters/common/ratelimit). In the BUILD file, it specifies a dependency on the protos in the deps via @envoy_api//envoy/service/ratelimit/v3:pkg_cc_proto. You can see that grpc service defined in api/envoy/service/ratelimit/v3/rls.proto, and the BUILD file in that directory.

LuyaoZhong commented 3 years ago

proto compilation is done by the build system. For an example, look at the ratelimit filter (source/extensions/filters/common/ratelimit). In the BUILD file, it specifies a dependency on the protos in the deps via @envoy_api//envoy/service/ratelimit/v3:pkg_cc_proto. You can see that grpc service defined in api/envoy/service/ratelimit/v3/rls.proto, and the BUILD file in that directory.

Thanks.

LuyaoZhong commented 3 years ago

@ggreenway I'm a little confused about Envoy's built-in gRPC clients, e.g. ratelimit and ext_proc. They all implement an async client, so the request and response are separated. For the TLS external session cache, though, I need the response immediately after sending the request, since I need that session to do the quick handshake. Am I supposed to implement a sync client? Do you have any suggestions?

ggreenway commented 3 years ago

For it to work properly, you must be able to treat it as async. If you try to make it sync, an envoy worker will be blocked waiting on the response, and performance will be unacceptable.
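The difference can be illustrated with a small event-loop model. This is a conceptual sketch using Python's `asyncio`, not Envoy's actual worker or TLS code: `external_lookup`, `handshake`, and the 50 ms delay are assumptions made up for illustration. While one connection waits for the external cache, the loop keeps serving another connection instead of blocking.

```python
import asyncio
from typing import List, Optional

events: List[str] = []  # records the order in which things happen


async def external_lookup(session_id: bytes) -> Optional[bytes]:
    """Stand-in for a remote cache call; the sleep models network latency."""
    await asyncio.sleep(0.05)
    return b"session" if session_id == b"known" else None


async def handshake(name: str, session_id: Optional[bytes]) -> str:
    """Resume if the external cache has the session, else do a full handshake."""
    events.append(f"{name}:start")
    session = None
    if session_id is not None:
        # This handshake is suspended here; the event loop serves other work.
        session = await external_lookup(session_id)
    result = "resumed" if session is not None else "full"
    events.append(f"{name}:{result}")
    return result


async def main() -> List[str]:
    # conn-1 needs an external lookup; conn-2 offers no session ID at all.
    return await asyncio.gather(
        handshake("conn-1", b"known"),
        handshake("conn-2", None),
    )


results = asyncio.run(main())
```

In this model `conn-2` completes its full handshake while `conn-1` is still waiting on the lookup. With a blocking lookup on the same thread, `conn-2` could not even start until the cache responded.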

l8huang commented 1 year ago

@LuyaoZhong what's the current status of this RFC? Thanks

soulxu commented 1 year ago

@l8huang I think @LuyaoZhong has already moved on to other work. If you are interested in this, feel free to take it.

zhangbo1882 commented 3 months ago

I have picked up this task and submitted an initial PR: https://github.com/envoyproxy/envoy/pull/35014