envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.99k stars 4.81k forks source link

why xDS control plane/servers must remember the set of resources being requested by the client even if the resource does not exist? (with new question in the comments) #31071

Closed logstashbugreporter closed 11 months ago

logstashbugreporter commented 11 months ago

Title: why xDS control plane/servers must remember the set of resources being requested by the client even if the resource being request does not exist?

Description: https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#knowing-when-a-requested-resource-does-not-exist per envoy online doc, "Management servers must remember the set of resources being requested by the client, and if one of those resources springs into existence later, the server must send an update to the client informing it of the new resource. Clients that initially see a resource that does not exist must be prepared for the resource to be created at any time."

I'm curious about the rationale behind this decision. It appears to involve a design choice that introduces extra bookkeeping on the server side, potentially making it much more complex. On the other hand, clients could potentially handle this edge case more easily by retrying. This approach seems to add complexity to the server, and I'm interested in understanding the reasoning behind it, especially considering the seemingly straightforward alternative of letting clients manage retries. I must have missed something here?

yanavlasov commented 11 months ago

Adding @adisuissa for any comments.

adisuissa commented 11 months ago

cc @envoyproxy/api-shepherds for people who may have been there when this was established.

My take on this is that xDS is essentially a pub-sub protocol, and it is up to the server to manage the subscriptions from its clients. Additionally, if the client were to retry, the logic there would be basically switching to a polling strategy.

logstashbugreporter commented 11 months ago

@adisuissa thank you for sharing your insights! The idea that "it is up to the server to manage the subscriptions" makes complete sense when a client subscribes to existing legitimate resources. However, if the server has to keep track of any arbitrary resource, it opens the door to potential DDOS attacks where a client could repeatedly subscribe to random resources.

Moreover, if a client misspells a resource name, this server behavior will mislead the client. The client won't receive the correct feedback/error message in the return value, making it challenging to correct mistakes. On the contrary, the server could simply return an error message when the client subscribes to a non-existent resource. This approach allows the client to recognize its mistake, prompting it to recheck the resource name or retry the operation later.

In other pub-sub systems, e.g., Kafka, the client will receive an error if it tries to reach non-existing topic.

markdroth commented 11 months ago

As @adisuissa said, the protocol is intended to be a pub/sub mechanism. The client controls what set of resources it is subscribed to, and the server is responsible for sending the most recent contents of all of the resources that the client is subscribed to whenever those resources change. There's a good description of this in xRFC TP2.

For example, let's say that when the client initially subscribes to resource R, the resource's value is foo, so the server immediately sends R=foo to the client. Now the resource's value is changed to bar. At this point, the server must immediately send R=bar to the client, since the client is subscribed to the resource. The client does not know when that is going to happen, and it is not required to send a new request to the server to get that update; the server is responsible for tracking the set of resources that the client is subscribed to and sending those updates as they occur.

In this model, "does not exist" is just a special value of a resource's contents. If you replace "foo" in the above example with "does not exist", the protocol works the same way.

I think this approach makes sense, because there are cases where a resource may be removed and then recreated later. I don't think it makes sense to have a different mechanism for "does not exist" than we do for any other value of a resource; this would complicate the client-side code by requiring it to do some sort of retry, which isn't required for any other resource update.

I do agree that one of the weaknesses in the xDS SotW protocol variants is that they don't have a clear way to indicate that a resource does not exist, which is why the 15-second timeout needs to be applied on the client side. But the incremental protocol variants fix this by adding an explicit mechanism to indicate that a resource does not exist. So fundamentally, the client still knows when this is happening, just as it would if the server returned an error message.

With regard to DDoS attacks, note that individual xDS servers define their own resource naming namespaces, so it should be totally possible for an xDS server to identify resource names that will never exist and to just ignore those names. There's no point in an implementation tracking a name that it knows will never exist.

In any case, I don't think this is something we could change now even if we wanted to. This is the established behavior of the xDS protocol, and there are way too many xDS server and client implementations that already exist and have the currently defined behavior, so we're not going to be able to change this.

logstashbugreporter commented 11 months ago

@markdroth Thanks for shedding light on the history. I'm still trying to understand, so I apologize for my slowness.

  1. I'm curious about the statement, "I don't think this is something we could change now even if we wanted to." Does this mean the xDS protocol won't evolve over time? Will it remain stuck in version 1 until it reaches its end of life?

  2. Regarding your comment on "this would complicate the client-side code by requiring it to do some sort of retry," I believe that client-side retry is necessary for network interruptions and other dynamics, independent of the issue being discussed here. If we forgo this feature in our home-brew xDS server (i.e., the server returns an error message and doesn't keep track of the resource for non-existing requests), would it still work seamlessly with the Envoy client for our internal service discovery? This is my top concern. If I understand correctly, Envoy will send another subscription request if the previous subscription request does not result in a route, and this occurs upon the subsequent request from upstream. I would like to double check on this, thanks! @adisuissa

  3. I'm also puzzled by this comment: "individual xDS servers define their own resource naming namespaces, so it should be totally possible for an xDS server to identify resource names that will never exist and to just ignore those names." Our service/resource names could be arbitrary strings, so I'm not sure how the xDS server can identify non-existent resource names unless in some constrained and contrived scenarios.

markdroth commented 11 months ago
  1. I'm curious about the statement, "I don't think this is something we could change now even if we wanted to." Does this mean the xDS protocol won't evolve over time? Will it remain stuck in version 1 until it reaches its end of life?

The protocol can certainly evolve over time (although this is typically done more via some lightweight negotiation than by versioning -- e.g., if the client populates a given field in the request, the server knows that the client supports the new feature and can react accordingly). But I think the thing you're asking about here is such a fundamental part of the protocol that changing it would be incredibly disruptive to the existing ecosystem, and I don't think this is something we'd be willing to do without a really compelling reason, probably something that was likely to affect the vast majority of clients/servers. The fact that there are so many existing clients and servers that adhere to the current protocol and don't have problems suggests that this change would not fall into that category.

  1. Regarding your comment on "this would complicate the client-side code by requiring it to do some sort of retry," I believe that client-side retry is necessary for network interruptions and other dynamics, independent of the issue being discussed here.

Clients retry creating the stream if the stream fails, but they don't retry individual subscription requests on a given stream. The protocol dictates that once a client has subscribed to a given resource on a given stream, the server is responsible for sending updates whenever that resource changes for the duration of that stream. There is no retry from the client in this case, so servers cannot expect clients to do that.

If we forgo this feature in our home-brew xDS server (i.e., the server returns an error message and doesn't keep track of the resource for non-existing requests), would it still work seamlessly with the Envoy client for our internal service discovery? This is my top concern. If I understand correctly, Envoy will send another subscription request if the previous subscription request does not result in a route, and this occurs upon the subsequent request from upstream. I would like to double check on this, thanks! @adisuissa

I don't know of any such behavior in Envoy, although I'm not an expert in Envoy's implementation. But I can say that if Envoy does do this, it's essentially violating the protocol spec. gRPC definitely does not do this, nor would any other xDS client be expected to do so.

  1. I'm also puzzled by this comment: "individual xDS servers define their own resource naming namespaces, so it should be totally possible for an xDS server to identify resource names that will never exist and to just ignore those names." Our service/resource names could be arbitrary strings, so I'm not sure how the xDS server can identify non-existent resource names unless in some constrained and contrived scenarios.

The xDS server defines the values of those strings, so you could impose restrictions that would help avoid the problem of memory usage by an arbitrary attacker subscribing to arbitrary resource names. For example, you could say that all Listener resource names will start with the prefix "listener/", in which case you can ignore any subscription for a Listener resource that does not have that prefix. I was just pointing it out as a potential approach you could use to help mitigate your concern about the DDoS attack vector you described.

markdroth commented 11 months ago

BTW, I should point out that there are some really trivial xDS servers that use a "pure SotW" approach, where they always send every single resource in every single update. These servers don't scale very well, because a change in a single resource requires sending a huge update to the client. But it is trivial to implement such a server, and it does not require the server to track the set of resource names that the client is subscribed to -- it just sends all of the resources it knows about to all clients.

The real requirement here is that the server guarantees to send updates to the client when the resources that the client is subscribed to changes. You can do that in a lot of ways, some of which require more granular tracking than others. Ultimately, you do need to get fairly granular if you want to scale, but if you're operating strictly at a small scale, you can probably get away without it.

logstashbugreporter commented 11 months ago

@markdroth Thanks for the reply! I completely agree with you that the "pure SotW" approach won't scale, so we're implementing a delta approach. Regarding your last comment, "Ultimately, you do need to get fairly granular if you want to scale," I'm curious about the status of the LEDS protocol (for better scaling). Has it been officially adopted or endorsed as an xDS protocol enhancement? I noticed it's partially implemented in Envoy, but is it also going to be supported in gRPC proxy-less implementations?

markdroth commented 11 months ago

I think LEDS is fully supported in Envoy. @adisuissa should be able to confirm that.

gRPC does not currently support either the incremental protocol variants nor LEDS, but I would very much like to see us support those things in the long run. I think it's really just a question of when we encounter a concrete use case that requires that functionality, which would cause us to prioritize the work.

adisuissa commented 11 months ago

LEDS is supported by Envoy (Delta-mode only, no plans to have SotW support).

logstashbugreporter commented 11 months ago

@adisuissa Just to double-confirm, is LEDS in Envoy fully functional in its current state, without requiring any further changes? I noticed a past commit by you (https://github.com/envoyproxy/envoy/pull/17869) stating, "The implementation of the main logic behind the LEDS protocol. This is not integrated into ClusterLoadAssignment processing (not plumbed in yet)." Is there another subsequent commit that added the required plumbing?

logstashbugreporter commented 11 months ago

HI, @markdroth I'm delighted to hear that you consider LEDS to be an important and worthwhile endeavor for the long run. Currently, we have a concrete use case that urgently requires this functionality. Our large service cluster, comprising over 3,000 nodes, is causing significant GC pressure on the clients due to the absence of LEDS or a patch mechanism in the existing EDS. Given the immediate impact on real-world production systems, prioritizing this work would have a high impact and would be greatly appreciated:) I'm actually surprised that this use case hasn't been brought up before!

adisuissa commented 11 months ago

@adisuissa Just to double-confirm, is LEDS in Envoy fully functional in its current state, without requiring any further changes? I noticed a past commit by you (#17869) stating, "The implementation of the main logic behind the LEDS protocol. This is not integrated into ClusterLoadAssignment processing (not plumbed in yet)." Is there another subsequent commit that added the required plumbing?

Thanks for pointing that out. It was plumbed in https://github.com/envoyproxy/envoy/pull/18195, but missed adding that PR to the list.

markdroth commented 11 months ago

@logstashbugreporter If you want that feature in gRPC, please file an issue in the appropriate gRPC repo. Since you mentioned GC, I'm guessing that you're talking about Java, so it would be the https://githb.com/grpc/grpc-java repo.

PapaCharlie commented 11 months ago

@markdroth this is excellent news! If LEDS is implemented in Envoy, should these fields be unhidden then? https://github.com/envoyproxy/envoy/blob/main/api/envoy/config/endpoint/v3/endpoint_components.proto#L138 Is there an official guide anywhere other than the linked google doc on https://github.com/envoyproxy/envoy/issues/10373? Should that issue also be closed?

markdroth commented 11 months ago

I'll let @adisuissa comment on those questions about LEDS in Envoy.

logstashbugreporter commented 11 months ago

@logstashbugreporter If you want that feature in gRPC, please file an issue in the appropriate gRPC repo. Since you mentioned GC, I'm guessing that you're talking about Java, so it would be the https://githb.com/grpc/grpc-java repo.

@markdroth Awesome, thanks for the pointer! After internal discussions, we'd like to prioritize having the LEDS feature in go-gRPC first. We already have in-house service discovery Java clients working in conjunction with Java gRPC, and we plan to continue using this setup for various reasons. I'll be filing an issue request in the https://github.com/grpc/grpc-go repo. I'm also curious if we need to file a request for gRPC RFCs as well at https://github.com/grpc/proposal/tree/master#grpc-rfcs?

markdroth commented 11 months ago

@logstashbugreporter Filing the bug in the grpc-go repo sounds fine if that's the language you want to prioritize it in. The first step will definitely be putting together a gRFC that defines how the feature would be implemented in a cross-language way, but you don't need to file a separate issue for that -- we can drive that from the issue in the grpc-go repo.

Also, just to set expectations, I didn't mean to imply that we'd be able to automatically prioritize this work when you file the feature request. Unfortunately, we have a lot of projects competing for our time, and this project will be part of that competition, just like all the others. There are a bunch of factors that go into prioritization decisions, such as how important a project is strategically (this one is valuable in that regard), how large of a project it is (this is likely to be on the large side), and how many users are interested in the feature (this is why filing the feature request is a good first step!). It may wind up being that the fastest way to make progress on this would be for you to drive the work, while we provide reviews on the gRFC and the implementation.

In any case, though, filing the feature request is still the best first step. That gives us a useful signal that this is a feature that someone wants, and it will allow others to chime in on the same issue, which could raise its priority. We can figure out the details from there.

Please tag me when you file the issue, and we'll figure out the best way to move forward. Thanks!

logstashbugreporter commented 11 months ago

FYI, @markdroth I filed a feature request and tagged you in the proposed solution section. Thank you for providing guidance and addressing the gRFC aspect of this!

I totally understand the various projects are competing for resources, and I'm curious about the current ownership structure of the gRPC project. Specifically, is it still owned by Google, and does it predominantly rely on Google engineering resources, or does it operate with contributions from volunteers in their free time?

Unfortunately, we have very limited engineering resources compared to Google's, making it challenging for us to drive the implementation at this scale. However, I share your confidence in its impact on real-world scenarios, and I believe it will significantly contribute to driving xDS/gRPC/Envoy ecosystem adoption in the industry. Without a scaling mechanism, I'm curious about how existing adopters cope with large clusters and whether this limitation acts as a constraining factor for broader adoption. I'd appreciate it if you could keep us updated on the progress. Also, thanks @adisuissa for the pointers!