envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.52k stars 4.73k forks source link

Use per-resource version to avoid hashing protos in xDS updates #14367

Open htuch opened 3 years ago

htuch commented 3 years ago

xDS APIs today use hashes of incoming resource protos to compare against the existing resource version when determining if any updates needs to be made. This is expensive (we're measuring ~8% CPU in some workloads), since we need to flatten to text for Envoy-canonical proto hashing (protobuf doesn't do its own hashing).

It's also unnecessary when delivered resources are provided via delta xDS (or even SotW when we encapsulate in a Resource object). In these cases, we have per-resource versioning that can play the same role as the hash without any computational requirement.

Opening this issue to track potential optimization work here. I think the main value is for SotW+Resource wrapper, since delta xDS will be pushing the diffing to the server-side anyway, so there isn't a lot of saving.

FWIW, I think we had a thread on this ages ago but I can't spot it.

rojkov commented 3 years ago

IIUC this was discussed in https://github.com/envoyproxy/envoy/issues/8301 and https://github.com/envoyproxy/envoy/pull/8231#discussion_r324191998.

What would be the per-resource version here? A new field for UUID in e.g. envoy.config.cluster.v3.Cluster?

As an alternative @mattklein123 mentioned a message differencer. I guess this is about Protobuf's MessageDifferencer.

htuch commented 3 years ago

We would wrap all resources in the SotW repeated google.potobuf.Any resouces with a Resource message. This would provide a per-resource version. This has been discussed elsewhere, such as in per-resource TTLs for SotW. CC @markdroth @envoyproxy/api-shepherds

markdroth commented 3 years ago

I would really like to see us move away from depending on the system version for ACK/NACK semantics, since IMHO that makes the protocol semantics incredibly hard to understand and reason about. Ideally, we would do this by migrating everyone to the incremental protocol variants and finally deprecating SotW. But if we don't think that's feasible any time soon (which would not be surprising), the next best thing would be to change the SotW protocol to use the Resource wrapper and adopt the incremental API's ACK/NACK semantics.