Open zyfjeff opened 4 years ago
/assign @zyfjeff
@zyfjeff this sounds reasonable to me as long as we don't break anyone's existing code. cc @htuch @lizan for comment on this as they are more familiar.
I think there's two questions here:
I think any change to (1) would be breaking, so I don't think we can do this within a major version without a deprecation dance. Also, I am hesitant to allow protobuf implementation details guide decisions here, as this is not stable.
For (2), I think we can move to a more efficient internal representation. I have no strong opinions on what this looks like, the common case should be easy to optimize (empty) and the reasonably common (simple key-value map) should also be straightforward.
This way of using Any to nest a Struct internally reduces the memory footprint and the code changes are small, but it still takes up a lot of memory space.
message MetadataV2 {
// Key is the reverse DNS filter name, e.g. com.acme.widget. The envoy.*
// namespace is reserved for Envoy's built-in filters.
map<string, google.protobuf.Any> filter_metadata = 1;
}
tutorial::MetadataV2 meta;
google::protobuf::Struct struct_obj;
auto& fields_map = *struct_obj.mutable_fields();
fields_map["test_key"] = stringValue("test_value");
google::protobuf::Struct struct_inner;
(*struct_inner.mutable_fields())["inner_key"] = stringValue("inner_value");
google::protobuf::Value val;
*val.mutable_struct_value() = struct_inner;
fields_map["test_obj"] = val;
google::protobuf::Any any;
any.PackFrom(val);
(*meta.mutable_filter_metadata())["com.test"] = any;
std::cout << meta.SpaceUsedLong() << std::endl;
tutorial::MetadataV2 meta1;
std::cout << "empty:" << meta1.SpaceUsedLong() << std::endl;
Output:
367
empty:144
I share @htuch's view that metadata can be stored in a more efficient way, and I'll follow up on that. The main work includes the following two points:
@lizan What do you think?
Now our host metadata using the Struct way, this way is so flexible, users can configure any type, but right now we're actually subset is simple as a Map
, the key is a string, the value is google::protobuf::Value
, does not understand its actual content, at present only a subset of list_as_any will understand the List type. Do we need to add a new field for endpoint metadata to avoid abuse?
@zuercher can probably chime in on how subset LB has made use of host metadata.
You're correct that the subset LB only handles values that are simple types or lists of simple types. From that load balancer's point of view the metadata could be stored more efficiently.
Endpoint metadata can be emitted in access logs, which renders structs in the metadata as JSON objects. Further, extensions may use data stored under other namespaces besides the standard "envoy.lb".
Finally, that Envoy's Metadata object is used in a number of places so there's even more potential impact.
@zuercher thanks,I just want to replace the metadata in the endpoint, and I don't intend to change the metadata used elsewhere.
@zyfjeff you should aim to preserve both the API wire format (so, still structs in endpoint
for v2) and also the output format, i.e. the same JSON format.
@htuch The wire format unchanged, it seems that the difficulty, we can add an EndpointMetadata
field, use this field to implement the subset, access log, etc., the original Metadata fields related implementation remains unchanged, set to deprecated.
@zyfjeff it's still unclear to me the motivation for the change. There are two things going on:
You can fix (2) without (1) for the common case where metadata is just a simple key:value string map. For (1), have we surveyed all uses of endpoint metadata? I know that metadata is used by us internally for many things, beyond just matching and access logs (but maybe not for endpoints). It's generally a placeholder that folks tack additional config information, including provenance of configuration. This can become deeply nested. So, I'd want to be confident that there are no legitimate uses of nested metadata on endpoints before making the EndpointMetadata
move. We can still solve (2) without this.
@htuch In order to avoid overshooting, let's solve (2) in the short term, In addition, I think we can share the same metadata and manage the lifecycle by reference counting.
Title: Host metadata take up a lot of memory
Description:
The following code is my test
Output:
Is it possible to use a Map instead of a Struct, or to use Any nested Struct instead? In addition, we can actually optimize the default values so that we don't have to build Metadata objects when we don't have any metadata information.
@mattklein123