tokio-rs / prost

PROST! a Protocol Buffers implementation for the Rust Language
Apache License 2.0

Adding Reflection and DynamicMessage Support #301

Open sphw opened 4 years ago

sphw commented 4 years ago

I'm hoping we can get the ball rolling on reflection support for Prost. Our use case is a document store using Protocol Buffers as the storage / communication format. We would also like to create a Rust version of https://github.com/google/cel-spec.

Our goal is to set up reflection in an idiomatic Rust way. There was some discussion about this in https://github.com/danburkert/prost/issues/235. The C++ API is super unwieldy and has accessors for every type. Its goal is to reduce runtime overhead when accessing individual fields: if you have the field descriptor, you can almost instantly get a reference to its value. Rust provides us some easy solutions to these problems. My ideal API would be something like the one below:

trait Reflection {
    fn entry<'a>(&'a self, label: &str) -> Option<Entry<'a>>;
    fn fields(&self) -> &[Field];
}

enum Value<'a> {
    Int32(&'a i32),
    // With the rest of the protobuf types
    Message(Box<&'a Message>),
    Repeated(&'a [Message])
}

struct Field {
    label: String,
    number: usize
}

struct Entry<'a> {
    value: Value<'a>,
    field: &'a Field
}

I like the general shape of this API since it mostly returns references to the actual data. We have to instantiate a new Value on every access, but I think the overhead of that is low enough to be worth it. This could be extended to support mutable versions as well. I am not sure about the naming I have used here. I used entry because it's reminiscent of the HashMap API, but it could also get confusing. I am using the trait Reflection here as more of a way to organize things than anything else. These methods could be added directly to Message, or Reflection could be kept.
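To make the proposed shape concrete, here is a minimal sketch of what a hand-written implementation might look like for a two-field message. The `Person` struct, the `Str` variant, and the `PERSON_FIELDS` table are assumptions made for illustration only; they are not part of Prost, and in practice generated code would presumably supply them.

```rust
/// Static metadata about one field (mirrors the `Field` struct proposed above).
#[derive(Debug, PartialEq)]
pub struct Field {
    pub label: &'static str,
    pub number: usize,
}

/// Borrowed view of a field's value; only two variants are shown.
#[derive(Debug, PartialEq)]
pub enum Value<'a> {
    Int32(&'a i32),
    Str(&'a str), // hypothetical variant for string fields
}

#[derive(Debug, PartialEq)]
pub struct Entry<'a> {
    pub value: Value<'a>,
    pub field: &'a Field,
}

pub trait Reflection {
    fn entry<'a>(&'a self, label: &str) -> Option<Entry<'a>>;
    fn fields(&self) -> &[Field];
}

/// Hypothetical stand-in for a generated message type.
pub struct Person {
    pub id: i32,
    pub name: String,
}

/// Field table that generated code would presumably emit.
static PERSON_FIELDS: [Field; 2] = [
    Field { label: "id", number: 1 },
    Field { label: "name", number: 2 },
];

impl Reflection for Person {
    fn entry<'a>(&'a self, label: &str) -> Option<Entry<'a>> {
        // Look up the metadata first, then borrow the matching struct field.
        let field = self.fields().iter().find(|f| f.label == label)?;
        let value = match field.number {
            1 => Value::Int32(&self.id),
            2 => Value::Str(&self.name),
            _ => return None,
        };
        Some(Entry { value, field })
    }

    fn fields(&self) -> &[Field] {
        &PERSON_FIELDS
    }
}
```

Calling `person.entry("id")` builds a fresh `Entry` that only borrows from `person`, which is the per-access cost discussed above.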

In my mind DynamicMessage's API should be fairly straightforward. It could take a FileDescriptorProto and a buffer as arguments and then dynamically parse the message. Its implementation may require refactoring parts of prost-build into a more general library that can be used by both codegen and DynamicMessage.
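As a rough sketch of the first step such a dynamic parse would need, the code below splits a buffer into (field number, raw value) pairs using only the protobuf wire-format rules. It does not consult a descriptor; a real DynamicMessage would then use the FileDescriptorProto to decide how to interpret each raw value. The names `WireValue` and `decode_raw` are hypothetical, and group wire types are deliberately left unhandled.

```rust
use std::convert::{TryFrom, TryInto};

/// Raw value as it appears on the wire, before any descriptor is applied.
#[derive(Debug, PartialEq)]
pub enum WireValue<'a> {
    Varint(u64),
    Fixed64(u64),
    LengthDelimited(&'a [u8]),
    Fixed32(u32),
}

/// Reads a base-128 varint starting at `*pos`, advancing `*pos` past it.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut result = 0u64;
    for shift in (0..64).step_by(7) {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
    }
    None // longer than 10 bytes: malformed
}

/// Borrows `len` bytes starting at `*pos`, advancing `*pos` past them.
fn take<'a>(buf: &'a [u8], pos: &mut usize, len: usize) -> Option<&'a [u8]> {
    let end = (*pos).checked_add(len)?;
    let slice = buf.get(*pos..end)?;
    *pos = end;
    Some(slice)
}

/// Splits a message buffer into (field number, raw value) pairs.
/// Returns `None` on truncated or unsupported input.
pub fn decode_raw(buf: &[u8]) -> Option<Vec<(u32, WireValue<'_>)>> {
    let mut pos = 0;
    let mut fields = Vec::new();
    while pos < buf.len() {
        // Each field starts with a key: (field_number << 3) | wire_type.
        let key = read_varint(buf, &mut pos)?;
        let number = u32::try_from(key >> 3).ok()?;
        let value = match key & 0x7 {
            0 => WireValue::Varint(read_varint(buf, &mut pos)?),
            1 => {
                let bytes = take(buf, &mut pos, 8)?;
                WireValue::Fixed64(u64::from_le_bytes(bytes.try_into().ok()?))
            }
            2 => {
                let len = usize::try_from(read_varint(buf, &mut pos)?).ok()?;
                WireValue::LengthDelimited(take(buf, &mut pos, len)?)
            }
            5 => {
                let bytes = take(buf, &mut pos, 4)?;
                WireValue::Fixed32(u32::from_le_bytes(bytes.try_into().ok()?))
            }
            _ => return None, // groups (3, 4) and unknown wire types
        };
        fields.push((number, value));
    }
    Some(fields)
}
```

Length-delimited values are returned as borrowed slices, so nested messages and strings are not copied until something actually asks for them.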

I would love feedback on, or concerns about, the general API I have proposed. If anyone has other designs I would love to see them. After we have settled on a design, either I or another developer from my company would be happy to implement it.

ajguerrer commented 4 years ago

@sphw Perhaps the newer Go implementation might be of more use than the C++ API. The whole repository represents the culmination of Google's struggle to find a good API and has a bunch of helpful documentation to study. Each concrete Message or Enum can call Descriptor, which returns a view into the associated MessageDescriptor or EnumDescriptor respectively.

Note, Descriptor returns protobuf type information. There is also a similar Type API which returns protobuf and Go type information. It may or may not be desirable to do something similar here.

Finally, let's not forget about extensions, which are in my opinion one of the biggest reasons to support reflection. They unlock annotations for transcoding.

andrewhickman commented 2 years ago

I've started working on a crate for supporting reflection here https://crates.io/crates/prost-reflect

It supports taking a prost_types::FileDescriptorSet, inspecting the message types and encoding/decoding them as both bytes and JSON.

Currently it uses the undocumented prost::encoding::* APIs for encoding/decoding. Is there any plan to stabilize those?

sphw commented 2 years ago

Hey all. I think our crate will be of interest to those of you in this thread. Soon after I posted this issue, I began working on a reflection library internally. We have just open-sourced the 2nd iteration of that library called looking-glass. Right now the docs are a touch anemic, but they will be improved soon. looking-glass itself can be used to add reflection to most Rust structs. looking-glass-protobuf supports two DynamicMessage style types: one owned and one borrowed.

Check it out if you want to use an API similar to the one in this issue: https://github.com/m10io/looking-glass

Cliftonz commented 1 year ago

@sphw Any updates on this?

sphw commented 1 year ago

> @sphw Any updates on this?

No real updates. Looking glass, which I linked above, is still being maintained, but we haven't found any bugs or needed any extra features.

Looking glass was my 2nd go at writing a reflection library for protobufs and Rust. Overall, it has made me think that it is difficult to create a single reflection library that fits everyone's needs. One major difficulty is picking and choosing between type systems. Protobufs and Rust have differing type systems that, while having significant overlap, aren't exactly compatible. So you may wish to define your dynamic value to map directly to protobuf types, or you may wish to map it directly to Rust types. If you choose Rust types you will have a more ergonomic library, but will lose potential type information that may be useful to users. Another issue is zero-copy and lazy decoding. Often with reflection you might only care about a single field of a message, so it can be useful to only decode the fields the user needs. This again has trade-offs though, as it adds complexity to the call sites of the getter and to the decode functions. Looking Glass side-steps this by providing 2 structs, MessageView and DynamicMessage.
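The type-system trade-off can be illustrated with protobuf's signed 32-bit kinds. The sketch below is not taken from looking-glass or prost-reflect; `ProtoValue`, `RustValue`, and `varint_payload` are hypothetical names. It shows that int32, sint32, and sfixed32 all collapse to the same Rust `i32`, even though each is encoded differently on the wire, so a value mapped to Rust types alone can no longer be re-encoded without consulting the schema.

```rust
/// Dynamic value that keeps the protobuf scalar kind.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProtoValue {
    Int32(i32),
    SInt32(i32),
    SFixed32(i32),
}

/// Dynamic value that maps straight to Rust types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RustValue {
    I32(i32),
}

impl From<ProtoValue> for RustValue {
    fn from(v: ProtoValue) -> Self {
        // All three kinds become the same Rust type; the kind is discarded.
        match v {
            ProtoValue::Int32(n) | ProtoValue::SInt32(n) | ProtoValue::SFixed32(n) => {
                RustValue::I32(n)
            }
        }
    }
}

/// Integer that each kind would write as a varint, or `None` for kinds
/// that are not varint-encoded.
pub fn varint_payload(v: ProtoValue) -> Option<u64> {
    match v {
        // int32 is sign-extended to 64 bits before varint encoding.
        ProtoValue::Int32(n) => Some(n as i64 as u64),
        // sint32 is ZigZag-encoded so small negatives stay small.
        ProtoValue::SInt32(n) => Some(((n << 1) ^ (n >> 31)) as u32 as u64),
        // sfixed32 is always four little-endian bytes, not a varint.
        ProtoValue::SFixed32(_) => None,
    }
}
```

With `ProtoValue`, the value -1 stays distinguishable as `Int32(-1)` or `SInt32(-1)`; after converting to `RustValue` both are `I32(-1)`, which is more ergonomic but drops exactly the information a re-encoder would need.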

That's not to say that there isn't some sweet spot of reflection library that Prost can implement. But it is something that will need a good deal of iteration and work. If you need a reflection library ASAP take a look at looking-glass. prost-reflect also looks good, though it lacks some of the zero-copy features that I discussed above. I'd also recommend to anyone designing a Rust reflection library to take a look at rebound. Its design heavily influenced looking-glass, and it has interesting ideas on what a Rust reflection library can look like.

banool commented 1 year ago

Perhaps bevy reflect could be used here: https://docs.rs/bevy/latest/bevy/reflect/index.html. I haven't used it much myself but people who know more than I say it's a very powerful library.

Cliftonz commented 1 year ago

The main reason I want this done is so that Tonic can also generate an HTTP endpoint alongside the gRPC endpoints.

Cliftonz commented 1 year ago

@banool Thanks for the recommendation!

banool commented 1 year ago

That's my intended use case too 😃

Cliftonz commented 1 year ago

@sphw Does this change anything for you on your recommendation?

sphw commented 1 year ago

> @sphw Does this change anything for you on your recommendation?

I'm going to give an annoying answer: it depends. If your goal is simply to reflect on Rust structs generated by Prost, bevy-reflect looks great, though I don't see anything particularly unique in its approach.

There are really three different use cases being conflated here under the banner of "reflection":

  1. Reflection on Rust types - This is handled well by bevy-reflect and most pure Rust reflection libraries
  2. Reflection on arbitrary protobuf messages - Say you have a bunch of bytes and you only know their type at runtime. You can use what looking-glass and prost-reflect call a DynamicMessage to inspect those fields and values. One could write a system to translate protobufs into bevy-reflect, but it would not be very efficient, and would miss out on all the optimizations I discussed in my last comment.
  3. Convenient message description data - Protobuf definitions are themselves protobufs. Which is weird, but it allows you to get type information about the underlying protobuf definition at runtime. This is available right now in Prost, it's just not very convenient. Ideally Prost would develop a more user-friendly API akin to the C++ or Go API.

Now onto the question of grpc-web / grpc-gateway style reflection. There are basically two use cases for reflection when generating a gRPC JSON proxy. One is to read the protobuf definition to generate HTTP routes for each gRPC method. In addition, a feature called "options" is used by most gRPC JSON conversion systems to allow a user to customize the conversion. There is a discussion on the best way to support that here: https://github.com/tokio-rs/prost/issues/674. One solution to the discussion in that issue is to use reflection of type number 2. But it isn't the only option, and might not even be preferable. You could also theoretically use reflection to actually generate JSON at runtime. But generally in Rust we prefer to use serde for JSON serialization, so reflection isn't necessary for that use case.
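For the route-generation use case, a minimal sketch might look like the following. It assumes the proxy exposes each method at the same path gRPC itself uses, `/{package}.{Service}/{Method}`; custom paths defined through options are out of scope here. `ServiceDesc` and `http_routes` are hypothetical names, standing in for whatever descriptor type a reflection API would provide.

```rust
/// Hypothetical, trimmed-down view of a service definition.
pub struct ServiceDesc<'a> {
    pub package: &'a str,
    pub name: &'a str,
    pub methods: Vec<&'a str>,
}

/// Builds one route per method, following the gRPC path convention.
pub fn http_routes(svc: &ServiceDesc<'_>) -> Vec<String> {
    // A service declared without a package has no leading "package." part.
    let full_name = if svc.package.is_empty() {
        svc.name.to_string()
    } else {
        format!("{}.{}", svc.package, svc.name)
    };
    svc.methods
        .iter()
        .map(|method| format!("/{}/{}", full_name, method))
        .collect()
}
```

Nothing here needs message-level reflection; the service and method names are enough, which is consistent with the point that options support is the harder part.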

In conclusion, if you want JSON support in Tonic, this issue isn't really the blocker. Reflection could be used to implement it, but https://github.com/tokio-rs/prost/issues/674 is the true blocker.