proto: make the Message interface behaviorally complete

dsnet commented 7 years ago

Filing on behalf of @alandonovan.

The proto.Message interface is unsatisfactory. A behavioral interface is an abstraction over the underlying data types that exposes just the operations necessary for their correct use. By contrast, proto.Message exposes essentially no useful functionality and serves only as a marker. But a marker of what? If one could assume all its implementations were protoc-generated struct types with field tags, then at least it would be possible to write reflection-based algorithms that do useful things. However, there are many concrete types satisfying Message that are not of this form.

It's not only the set of implementations that is unbounded; the set of useful operations is also large and growing. The two most important, Marshal and Unmarshal, are handled quite cleanly since there are separate behavioral interfaces for Marshaler and Unmarshaler that allow each concrete type to implement these operations. But there are many functions in the proto API, for which no interface exists: proto.Merge, proto.Clone, the extensions API, and so on.

The cross-product of concrete implementations and operations is growing, but the fraction of these combinations that actually work is diminishing.

I think we should assess what it would take to change the proto.Message interface, and all its implementations, so that it is a true behavioral interface. This would require at a minimum that the interface include a new method that provides complete control over the abstract state of a message: accessing and updating its fields, inspecting any extensions or unrecognized fields, and so on, without revealing the concrete representation. It should be possible to implement all the major functions in the proto API, as well as most users' ad hoc functions, in terms of this interface so that they work with any concrete implementation. If an optimized version of a crucial operation is available, the generic implementation should dispatch to it, as it does today for Marshal and Unmarshal.

We can't add methods to proto.Message without breaking backwards compatibility. One approach we can take is to define proto.MessageV2 that is a much more semantically complete interface that provides a form of "protobuf reflection". In Marshal, Unmarshal, Merge, Clone, and so on, we can type assert if it implements proto.MessageV2 and use that interface to implement generic versions of those functions. If proto.Message doesn't satisfy proto.MessageV2, then Merge can just fail (it already does on most third-party implementations of proto.Message).

jhump commented 7 years ago

Interesting proposal. I faced a similar challenge when designing the public API for a dynamic message implementation.

If this goes anywhere, I'll be curious to see how similar they are in terms of shape/surface area. And I'll be excited to see how it can simplify that dynamic message's implementation (particularly around extensions, which I think is the weakest part of the current generated code).

xfxyjwf commented 6 years ago

About adding methods to proto Message interface, we have done exactly that a few times in both C++ and Java. For example, we added a ByteSizeLong method to C++ MessageLite interface not very long ago: https://github.com/google/protobuf/blob/master/src/google/protobuf/message_lite.h#L339

It's a pure virtual method so anyone who implements their own MessageLite subclasses will be broken.

Protobuf team's stance on this is that nobody except protobuf itself should subclass/implement these message interfaces. It's called out explicitly in our compatibility notice in Java: https://github.com/google/protobuf/tree/master/java#compatibility-notice

Protobuf message interfaces/classes are designed to be subclassed by protobuf generated code only. Do not subclass these message interfaces/classes yourself. We may add new methods to the message interfaces/classes which will break your own subclasses.

dsnet commented 6 years ago

Go doesn't have the concept of a "default method", so this is going to be a breaking change unfortunately. The transition to get the world to the new API will be a little tricky and will probably have to occur in several phases.

dsnet commented 6 years ago

There's an open proposal for default methods golang/go#23185, but it's not looking promising as the concept doesn't fit well into Go where interface satisfaction is implicit rather than explicit.

dsnet commented 6 years ago

Here are some documents for the proposed plans to improve Go protobufs. I'd appreciate any feedback from the community regarding the design and migration plan:

\cc @neild @jhump @awalterschulze @tamird @meling

luna-duclos commented 6 years ago

Couldn't the same approach to the database/sql package be taken ? Where various functions run specific assertions on smaller interfaces, rather than having a giant V1/V2 interface where both have all the functions defined in one big bundle.

Edit: This already seems the case in the new proposal, apologies for skimming over it too quickly

awalterschulze commented 6 years ago

Both documents are really ambitious, but in principal sound like a great idea. Good work.

I am a bit concerned about the amount of work required to make this move on the part of gogoprotobuf. I will need some help. The last large move (currently on the dev branch) was a big job, taking ALOT of my personal time, which I am trying to focus om my studies. I am not working for a company that uses protobufs. But I want to stress that I do think these changes are necessary. So I will do my best to support these efforts.

Comments on protoreflect. It looks good, but I am a bit concerned, that the following cases might not have been taken into account. So maybe here are some tests for the protoreflect API:

protobuf field type is bytes, but user has customized it to their own customtype Uuid
protobuf field type is repeated message, but user has customized it to []MyMessage, without a pointer field.
protobuf field type is timestamp, but user has customized it to time.Time or *time.Time or []time.Time or []*time.Time

I think getting this protoreflect library right is essential to having only a single proto library, which is the goal I am trying my best to support.

dsnet commented 6 years ago

Thank you @awalterschulze for your feedback.

Both documents are really ambitious

I admit that the designs are indeed ambitious. Ideally, it would be nice if something like it occurred 8 years ago when Go protobufs were first implemented. However, that was not the case and we are struggling with the consequences today. As much work as this transition is, it will lay the foundation for a brighter future. Another 8 years from now, I hope we look back and see this moment as the point when Go protobuf support became much better.

I will need some help.

I'll help 👋.

Regarding your protoreflect concerns, I was consciously thinking about how gogo/protobuf fit into it all as I was designing the reflection API.

How do we handle slices of message values (not pointers)?

This case is actually the reason why protoreflect.KnownFields.Set has this documented restriction:

It is unspecified whether Set performs a pointer copy or copy of the pointed at value for composite types (e.g., Messages, Vectors, Maps).

Thus, for []T where T is a value type, the reflection implementation only need to copy the pointed-at value.

The Mutable methods are also implementable since every element in a Go slice is addressable, so the reflection implementation can obtain an address for an element and return it.

How do we handle message values (not pointers)?

This situation is different from above since proto semantics do not indicate the difference between a null or empty message within a repeated list of messages. However, for a standalone message, proto semantics do preserve whether a message is null or empty (even in proto3).

Message values occurs when the nullable option is used. In this situation, proto semantics are being violated. Hence the documented warning that:

Warning about nullable: according to the Protocol Buffer specification, you should be able to tell whether a field is set or unset. With the option nullable=false this feature is lost, since your non-nullable fields will always be set.

An implementation of the reflection API will need to do a best effort at providing the illusion that the implementation is proto compliant.

However, this abstraction can leak:

Explicitly clearing a non-nullable field means that the message contents are cleared, but does not mean that the message itself is cleared.
Calling protoreflect.KnownFields.List on new instance of a message will return a non-empty list since some non-nullable sub-messages are always considered set.

How do we handle custom types for bytes?

Similar to the documented restriction on protoreflect.KnownFields.Set, I'll update the document such that it is unspecified whether protoreflect.ValueOf([]byte(...)) or protoreflect.Value.Bytes results in a slice that is aliased or copied.

How do we handle Timestamp or Duration proto represented as a Go Time or Duration?

An implementation of the reflection API will need to create internal wrappers over time.Time and time.Duration that presents the illusion that they are just messages with a seconds or nanos field.

One possible implementation is here: https://play.golang.org/p/IXvjCK_Y_Hc

However, these wrappers are leaky abstractions:

Timestamp.SetNanos with nsec outside the range of [0, 999999999] changes the value of Seconds and returns a Nanos that does not the same value as the original nsec. However, note that the documentation on Timestamp.nanos says that nanos must always be within the range of [0, 999999999], but this is a documentation restriction, not semantic restriction.
Duration.SetNanos with nsec outside the range of [-999999999, +999999999] changes the value of Seconds and returns a Nanos that does not the same value as the original nsec. However, note that the documentation on Duration.nanos says that nanos must always be within the range of [-999999999, +999999999], but this is a documentation restriction, not semantic restriction.
Calling Duration.SetSeconds with a sec value that has a different sign than Nanos, results in Nanos being negated and calling Duration.SetNanos with a nsec value that has a different sign than Seconds, results in Seconds being negated. In practice, this is not an issue because SetSeconds and SetNanos are almost always set together (with the correct signs) without an intermediate get on Seconds or Nanos.
The google.protobuf.Duration type is documented as being able to represent ±10000 years, which is beyond the range Go's time.Duration type, which can only represent ±290 years. Thus, calling SetSeconds with a sec value beyond Go's limit results in a silent truncation.
Duration has a similar non-nullable issues as non-nullable messages unless a *time.Duration was used. Note that time.Time does not have this issue since the zero value time.Time can represent being null. However, when the Mutable method is called to retrieve a time.Time, the timestamp must be initialized as Unix(0, 0), which is the zero value of the proto Timestamp message.

In practice, I don't expect the abstraction leakages mentioned above to be much of a problem in practice. If anything the overflow of time.Duration may be the most likely candidate, and the wrapper could help alleviate that by setting the time.Duration to time.Duration(math.MaxInt64) to help signal that failure scenario.

awalterschulze commented 6 years ago

I agree on the necessity and if you are willing to make contributions to the gogoprotobuf repo to bring the two implementations closer, to the point of eliminating the need for two proto libraries, then let's do this :)

On customtype, this was introduced first and the more sustainable extension casttype was only introduced later. customtype only assumes that you have implemented some methods, like Marshal, Unmarshal, Size, etc. It does not necessarily have to be an alias, as in the case of casttype. I hope we can still support this?

If we can hide the timestamp/duration wrappers from the user, then its fine, but I would hate for them to have to import something outside of the standard library to have a time.Time field. It is very important that the generated fields are the ones users want.

And yes, nullable=false and time.Duration are subtle leaky abstractions, but again it is what users want. If they were going to write a function to copy between the structs they want to use and the generated ones, then they were going to run into these subtletees anyway. gogoprotobuf prefers to handle these in a single sensible way over having users write error prone and slow copy functions.

Although these comments sound negative, these are really the only problems I can see. I think we can get there and I appreciate all your efforts here, including taking gogoprotobuf into account.

dsnet commented 6 years ago

For anyone subscribed. Work is actively being done to address this issue. See the "api-v2" branch on this GitHub repo.

neild commented 5 years ago

The in-development new package includes a definition of proto.Message which satisfies this feature request:

type Message interface{
    ProtoReflect() protoreflect.Message
}

The new Message type provides a behaviorally complete view of a message. Functions such as serialization (wire, text, and json), comparison, cloning, and so forth may all be implemented entirely in terms of this interface without any assumptions about the underlying type.

That's the good part. The downside is that we now have two types describing a message: The original Message and the new one. Existing packages which include the proto.Message type in exported functions cannot be changed to use the new type without a breaking API change.

We have two options at this point:

Keep what we have. The new package has a different Message type than the old one, generated types implement both, and users will need to pick between the two. We can provide convenience functions that convert between the two types.
Change the new package's Message type to be identical to the old one. We will provide a function which takes a proto.Message and returns a protoreflect.Message.

The first option gives us the best new API at the expense of making it more difficult for users to upgrade. The second option is an inferior API, but greatly simplifies the upgrade story.

Inside Google, the use of proto.Message in public APIs is pervasive. However, we have the ability to refactor the Google-internal codebase; the more interesting question is how common proto.Message is in external public APIs. If it is little-used, then the cleaner API becomes more appealing. If it is common, then the most compatible one does.

Either way, we're getting to the point where we need to make a decision. Thoughts from non-Google users would be particularly useful here.

jhump commented 5 years ago

@neild, consider this a vote for the first case, as it allows users to upgrade their dependencies incrementally (meaning 3rd party library dependencies that also depend on the proto runtime). If all of the newly generated messages are still compatible with the old packages/ABI, then they need only re-generate all of their protos to be able to take advantage of the new package's features.

But I wonder: if these convenience functions mentioned are given an old generated message (e.g. does not have any add'l methods to be easily convertible to the new interface), how would that be handled? Would there need to be some non-trivial translation layer? (I know this could be made to work, by interpreting the message's descriptor bytes and generating the reflection structures at runtime from that. But I'm unsure if that's planned or not.)

dsnet commented 5 years ago

Would there need to be some non-trivial translation layer?

This non-trivial translation layer already exists since we must support the situation where a newly generated message has a dependency on a old generated message that does not support the v2 API.

tandr commented 5 years ago

@neild - May I suggest of making 2 new posts or a 2 new issues for the sole purpose of collecting 👍 and 👎 ?

This way (without searching github for references to proto.Message) you can get a feeling of what people are interested in.

I personally would vote for a cleaner interface, but I also foresee that it might be required to implement both approaches anyway, since sometimes we have to "link" with existing code that is out of our control.

neild commented 5 years ago

@jhump by "the first case", you mean protoV1.Message and protoV2.Message are not the same type?

Let me lay out the options (as I see them) in a bit more detail. (Edit: Doing this as two separate posts as suggested by tandr@ to collect thumbs up/down on each.)

neild commented 5 years ago

Option 1: Two different `Message` types

This is the current state of the new package. The old package has:

type Message interface {
  Reset()
  String() string
  ProtoMessage()
}

And the new package has:

type Message interface{
  ProtoReflect() protoreflect.Message
}

Generated message structs satisfy both interfaces (for now, at least): They implement all four required methods. You can pass a newly generated message to either package.

Older generated messages don't have a ProtoReflect method, so you need to pass them to an adapter function of some form. This will look something like:

import (
  "google.golang.org/protobuf/proto"  // New package
  protoV1 "github.com/golang/protobuf/proto" // Old package
)

// This function works with a concrete message that hasn't been regenerated to
// support the new Message interface.
func A() {
  // ...
  m := &somelegacypb.Message{}
  proto.Unmarshal(b, proto.FromV1(m))
}

// This function accepts the old Message type.
func B(m protoV1.Message)  {
  // Get a reflective view of this V1 message.
  p := proto.FromV1(m).ProtoReflect()
  // ...
}

The FromV1 function will look something like:

func FromV1(m protoV1.Message) Message {
  if m2, ok := m.(Message); ok {
    // Underlying type implements the new Message interface.
    return m2
  }
  return nonTrivialTranslationLayerStuff(m)
}

Pros:

The new proto.Message interface is complete over the required behavior.
If/when use of the old package dies out, we have a cleaner and healthier ecosystem.

Cons:

Need to deal with two different Message types during the transitional period (which could be long).
No gradual upgrade path for packages with proto.Message in their exported API.

neild commented 5 years ago

Option 2: One `Message` type

We change the definition of Message in the new package to be identical to the old one.

type Message interface {
  Reset()
  String() string
  ProtoMessage()
}

This removes the need for conversion functions as used above; an existing type which implements proto.Message can be passed directly to the new package.

You now need to use a function to get a reflective view of a message:

func B(m proto.Message)  {
  // Get a reflective view of this message.
  p := proto.Reflect(m)
  // ...
}

The proto.Reflect function will look something like:

func Reflect(m proto.Message) protoreflect.Message {
  if p, ok := m.(interface {
    ProtoReflect() protoreflect.Message
  }); ok {
    return p.ProtoReflect()
  }
  return nonTrivialTranslationLayerStuff(m)
}

Pros:

Gentlest migration path. Just start using the new package.

Cons:

The Message type is not behaviorally complete and will never become so.
We will always need to generate Reset and ProtoMessage methods, leading to additional binary bloat.
Surprising behavior when working with types that implement proto.Message, do not implement the reflection APIs, and are not understood by the non-trivial translation layer.

jhump commented 5 years ago

@neild, @dsnet, if option 1 wins (two separate packages/APIs), does the existing api-v2 branch -- in its current form -- represent an almost-stable API for the new stuff? I've only been following along with development occasionally and haven't really tried using it yet. So I was curious to start porting a repo to the new stuff, to actually play with the new APIs and such. But I'm wondering, if I choose to tackle that now, how likely is the v2 API to change materially out from under me?

dsnet commented 5 years ago

We've been migrating a number of targets inside Google to use protobuf reflection and through the experience, it has helped us refine what the API should be. CL/175458 is an example of API changes informed by real usage. Within Google, we have the ability to also migrate all users since we use a monorepo. Unfortunately, we can't fix external users.

But I'm wondering, if I choose to tackle that now, how likely is the v2 API to change materially out from under me?

Unfortunately, likely. Fortunately, most of the changes are fairly mechanical changes. If it helps, we can track all breaking changes in one place with instructions on how to change.

robfig commented 5 years ago

Representing an external company with heavy protobuf usage, I'd prefer to pay the one-time conversion cost to gain a superior interface for the long haul. As we (mostly) also use a monorepo, I don't foresee the conversion work required to be very hard.

johanbrandhorst commented 5 years ago

I am mostly exposed to gRPC and open source use of protobuf and in my experience the use of the proto.Message interface is rare. I favour option 1.

spenczar commented 5 years ago

I prefer option 2, fairly strongly.

Lack of a gradual upgrade path is too painful. That's the sort of thing that slows teams way down and prevents any upgrade from occurring at all.

Gradual upgrades are especially important for the github.com/golang/protobuf project to care about because protoc-gen-go couples generated code to a snapshot of the runtime library. Widespread use of RPC systems built on generated code means that this coupling spans multiple teams: you're importing a generated client of one service, and providing a client to other services. Non-gradual upgrades force lockstep updates across an organization.

Statements like "when use of the old package dies out, we have a cleaner and healthier ecosystem" concern me. Usage of the old package will never completely die out. We'll always have references to the old package.

jhump commented 5 years ago

@spenczar, I think there may be a misunderstanding of what option 1 and 2 entail.

The lack of gradual upgrade path is for APIs that use proto.Message in their exported API. Uses of protobuf should be fine with option 1.

The real "lack of gradual upgrade path" means that packages that currently use proto.Message in their export API must do one of two things to support the new reflection functionality:

Provide a new major version (backwards incompatible) that changes the exported API to refer to the new Message interface instead of the old one. (And, of course, possibly reimplement some of that API to properly use the new interface/take advantage of new features.)
Provide extra API entry points that allow callers to supply new Message interface (in addition to maintaining old entry points that use old interface).

But old code can still link against and work just fine w/ new protobuf runtime and old generated code will still work.

@neild, did I describe that accurately?

puellanivis commented 5 years ago

While one should always have a way to smoothly transition from one way to another, I think such a transition has to have a definite limit in terms of support lifetime, otherwise, you end up supporting the old way forever, even when it makes your life miserable because everything has to work both ways now, and that becomes Just The Way Things Are.

Some people will never change until it breaks, no matter how trivial the change might be. And breaking things is not Always The Wrong Choice. :woman_shrugging:

I’ll point here to https://github.com/grpc/grpc-go/issues/711 which was a somewhat similar situation, where a choice in code generation would break people. It sat in “cannot change or we break people”, up to “we have a migration strategy” to “once go1.8 is end-of-life”. It took two and a half years to make what was on the surface a relatively simple change.

But this is a problem that is never going to be seen inside of Google, because protobufs are compiled fresh every build. The entire notion of checking in generated code is to me still kind of crazy. It’s pretty much just like checking in a binary blob.

spenczar commented 5 years ago

@jhump Hm, I'm not following. Here's how I understood things:

Under option 1, the (github.com/golang/protobuf/proto).Message interface is changed. That means old generated code no longer implements the (github.com/golang/protobuf/proto).Message interface.

Is that correct? I might have this wrong, particularly if this is an interface in a new github.com/golang/protobuf/proto/v2 package.

If old generated code no longer implements the proto.Message interface, then the struct values can't be passed into any APIs that accept proto.Message, so either the code needs to be regenerated, or those APIs need to change, or the code needs to be linked against an old version of github.com/golang/protobuf/proto. Is that correct?

If this is the case, I think we hit the lockstep problem.

jhump commented 5 years ago

@puellanivis, if that was in reply to my last comment, I wasn't suggesting the old APIs be supported forever. But they must be supported during some sort of transition period. For the second option I mentioned (having a package that provides API for both old and new interface), the idea is that the functions for the old interface would eventually be deprecated/removed.

As far as checking in generated code: not everyone has blaze :) It is idiomatic Go that one be able to go get ... to download and install a Go package or program, and the go tool does not attempt any other build steps (such as go generate ...). So checking in the generated code gives users the best experience. Otherwise, go get ... will fail to compile and users must then take extra steps to run make or go generate ... or whatever before retrying go install ....

@spenczar:

I might have this wrong, particularly if this is an interface in a new github.com/golang/protobuf/proto/v2 package.

Yes, it is a new interface. I think the suggestion is that the import path for the v2 package will be "google.golang.org/protobuf/proto". But, in the in-development api-v2 branch, it's similar to as you wrote: "github.com/golang/protobuf/v2/proto".

spenczar commented 5 years ago

@jhump Thanks. I agree, then, that my worry about lockstep upgrades does not apply; option 1 looks better to me too.

puellanivis commented 5 years ago

@jhump I’m well aware of the difficulties that lead to people checking in the generated code. Such is the world we live in. But I had completely composed my message before your comment came in and it was more of a generalized reply to @spenczar and expressing my preference against Option 2.

neild commented 5 years ago

@jhump

@neild, did I describe that accurately?

Yes, that's correct. Let's say today you have a package with an exported API like this:

package prototwiddle

import "github.com/golang/protobuf/proto"

// Twiddle fiddles with a message.
func Twiddle(m proto.Message) {}

If we redefine Message in the new package, then you can switch to the new proto API without changing your exported API with something like:

import (
  protoV1 "github.com/golang/protobuf/proto"
  "google.golang.org/protobuf/proto"
)

func Twiddle(m protoV1.Message) {
  twiddleV2(proto.FromV1(m))
}

func twiddleV2(m proto.Message) {
  // ...
}

You could make twiddleV2 an exported function to directly support either definition of Message, but you can't change the signature Twiddle without making a breaking API change (and presumably releasing a new major version).

neild commented 5 years ago

@spenczar

I might have this wrong, particularly if this is an interface in a new github.com/golang/protobuf/proto/v2 package.

Yes, the question is what the interface in the new package should be. (The final package name is going to be google.golang.org/protobuf/proto, FWIW.) The definition of "github.com/golang/protobuf/proto".Message can't change.

dsnet commented 5 years ago

For any adventurous people who are actually using v2, I've mentioned above that the API is not fully stable yet. If you want to be notified of any breaking changes, subscribe to #867.

dsnet commented 4 years ago

The google.golang.org/protobuf module has been released where it has a new definition of the Message interface that treats protobuf reflection as a first-class feature.

alandonovan commented 4 years ago

Congratulations on an excellent piece of work.

golang / protobuf