vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0
18.34k stars 2.08k forks source link

Switch to github.com/gogo/protobuf #3566

Closed derekperkins closed 6 years ago

derekperkins commented 6 years ago

From @msolo @msolo-dropbox (not sure which one to use) responding to @ hebo about gRPC overhead.

grpc cost is going to come from protobuf encode/decode. you'll need to swap in something like this https://github.com/gogo/protobuf - we did a bunch of work to make this fast with grpc for our internal universal RPC system. It elides reflection for code generation. The API is not drop in, but the bigger benefit is that it reduces the pointer count and reduces the scan time during GC

It has an impressive list of users, including other database projects: cockroach, tidb, dgraph and etcd, plus kubernetes, mesos, nats, and more. Benchmarks show it to be ~3x faster and with fewer allocations.

BenchmarkGoprotobufMarshal-8                 3000000           506 ns/op         312 B/op       4 allocs/op
BenchmarkGoprotobufUnmarshal-8               2000000           691 ns/op         432 B/op       9 allocs/op
BenchmarkGogoprotobufMarshal-8              10000000           152 ns/op          64 B/op       1 allocs/op
BenchmarkGogoprotobufUnmarshal-8            10000000           221 ns/op          96 B/op       3 allocs/op

From @sougou

I heard that proto3 made some big improvements internally at google. I don't know if those changes have been ported out yet. If they have, and it's still not as good, we can look at adopting gogo. The concern is the number of dependencies we introduce.

cc @demmer @acharis

derekperkins commented 6 years ago

This may be unnecessary. https://groups.google.com/forum/#!topic/golang-nuts/F5xFHTfwRnY

(post copied below for convenience)

Hello gophers, This is an announcement that we will be merging the dev branch of github.com/golang/protobuf into master on April 30th (approximately 3 months from now). This merge will introduce several significant changes:

A new table-driven implementation that is shown to be about 1.3x to 2.1x faster when tested inside Google. The preservation of unknown fields in proto3 messages. Validation that strings are valid UTF-8 as specified in the language guides. When tested inside Google, we discovered that these changes were more disruptive than expected. The cause of the issues are mostly due to additional fields added to generated messages by protoc-gen-go:

An XXX_NoUnkeyedLiteral field that the generator now creates to force users to use keyed literals (e.g., foopb.Message{Name: "Golang", Age: 8} as opposed to foopb.Message{"Golang", 8}) to ensure forward compatible usage of messages. An XXX_unrecognized field that is necessary for proto3 to always preserve unknown fields. This breaks users that assume comparability of proto3 messages. Since XXX_unrecognized is of type []byte, this means that proto3 messages cannot be used as map keys nor directly compared using the == operator. An XXX_sizecache field that is part of the internal implementation of the table-driven serializer. The presence of this field can break tests that use reflect.DeepEqual to compare some received messages with some golden message. It is recommended that proto.Equal be used instead for this situation. The semantic changes may also cause issues:

Strict validation of string fields as valid UTF-8. It is regrettable that the Go implementation has not enforced the encoding thus far, and that was an oversight on our part. However, in order for protobuf messages to be properly parsed by implementations in other languages, it is important to produce an error on the Go side to protect users from generating invalid messages. If your strings do not use UTF-8, then consider using the bytes type. Preservation of unknown fields in proto3. In nearly all use-cases, this change should not cause an issue. However, the preservation of unknown fields can cause issues if you relied upon proto3 to drop unknown fields (e.g., for security reasons to avoid leaking sensitive information). Note that the proto3 specification has previously said such behavior is unspecified. To explicitly drop unknown fields, users should use proto.DiscardUnknown as appropriate. Note that the Go Protocol Buffer compatibility agreement reserves the right to make changes to internal XXX fields or due to specification errors without violating backwards compatibility. We are still making an announcement in advanced so that appropriate user-side changes can be made beforehand so that the eventual merge causes as little disruption as possible.

We recommend that you try vendoring the the dev branch to check whether your code works properly with the upcoming Go protobuf changes. If you experience failures, we recommend that you perform the appropriate fix as mentioned in the points above. If you discover any issues with the dev branch, feel free to file an issue on the golang/protobuf tracker.

alainjobart commented 6 years ago

Ah thanks for the update. The proto library used internally inside google is the new one, with the added new fields, and performance is better. And it works too (for instance, we don't use proto messages as keys of maps anywhere any more, but we used to).

My preference would be to stick with the official version if we can, to avoid fragmentation...

dweitzman commented 6 years ago

Looks like golang/protobuf v1.1.0 is released now: https://github.com/golang/protobuf/releases

sougou commented 6 years ago

We're now upgraded.