
[RFC] Protobuf in OpenSearch #6844

Open · saratvemulapalli opened 1 year ago

saratvemulapalli commented 1 year ago

Inspiration

Plugins are very tightly version-coupled with OpenSearch (https://github.com/opensearch-project/OpenSearch/issues/1707), and relaxing that coupling to work across patch versions is still in the works.

While working on extensions (#2447), we really wanted to support multiple versions (including major/minor/patch) of OpenSearch with one OpenSearch SDK [1].

Proposal

Exploring open-source solutions, we landed on Protobuf [2], which was built by Google and is widely adopted for serialization/deserialization and as the basis of RPC frameworks. It was designed out of the box to support forward and backward compatibility seamlessly.
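
To make the compatibility claim concrete, here is a minimal, hedged sketch that uses only protobuf-java's bundled Int32Value well-known type (no OpenSearch code): a "newer" writer emits an extra field that an "older" reader's schema does not define, and parsing still succeeds because unknown fields are skipped or preserved rather than rejected.

```java
import com.google.protobuf.CodedOutputStream;
import com.google.protobuf.Int32Value;
import java.io.ByteArrayOutputStream;

public class CompatibilitySketch {
    public static void main(String[] args) throws Exception {
        // A "v2" writer emits field 1 (known to every version) plus a new
        // field 2 that older readers have never heard of.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        CodedOutputStream out = CodedOutputStream.newInstance(bytes);
        out.writeInt32(1, 42);              // field 1: int32 value
        out.writeString(2, "added-in-v2");  // field 2: unknown to old readers
        out.flush();

        // An "old" reader parses the payload with a schema that only defines
        // field 1 (Int32Value). The unknown field is skipped, not an error.
        Int32Value parsed = Int32Value.parseFrom(bytes.toByteArray());
        System.out.println(parsed.getValue()); // prints 42
    }
}
```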

With an initial experiment of integrating protobuf in OpenSearch/Extensions https://github.com/opensearch-project/opensearch-sdk-java/issues/414#issuecomment-1471010441, we see:

For extensions, protobuf solves a lot of problems but adds a tiny serialization/deserialization overhead compared to OpenSearch's existing StreamInput/StreamOutput [3][4].

Next Steps

With the learnings from SDK/Extensions, we see more potential for Protobuf integration in OpenSearch, and we would like to propose offering Protobuf as a new type:

Adding Protobuf at the transport layer will give communication between OpenSearch nodes significant performance benefits and seamless versioning compatibility. @nknize has already started making changes to enable this by restructuring XContent: https://github.com/opensearch-project/OpenSearch/pull/6470

Additionally, having protobuf at the REST layer will unblock OpenSearch to support gRPC (if we choose that path).

FAQ

Q. Is Protobuf more performant? A. We moved two APIs: (a) Cat Nodes and (b) _search. Both APIs had at least 20% better performance with protobuf compared to the native protocol, and we see linear improvements as cluster size increases.

Q. What are the benchmark numbers for search? A. See the OpenSearch benchmark results for querying with Protobuf: https://github.com/opensearch-project/OpenSearch/issues/10684#issuecomment-1876077885

Q. What are the benchmark numbers for Cat Nodes (operational APIs)? A. See the benchmarking results: https://github.com/opensearch-project/OpenSearch/issues/6844#issuecomment-1742250229

Q. Is Protobuf in OpenSearch necessary to support gRPC? A. Protobuf works at the serialization layer, while gRPC is a layer-7 protocol. gRPC uses protobuf as its wire format, which makes protobuf a dependency. We presume there will be significant performance benefits with gRPC, as data would be transmitted in binary instead of JSON.

cc: @VachaShah @prudhvigodithi

[1] https://github.com/opensearch-project/opensearch-sdk-java
[2] https://protobuf.dev/overview/
[3] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/io/stream/ByteBufferStreamInput.java
[4] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/io/stream/Writeable.java

nknize commented 1 year ago

Super excited about this!!! It's non-invasive, which is fantastic! It enables us to further refactor the transport layer for extensions to support this.

This has my full endorsement! Nice work!

reta commented 1 year ago

@saratvemulapalli thanks for the RFC. How beneficial is gRPC for extensions actually? I am not against it but I have doubts we have a compelling use case now for it:

extensions are pretty much "do whatever you want", but our clients only support HTTP right now [1], we would force people to do gRPC and HTTP (REST actions) at the same time, looks like overhead?

The unknown part for me is the role of the security plugin: does it mean it has to support gRPC as well to perform any meaningful checks in the gRPC world?

[1] https://github.com/opensearch-project/opensearch-clients/issues/55

dblock commented 1 year ago

Assuming extensions get very chatty, transport improvements will be increasingly useful, but I am reading this proposal as 1) make transport pluggable today in OpenSearch, and 2) implement gRPC as an option. I think this has the potential to significantly improve node-to-node communication today, which passes around all kinds of cluster state. Let's see by how much!

saratvemulapalli commented 1 year ago

thanks @reta for taking a look.

how beneficial is gRPC for extensions actually? I am not against it but I have doubts we have a compelling use case now for it:

I don't know how much gRPC will help extensions (yet; we'll learn more), but with protobuf I've listed the benefits we see so far for extensions. In fact, extensions will be similar to clients for REST APIs, since they use clients internally.

extensions are pretty much "do whatever you want", but our clients only support HTTP right now [1], we would force people to do gRPC and HTTP (REST actions) at the same time, looks like overhead?

We definitely don't want to force gRPC on anyone, clients being one example. It should be just another option that we offer in addition to the JSON-based REST APIs. I am hoping that transmitting information in a binary (protobuf) format will help performance. Obviously we have to write seamless serializers/deserializers for each API to translate this data into Java objects that OpenSearch APIs understand.
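
As a rough sketch of the "just another option" idea (assuming the protobuf-java and protobuf-java-util libraries, and using the generic Struct type as a stand-in for a real, generated per-API message), the same payload could be exposed either as binary protobuf or as the JSON the existing REST layer already speaks:

```java
import com.google.protobuf.Struct;
import com.google.protobuf.util.JsonFormat;

public class DualFormatSketch {
    public static void main(String[] args) throws Exception {
        // JSON as produced/consumed by today's REST layer.
        String json = "{\"cluster_name\": \"my-cluster\", \"node_count\": 5}";

        // Parse it into a protobuf message. Struct is a generic stand-in here;
        // a real integration would use a generated, per-API message type.
        Struct.Builder builder = Struct.newBuilder();
        JsonFormat.parser().merge(json, builder);
        Struct message = builder.build();

        // Binary protobuf representation for clients/transports that opt in...
        byte[] binary = message.toByteArray();

        // ...and the same payload rendered back to JSON for everyone else.
        String roundTripped = JsonFormat.printer().print(Struct.parseFrom(binary));
        System.out.println(binary.length + " bytes as protobuf");
        System.out.println(roundTripped);
    }
}
```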

Mostly, with this RFC I am looking for feedback to see if there is something fundamental I am missing, versus giving it a shot and getting some numbers to decide whether this is worth chasing in the longer term.

nknize commented 1 year ago

I am reading this proposal as 1) make transport pluggable today in OpenSearch,

Transport is already pluggable today (e.g., plugins/transport-nio)

...and 2) implement gRPC as an option.

Sort of. We would implement protobuf as an option (just like CBOR, SMILE, YAML, and JSON).

The only difference here is that we would add additional ProtobufStreamInput extends StreamInput and ProtobufStreamOutput extends StreamOutput concrete implementations (e.g., in o.common.xcontent.protobuf) that override the default StreamInput/StreamOutput marshall/unmarshall logic with the Protobuf implementation, and the PROTOBUF transport option would use these instead of the default StreamInput/StreamOutput.
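
For illustration only, here is a standalone sketch of that idea (these are not the real OpenSearch classes, which carry many more methods; the names are placeholders): the Protobuf-backed implementations would essentially delegate primitive marshalling to protobuf's varint-based CodedOutputStream/CodedInputStream.

```java
import com.google.protobuf.CodedInputStream;
import com.google.protobuf.CodedOutputStream;
import java.io.IOException;

// Standalone sketch only: the real classes would extend OpenSearch's
// StreamOutput/StreamInput and override their marshalling methods.
final class ProtobufStreamOutputSketch {
    private final CodedOutputStream out;

    ProtobufStreamOutputSketch(byte[] buffer) {
        this.out = CodedOutputStream.newInstance(buffer);
    }

    void writeVInt(int value) throws IOException {
        out.writeInt32NoTag(value);   // varint encoding instead of fixed width
    }

    void writeString(String value) throws IOException {
        out.writeStringNoTag(value);  // length-prefixed UTF-8
    }

    void flush() throws IOException {
        out.flush();
    }
}

final class ProtobufStreamInputSketch {
    private final CodedInputStream in;

    ProtobufStreamInputSketch(byte[] buffer) {
        this.in = CodedInputStream.newInstance(buffer);
    }

    int readVInt() throws IOException {
        return in.readInt32();
    }

    String readString() throws IOException {
        return in.readString();
    }
}
```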

This is why I've been pushing the XContent refactor. The next PR will refactor the StreamInput/StreamOutput classes into a library, and the xcontent/protobuf implementation can provide a concrete implementation that uses its own serialization logic. It's an elegant approach that gets us even further toward modularizing massive amounts of code out of :server into separable libraries (thus moving closer toward jigsaw modularization).

I don't know how much gRPC will help extensions...

Exactly. Let's keep this simple and focus just on protobuf as an optional transport (which, as a bonus, supports versioning!) rather than expanding the aperture to call it a silver bullet for extensions (progress, not perfection).

In a followup (after vetting and going GA) we can discuss protobuf as the default transport format over JSON. I think it's clearly a direction worth considering.

reta commented 1 year ago

Thanks a lot @saratvemulapalli, I was a bit confused by the gRPC mentions, but @nknize clarified that we are only talking about the serialization mechanism (protobuf) and not a communication protocol change (gRPC, at least for now). Thanks.

saratvemulapalli commented 1 year ago

Thanks @reta. I probably threw you off by mentioning gRPC (sorry about that), but I mostly intended this issue to be about Protobuf and listed the opportunities it will enable, one of which is gRPC (if we choose to make it happen down the line). I've updated the RFC to clarify this.

Tagging @peternied @cwperks @wbeckler @seanneumann who had thoughts/feedback.

dbwiddis commented 1 year ago

I'm not quite as up-to-speed on gRPC and other options as the more experienced folk up above, but I will say I'm all for protobuf because:

peternied commented 1 year ago

Great proposal, I'm all for starting as an optional transport layer and we can figure out where it best benefits the project.

owaiskazi19 commented 1 year ago

All in for protobuf! Great proposal @saratvemulapalli! In the future it would be great to see JSON responses with versioning support for extension point APIs using protobuf (no more XContent!).

Bukhtawar commented 1 year ago

Thanks for the proposal. @nknize @reta @VachaShah, just catching up: have we evaluated ION as an alternative? Do we think we could run into problems with large data? The one benefit I see with ION is that it doesn't mandate a data schema; the schema is optional.

reta commented 1 year ago

@Bukhtawar I don't recall any evaluation being done.

Bukhtawar commented 1 year ago

Another one that I think should be considered for its low garbage generation: https://github.com/OpenHFT/Chronicle-Wire

saratvemulapalli commented 1 year ago

Thanks @Bukhtawar for the feedback. We haven't really looked at other alternatives, as we mostly set out to solve seamless cross-version support first and then saw the opportunity for OpenSearch as a whole. We definitely understand that protobuf is not recommended for large data [1]. @VachaShah is actively working on this, and we'll benchmark a couple of alternatives, including ION. If you have other suggestions, let us know.

[1] https://protobuf.dev/programming-guides/techniques/#large-data

reta commented 1 year ago

The ones I have seen quite often in place of protobuf, if it could help, are:

VachaShah commented 1 year ago

I did a POC with protobuf for the _cat/nodes API (see draft PR #9097 for the code changes covering API request/response de/serialization and node-to-node communication) and measured the time per request for the protobuf variant against the original _cat/nodes API. The following improvements were noted for various cluster sizes:

2 nodes cluster: average improvement of 16.14%

5 nodes cluster: average improvement of 18.11%

10 nodes cluster: average improvement of 21.35%

15 nodes cluster: average improvement of 31.39%

(Per-request timing charts are attached as screenshots in the original comment.)

dblock commented 1 year ago

This is significant. Can we ship this?

saratvemulapalli commented 1 year ago

Another great data point from @dbwiddis and @dblock: while they were writing https://github.com/opensearch-project/opensearch-sdk-py, all extension interfaces were auto-generated in Python and worked seamlessly over transport, with OpenSearch deserializing the data using the Protobuf Java implementation.

VachaShah commented 1 year ago

This is significant. Can we ship this?

Thank you @dblock! I am working on getting the changes from the POC in #9097 to merge in the repo. @saratvemulapalli and I are also working on getting the numbers for APIs like search.

reta commented 1 year ago

Thank you @dblock! I am working on getting the changes from the POC in #9097 to merge in the repo.

@VachaShah the numbers are very convincing. One thing we should keep in mind (which you definitely know about) is how to support migration from 2.x to 3.x, when some nodes will talk the old protocol and new ones will use Protobuf (or, in general, any new protocol). Building on this, it may not be feasible to migrate all transport actions to Protobuf in one shot, so even in 3.x we would need to maintain this mix of transport protocols.

dblock commented 1 year ago

As a strategy, I would 1) support multiple protocols in 2.x in a way where we can migrate actions one by one, and 2) rip out the existing transport protocol implementation in 3.0 and fully replace it with Protobuf. IMO, we only need an upgrade path in which a 2.x node can speak just enough of the old transport protocol to upgrade itself to protobuf and then never look back.
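
A minimal sketch of what that action-by-action coexistence could look like (every name here, TransportSerde, SerdeRegistry, and so on, is a hypothetical illustration, not an existing OpenSearch API): each action opts into a protobuf codec as one becomes available, and the native codec stays the fallback whenever the peer has not advertised protobuf support.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these are not actual OpenSearch classes.
interface TransportSerde {
    byte[] serialize(Object request);
    Object deserialize(byte[] payload);
}

final class SerdeRegistry {
    private final Map<String, TransportSerde> protobufCapable = new HashMap<>();
    private final TransportSerde nativeSerde;

    SerdeRegistry(TransportSerde nativeSerde) {
        this.nativeSerde = nativeSerde;
    }

    // Actions opt in one by one as their protobuf codecs are written.
    void registerProtobuf(String actionName, TransportSerde serde) {
        protobufCapable.put(actionName, serde);
    }

    // Fall back to the native protocol whenever the action has no protobuf
    // codec yet, or the remote node has not advertised protobuf support
    // (e.g. a 2.x node during a rolling upgrade).
    TransportSerde resolve(String actionName, boolean peerSupportsProtobuf) {
        TransportSerde serde = protobufCapable.get(actionName);
        return (serde != null && peerSupportsProtobuf) ? serde : nativeSerde;
    }
}
```

The point of the sketch is that nothing forces a flag-day switch: an action without a protobuf codec, or a peer that only speaks the old protocol, keeps using the native path.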

saratvemulapalli commented 1 year ago

@reta we might have a way to get interoperable protocols, since protobuf can write to and read from byte arrays, but it could be a lot of manual effort to get all actions onto the new protocol :/
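
For reference, a small sketch of that byte-array interop (using the bundled StringValue well-known type as a placeholder message): a protobuf payload is just bytes, so it can be written into and read back out of whatever byte[]-based framing the existing transport already moves around.

```java
import com.google.protobuf.CodedInputStream;
import com.google.protobuf.CodedOutputStream;
import com.google.protobuf.StringValue;

public class ByteArrayInteropSketch {
    public static void main(String[] args) throws Exception {
        StringValue message = StringValue.newBuilder().setValue("cat/nodes payload").build();

        // Write the message into a slice of a pre-allocated buffer, the way an
        // existing byte[]-based transport frame might be filled.
        byte[] frame = new byte[256];
        int offset = 16; // pretend the first 16 bytes are an existing frame header
        CodedOutputStream out = CodedOutputStream.newInstance(frame, offset, frame.length - offset);
        message.writeTo(out);
        out.flush();
        int written = message.getSerializedSize();

        // Read it back from the same slice on the receiving side.
        CodedInputStream in = CodedInputStream.newInstance(frame, offset, written);
        StringValue parsed = StringValue.parseFrom(in);
        System.out.println(parsed.getValue());
    }
}
```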

VachaShah commented 1 year ago

It would be worth seeing which actions can benefit the most from protobuf, and as Sarat mentioned, the two protocols can coexist with some effort, so we can make the upgrade scenarios work.

austintlee commented 1 year ago

Is the current effort on this RFC being done on some feature branch that I can follow and take a look at?

VachaShah commented 1 year ago

Hi @austintlee, currently there is a draft PR (#9097) with the POC, and the changes from that draft PR will be PRed out incrementally.

ketanv3 commented 1 year ago

Great proposal! Do we know the reason behind this improvement? Is it due to reduced CPU time during serialization, or reduced network time due to a difference in payload size?

macohen commented 1 year ago

Is there already a commitment to protobuf without considering the alternatives mentioned by @reta?

The ones I have seen quite often in place of protobuf, if it could help, are:

Avro/Thrift have been around for quite some time. I'd like to see a comparison of some of these.

dblock commented 1 year ago

@macohen I think the idea is that we 1) make the transport protocol pluggable, 2) ship protobuf support as an experimental feature with an upgrade path, 3) offer other protocols, 4) make the best one the default.

saratvemulapalli commented 11 months ago

Great proposal! Do we know the reason behind this improvement? Is it due to reduced CPU time during serialization, or reduced network time due to a difference in payload size?

We believe most of the benefit comes from the more compact data encoding; we couldn't really analyze flame graphs due to multiple threads. Protobuf ended up streaming fewer bytes for the same payload and is efficient at serializing/deserializing data.
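
As a toy illustration of the "fewer bytes" point (not a benchmark; Int64Value here is just a stand-in for a single numeric response field), compare the encoded size of the same value as protobuf and as a JSON fragment:

```java
import com.google.protobuf.Int64Value;
import java.nio.charset.StandardCharsets;

public class PayloadSizeSketch {
    public static void main(String[] args) {
        long docCount = 1_234_567L;

        // Protobuf: a one-byte field tag plus a varint, a handful of bytes total.
        int protoBytes = Int64Value.newBuilder().setValue(docCount).build().getSerializedSize();

        // JSON: the field name, quotes, colon, and decimal digits all go on the wire.
        int jsonBytes = ("\"doc_count\":" + docCount).getBytes(StandardCharsets.UTF_8).length;

        System.out.println("protobuf: " + protoBytes + " bytes, JSON fragment: " + jsonBytes + " bytes");
    }
}
```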

bbarani commented 9 months ago

@saratvemulapalli @VachaShah can you please confirm whether this change can be included in 2.x without breaking the existing API? Basically, can this change be added in a backward-compatible manner in the 2.x line?

We are evaluating whether this change requires a 3.0 release or can be included in the 2.x line, so we need your input.

saratvemulapalli commented 9 months ago

@bbarani we do not know yet. Once the changes are pushed to 3.x as an experimental feature, we will look at backward compatibility. In theory we do see ways the native transport protocol can move to protobuf, but there are a few unknowns with mixed-cluster scenarios that we have to dive into.

For now I would rather say it would be a breaking change.

rursprung commented 6 months ago

This sounds very interesting! What I didn't quite gather from this issue is whether you plan on using this only on the transport layer, or whether you also see it as an option for the public API in the future (e.g., I'm not sure whether #10684 is only for the distribution of search requests to the nodes, or also for consumers who want to trigger the search request). Especially if this is then hidden behind clients like opensearch-java, that would be nice, as you'd get the performance improvement fully transparently :)

(Just talking as a nerd here; I currently don't have performance problems caused by the REST/JSON overhead.)