Support gRPC - Githubissues

diegolovison commented 1 year ago

gRPC is a modern open source high performance Remote Procedure Call (RPC) framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in last mile of distributed computing to connect devices, mobile applications and browsers to backend services.

franz1981 commented 1 year ago

Hi @diegolovison I've created https://github.com/franz1981/Hyperfoil/tree/grpc to start working on it.

The plans (right now, but we can discuss about it, given that I'm in an early PoC stage) are:

use https://www.baeldung.com/java-convert-json-protobuf#:~:text=We%20can%20convert%20JSON%20to,parse%20JSON%20to%20protobuf%20message. something like this to convert from the json (maybe with placeholders?) in the yaml hyperfoil config to the actual protobuf data
use https://vertx.io/docs/vertx-grpc/java/#_message_level_api_2 to send such as blobs (which means using Vertx client for it)
expose the hyperfoil event loops to allow vertx to configure it for the clients
implement all of this as a Hyperfoil plugin, similar to hotord: the reason is that semantically, despite gRPC uses HTTP 2, from a user perspective is a different thing with different testing scenarios (single-request, streams, duplex, see https://ghz.sh/)

Pros and cons of choosing Vertx:

pros: it supports all the mechanisms we need (it seems) and it's (Netty) event loop friendly and it's high-performance. If it prove to not be high-performance enough, than we will fix it (including Netty), which is a win-win for Red Hat products and community-wise
cons: it is GC intensive for the response (which will be fully decoded) if used as it is, but the reality is that even using Netty HTTP 2 is kind of GC intensive in few parts....

jesperpedersen commented 1 year ago

I say, start with vert.x and a plugin style integration

diegolovison commented 1 year ago

I believe before starting coding we could define the scenario for the plugin. Example:

name: grpc-example
grpc:
- uri: !concat [ "https://localhost:", !param PORT 8080 ]
usersPerSec: 1
duration: 5s
scenario:
- example:
  - proto |
    message Person {
      optional int32 id = 2;
      optional string name = 1;
      optional string email = 3;
    } 
  - randomInt: id <- ...
  - randomString: name <- ...
  - randomString: email <- ...
  - grpcRequest:
      type: serverStreamingRpc

franz1981 commented 1 year ago

@diegolovison given that I got just half day before going in PTO for another week, today I'll use that branch to write a quick hyperfoil extension, a nop one, just to get used of what it means, and how much control we have on using the Hyperfoil event loop. Feel free to propose the shape of data you know could be useful for users (if you have way to ask them, or want them to invite on this conversation, I'll be happy!). Re the proto definition, I loved the GHZ definition using json, but I am use they accept binary blogs and proto as well (according to the doc), so I will start with one of these, first. In term of testing scenario I expect the single request to be the first one to be implemented, too, to speed up experimentation with this.

franz1981 commented 1 year ago

Just an additional note: while reading PrepareHttpRequestStep/SendHttpRequestStep and comparing vs HotRodRequestStep is now clear why http implement its own HttpRequestContext: given that http requires first to have a connection available before sending a request, the preparation and actual sending of Http message need to track when there isn't any connection available to pick up the request (which happen in the PreparedHttpRequestStep). The API for HotRod, instead doesn't have that level of control and just allow to issue a request (an operation which always succeed, just we don't know when it's going to start and if a delay depends by a lack of some resource eg an available physical connection).

I see that adopting an higher level API like the one from VertX seems to be equally uninformative, but looking at the MessageLevel API twice...

Future<GrpcClientRequest<Buffer, Buffer>> requestFut = client.request(server);

requestFut.onSuccess(request -> {
  // ----------------------------------------------------> here we're ready to prepare/setup the request
  // Set the service name and the method to call
  request.serviceName(ServiceName.create("helloworld", "Greeter"));
  request.methodName("SayHello");

  // Send the protobuf request
  request.end(protoHello);

  // Handle the response
  Future<GrpcClientResponse<Buffer, Buffer>> responseFut = request.response();
  responseFut.onSuccess(response -> {
    // --------------------------------------------------> Here the response has already arrived with success
    response.handler(protoReply -> {
      // Handle the protobuf reply
      // ------------------------------------------------> Here the response is completed? 
    });
  });
});

In short, although we could track when a request is blocked and for how long before being being sent eg the blocked time before getting an available connection.

This won't prevent to flood the server with pending requests in case the configured send rate is just too high (assuming the Sessions to be enough) nor grant if adding more connections would help, differently from Http, but at least is more informative than HotRod.

NOTE to investigate

It would be great, similarly to HTTP to separate the availability to send any data, in order to avoid enqueuing a gRPC request while none is ready to send it and maybe the vertx API have some mechanism (checking its write queue) to advertize when there's room to send new data: will keep the same level of information that enqueuing the request, but would save issuing any till we're ready to go.

Another point of investigation is the threading model: we execute steps from the hyperfoil event loops and, ideally, we would like to issue requests from the "right" event loop thread already: we would like each Session here to have its own thread local gRPC vetrx client instances which interact just with the right "partition" of thread local (to the I/O threads) connections established by Vertx for gRPC.

vietj commented 1 year ago

As far as I know @franz1981 there are a few mechanisms that control the client requests for HTTP/2 which are

the maximum number of concurrent streams (can be obtained from HttpConnection#settings()
the connection window size (which currently is not exposed I think)
the TCP channel writability (not exposed too)

A few notes:

the maximum number of concurrent streams is defined by the server
the connection window size can be increased beyond the initial value to increase the message transfer bandwidth

diegolovison commented 1 year ago

What maven plugin can I use to compile the proto to java?

syntax = "proto3";

option java_multiple_files = true;
option java_package = "io.grpc.examples.helloworld";
option java_outer_classname = "HelloWorldProto";
option objc_class_prefix = "HLW";

package helloworld;

// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings
message HelloReply {
  string message = 1;
}

franz1981 commented 11 months ago

I've proceeded into the investigation to avoid any code generation and:

by using https://github.com/square/protoparser is possible to extract the relevant information directly from the proto definition and populate any data, but we need a complete mapping of all the gRPC supported types (including nesting etc etc) and there isn't anything implemented around AND (which is a pain point), https://github.com/square/protoparser seems to not be actively maintained from long time
by using https://github.com/square/wire/ in a way similar to what EnterpriseDB does for https://github.com/EnterpriseDB/apicurio-registry/tree/970a2ae8fcb35beef7e484c4f5475861e921696f/utils/protobuf-schema-utilities/src/main/java/io/apicurio/registry/utils/protobuf/schema but we likely need to modify few things to make it work

In both cases a relevant effort is required before actually be able to integrate with the Vertx message-level API, which seems the easier part, at this point.

I'm working on 2 PoC using both, but I would like to (re)use what Apicuro registry is doing, given that is actively maintained by Red Hat.

franz1981 commented 11 months ago

@vietj @diegolovison To avoid maintaining the parsing of all the protobuf types and the mapping with Java types I've used apicuro registry utils and produced this simple PoC: https://github.com/franz1981/Hyperfoil/commit/a439d5488d320a129929c78049237edaf36325df

It allow to parse the services and other relevant data which can be used to consume the vertx service.

Obviously it is not optimized and it's using few intermediate classes to encode/decode (including Gson and Vertx Json types altogether!), which is something we would like to avoid, but:

having a protobuf parsed descriptor allow us to perform it ourselves and skip using DynamicMessage or JsonFormat-like classes and build our own
it's using projects which are actively maintained eg https://github.com/square/wire/

At worse, it means we will need to provide patches to improve performance or the encoding/decoding additional bits, if required.

diegolovison commented 11 months ago

Is the goal to inform a proto and then requests in JSON format?

franz1981 commented 11 months ago

The proto part, yes, if possible, to grant a proper encoding/decoding of messages (think about a req/res chain where the next req depends on some condition happening on the Res outcome), while the Json input is just to match what GHz can do (and is natively supported by protobuf, although with ugly performance, in case we need to prepare it at runtime instead of precomputing), but I am opened to other formats which match other types used in hyperfoil. Afaik your team is supposed to be the main and first user of this feature, so correct me if I am wrong, please

franz1981 commented 11 months ago

After speaking with @carlesarnal I think we're on the right track here: in apicuro registry

https://github.com/Apicurio/apicurio-registry/blob/3f8c7b80d4def0c524e4e2640aa617a4ed14f702/serdes/protobuf-serde/src/main/java/io/apicurio/registry/serde/protobuf/ProtobufSchemaParser.java#L61

( and https://github.com/Apicurio/apicurio-registry/blob/3f8c7b80d4def0c524e4e2640aa617a4ed14f702/serdes/protobuf-serde/src/main/java/io/apicurio/registry/serde/protobuf/ProtobufKafkaDeserializer.java#L182)

is responsible of the decoding path, but they have a slightly different use case, because they decode on the fly the proto file amd search just the first message description into it, without parsing the other parts eg service/reply etc etc, which is something I expect we will do off-line instead, before the actual load generation part start, ie:

read the benchmark definition
parse the provided protos caching all the necessary descriptions to correlate the service with request reply and eventually generators ie something which generate the requests (which can be fixed or dependent by other transformations)
run the load and reuse the existing bits to encode requests and decode replies (or NOT decode them? yet to decide) if there is no predicate on the type of reply

The encoding part of apicuro registry instead is on https://github.com/Apicurio/apicurio-registry/blob/3f8c7b80d4def0c524e4e2640aa617a4ed14f702/serdes/protobuf-serde/src/main/java/io/apicurio/registry/serde/protobuf/ProtobufKafkaSerializer.java#L126

franz1981 commented 11 months ago

I'm searching what alternatives we have to skip using an intermediate Json parsing (unnecessary) and found

https://github.com/google/gson/blob/main/proto/src/main/java/com/google/gson/protobuf/ProtoTypeAdapter.java#L265

but it seems too naive and simple. especially if compared to:

https://github.com/protocolbuffers/protobuf/blob/main/java/util/src/main/java/com/google/protobuf/util/JsonFormat.java#L1455

which, instead, is the entry point for the same functionality :/ For now, if we stick with predefined encoded messages, the problem won't exist (but just a slower startup, really), but if we introduce requests which fields requires to be populated at runtime, it will be troublesome and will likely require us to implement the JsonParser approach ourself and/or contribute to expose the existing one in protobuf-java.

franz1981 commented 11 months ago

I've proceeded and written a test using the apicurio registry utils to parse a graph of proto files with different packages and imports and...:/ it seems it hasn't worked fine, eg:

producerId.proto:

syntax = "proto3";
package mypackage0;

message ProducerId {
  string name = 1;
  string version = 2;
}

producer.proto:

syntax = "proto3";
import "mypackage0/producerId.proto";
package mypackage1;
message Producer {
  mypackage0.ProducerId id = 1;
  string name = 2;
}

franz1981 commented 11 months ago

I've opened a discussion about this at https://github.com/Apicurio/apicurio-registry/discussions/3819, because right now it seems I cannot have the same feature set of https://ghz.sh/docs/example_config without modifying the existing apicurio-registry-protobuf-schema-utilities module.

In order to proceed on the hyperfoil front, I'll give up on proto deps imports, and start building the plugin part.

franz1981 commented 10 months ago

The current status of the gRPC branch is:

first hello world test with uni type just work, requires streaming ones and bi
no statistics yet implemented
requires adding logic to handle reply outcomes
security and compression still missing
http 2 concurrency still missing: for footprint reasons we create a single HttpClient instance for each authority, leveraging on Vertx connection pool, which we don't have much control: we could create a separate HttpClient instance for each connection, which will make easier for us to both pick the connection we want, as we do for http, and track, per connection, the used streams/concurrency level; if we leave the current implementation as it is we would just use a single "available connections" counter with much less control to distribute load across connections, given that the Vertx pool seems to work in LIFO, while our http one, in FIFO.
requires optimizing message memory usage as well as encoding (see https://codeburst.io/using-dynamic-messages-in-protocol-buffers-in-scala-9fda4f0efcb3, why) - we need to pool encoding buffers and/or allow to cache/precompute the request blobs "somehow"
requires optimizing memory footprint of configuration data (contains proto desc blobs in, which shouldn't be needed at runtime)

Currently GHz doesn't allow to change request data on the fly and is using dynamic messages as well (the medium article report them to not be brilliant, perf wise, but I would add jmh module in Hyperfoil to verify it), which is the reason why everything is constant and precomputed, apart from the metadata, which are simple key/value data and can change, given that doesn't depends by the proto definition. The easier way is to have the same limitations in Hyperfoil and always force precomputing requests buffers and just perform a perfect matching for replies, extracting metadata and/or status (for success/failure detection), but avoiding fine-grain decoding, unless configured (a debug mode feature for troubleshooting, not for actual benchmarking).

In case this limitation is too annoying for users, I would explore using java compiler to generate code during the benchmark parsing, but is really the last resort (because I will need to generate the Json to builder encoding as well!).

franz1981 commented 10 months ago

Adding https://github.com/eclipse-vertx/vert.x/pull/4933: this is related to

http 2 concurrency still missing: for footprint reasons we create a single HttpClient instance for each authority, leveraging on Vertx connection pool, which we don't have much control: we could create a separate HttpClient instance for each connection, which will make easier for us to both pick the connection we want, as we do for http, and track, per connection, the used streams/concurrency level; if we leave the current implementation as it is we would just use a single "available connections" counter with much less control to distribute load across connections, given that the Vertx pool seems to work in LIFO, while our http one, in FIFO.

franz1981 commented 7 months ago

I'm adding this https://github.com/franz1981/modelmesh/blob/80c13b88de479651d298911dd276ee7d4638d791/src/main/java/com/ibm/watson/modelmesh/GrpcSupport.java to the list of ideas on how to improve performance in case of mutable and different models

Hyperfoil / Hyperfoil

Support gRPC #281