grpc / grpc-node

gRPC for Node.js

Failed to parse server response when I try to read a stream #1022

paambaati opened this issue 5 years ago

paambaati commented 5 years ago

Problem description

I'm using static bindings generated from https://github.com/liftbridge-io/liftbridge-grpc/blob/0877a0cacb7a721e2e19280cff126dc2ed51912d/api.proto to build a Liftbridge Node.js client that creates a stream, publishes some messages, and subscribes to them (Liftbridge is a kind of lightweight Kafka). I consistently run into Error: 13 INTERNAL: Failed to parse server response whenever I try to consume the Subscribe stream.
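For context, here is a minimal sketch of the consuming side. The APIClient, SubscribeRequest, and subscribe names are assumptions based on the generated bindings for api.proto, and the address is a placeholder, so treat this as an approximation of the setup rather than a copy of the debug script:

const grpc = require('grpc');
const services = require('./grpc/generated/api_grpc_pb');
const messages = require('./grpc/generated/api_pb');

// Assumed client name and address; the actual debug script configures its own connection.
const client = new services.APIClient('localhost:9292', grpc.credentials.createInsecure());

const request = new messages.SubscribeRequest();
request.setSubject('test9');

const stream = client.subscribe(request);
stream.on('data', (message) => {
  // Each server-pushed Message lands here once it deserializes successfully.
  console.log(message.toObject());
});
stream.on('error', (err) => {
  // This is where "13 INTERNAL: Failed to parse server response" surfaces.
  console.error(err);
});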

Reproduction steps

  1. docker run -p 4222:4222 -ti nats:latest --debug --trace in a window.
  2. go get github.com/liftbridge-io/go-liftbridge and then $GOPATH/bin/liftbridge --raft-bootstrap-seed --nats-servers nats://localhost:4222 --level debug in another window.
  3. Clone my repo from https://github.com/paambaati/node-liftbridge.git in yet another window.
    1. yarn install or npm install
    2. yarn run debug or npm run debug to run the debug script that reproduces this issue.

The debug script creates a new stream, then publishes a few messages, then subscribes to the same stream (subject), and then publishes a few more messages.

Expected output — Each published message should be printed to console (see relevant lines).

Actual output —

Error: 13 INTERNAL: Failed to parse server response
    at Object.exports.createStatusError (~/node-liftbridge/node_modules/grpc/src/common.js:91:15)
    at ClientReadableStream._emitStatusIfDone (~/node-liftbridge/node_modules/grpc/src/client.js:233:26)
    at ClientReadableStream._receiveStatus (~/node-liftbridge/node_modules/grpc/src/client.js:211:8)
    at Object.onReceiveStatus (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:1272:15)
    at InterceptingListener._callNext (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:618:8)
    at ~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:1029:24 {
  code: 13,
  metadata: undefined,
  details: 'Failed to parse server response'
}

Environment

Additional context

  1. The static bindings were generated by a script - see the relevant command here. The bindings are checked in at grpc/generated.

  2. The Go version of the client I'm developing (see go-liftbridge) can in fact Subscribe to the messages Published by my Node.js debug script and print all the Messages correctly! I used the example subscribe program at https://github.com/liftbridge-io/go-liftbridge/blob/master/example/lift-sub/subscribe.go to verify this 😮

  3. I ran a full verbose trace of my debug script, and here's the output — https://gist.github.com/paambaati/7884b119eee47fafa436f74db8b59edc. I've padded the debug script's stdout with a lot of newlines so it's a little easier to separate it from the gRPC traces.

paambaati commented 5 years ago

FYI, adding [jstype=JS_STRING] to the int64 fields has no effect.

murgatroid99 commented 5 years ago

That error message essentially indicates that the incoming binary data could not be parsed as a valid message of the expected response message type. Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

paambaati commented 5 years ago

> Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

@murgatroid99 I am, yes. And FWIW, the other implementations of the same .proto file (https://github.com/liftbridge-io/go-liftbridge in Go and https://github.com/dgzlopes/python-liftbridge in Python) work fine.

What other angles could we look at this from?

murgatroid99 commented 5 years ago

The next useful step is to add more logging. Those errors are handled here, here, and here, and logging the original error messages will give more information (only one of those will be triggering this specific issue, but I'm not sure which one). I can add some logging in an upcoming release, and in the meantime you could add logging in your local copy of the library to get the information faster.
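A low-touch alternative to editing the library is to wrap the generated deserializer from the outside, since the generated api_grpc_pb.js looks it up at call time. This is a rough sketch, not something the grpc package provides, and the paths/names match this repo's generated bindings:

const messages = require('./grpc/generated/api_pb');

// Wrap Message.deserializeBinary so the original protobuf error and the raw
// bytes get logged before grpc-node converts the failure into
// "Failed to parse server response".
const originalDeserialize = messages.Message.deserializeBinary;
messages.Message.deserializeBinary = function wrappedDeserialize(bytes) {
  try {
    return originalDeserialize.call(this, bytes);
  } catch (err) {
    console.error('Message.deserializeBinary failed:', err);
    console.error('raw response bytes (hex):', Buffer.from(bytes).toString('hex'));
    throw err;
  }
};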

paambaati commented 5 years ago

@murgatroid99 So the error is from here.

When I stringify response.read, I get —

KEY-ef33408c3ce1ecbfb2e4!VALUE-ok-KEY-ef33408c3ce1ecbfb2e4 �����;�*test9:
subjecttest9:
reply

Dump of the buffer direct to a file — output.txt
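(For reference, the dump is just the unparsed bytes written straight to disk; responseBuffer below is a hypothetical name for whatever holds them.)

const fs = require('fs');
fs.writeFileSync('output.txt', Buffer.from(responseBuffer));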

Screenshot of the file opened in SublimeText (image attached to the original issue).

KEY-ef33408c3ce1ecbfb2e4 is Message.key, and VALUE-ok-KEY-ef33408c3ce1ecbfb2e4 is Message.value.

The Go client can Subscribe and read the Message without any problems.

paambaati commented 5 years ago

@murgatroid99 There's more weird stuff happening.

When I Publish this Message

{
  offset: 0,
  key: 'S0VZLWFlZmNkYWFhNThhMDlhNmViYWYy', // b64 version of 'KEY-aefcdaaa58a09a6ebaf2'
  value: 'VkFMVUUtb2s=', // b64 version of 'VALUE-ok'
  timestamp: 1568100468657000000,
  subject: 'test11',
  reply: '',
  headersMap: [],
  ackinbox: 'ack.test11',
  correlationid: 'c34c6cfa-a5c6-410c-8fb4-459b6f202970',
  ackpolicy: 0
}

this is the Message I see on the Go Subscribe side —

key:"KEY-aefcdaaa58a09a6ebaf2" value:"VALUE-ok" timestamp:1568100468661514000 subject:"test11" headers:<key:"reply" value:"" > headers:<key:"subject" value:"test11" >

You'll notice that the ackinbox and correlationid properties are also not set on the payload.

paambaati commented 4 years ago

@murgatroid99 By printing the deserialization error trace, I was able to narrow the failure down to the headers map.

Full stack trace —

Error [AssertionError]: Assertion failed
    at new goog.asserts.AssertionError (~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:79:876)
    at Object.goog.asserts.doAssertFailure_ (~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:80:257)
    at Object.goog.asserts.assert [as assert] (~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:81:83)
    at Function.jspb.Map.deserializeBinary (~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:275:277)
    at ~/node-liftbridge/grpc/generated/api_pb.js:2259:18
    at jspb.BinaryReader.readMessage (~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:249:329)
    at Function.proto.proto.Message.deserializeBinaryFromReader (~/node-liftbridge/grpc/generated/api_pb.js:2258:14)
    at Function.proto.proto.Message.deserializeBinary (~/node-liftbridge/grpc/generated/api_pb.js:2214:30)
    at deserialize_proto_Message (~/node-liftbridge/grpc/generated/api_grpc_pb.js:59:25)
    at ~/node-liftbridge/node_modules/grpc/src/common.js:38:12
    at ~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:689:22 {
  message: 'Assertion failed',
  reportErrorToServer: true,
  messagePattern: 'Assertion failed'
}

Relevant line from generated _pb file — https://github.com/paambaati/node-liftbridge/blob/dd64cf136c265bca798c6bcff4121a5398fafc99/grpc/generated/api_pb.js#L2259

Relevant proto definition of headers map — https://github.com/liftbridge-io/liftbridge-grpc/blob/0877a0cacb7a721e2e19280cff126dc2ed51912d/api.proto#L100

I'm not sure where to go from here. Any help would be appreciated.

Some observations

  1. After the headers field, no other field is set.
  2. The generated _pb file only exposes Message.getHeadersMap() and Message.clearHeadersMap(); is there no addHeadersMap() or setHeadersMap()? Is this by design? How does one set headers?
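On the second observation: this is how protoc's JS output works in general. Map fields are exposed only through get<Field>Map(), which returns a jspb.Map, and entries are added or removed on that map object, so there is no setHeadersMap() or addHeadersMap(). A short sketch against this repo's generated bindings (the empty-string value is just an example):

const { Message } = require('./grpc/generated/api_pb');

const msg = new Message();
const headers = msg.getHeadersMap(); // jspb.Map; there is no setter or adder
headers.set('reply', '');            // add or overwrite an entry
headers.del('reply');                // remove a single entry
msg.getHeadersMap().clear();         // remove all entries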
paambaati commented 4 years ago

I was finally able to trace it to the root cause! wipes sweat from forehead

The problem occurs when a headers map entry has an undefined value. The Go client reads it as an empty string (headers:<key:"reply" value:"" >), while grpc-node (or protobuf?) reads it as undefined, and hence the assertion error.
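For illustration (the field number below is an assumption, not checked against api.proto): in proto3, a map value equal to the default may be omitted from the wire entirely, which is what Go silently fills back in as "" and what the jspb assertion trips over.

// Hypothetical wire bytes for headers["reply"] = "" with the value sub-field
// omitted, assuming headers were field 9 of Message.
const entryWithAbsentValue = Uint8Array.from([
  0x4a, 0x07,                   // field 9 (headers), wire type 2 (length-delimited), 7 bytes
  0x0a, 0x05,                   // map entry sub-field 1 (key), length 5
  0x72, 0x65, 0x70, 0x6c, 0x79, // "reply"
  // sub-field 2 (value) is absent; decoders are expected to default it to ""
]);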

@murgatroid99 Is this a bug, or is this something that needs to be fixed/reconciled on the server-side? I can gladly provide more information/repros for this.

murgatroid99 commented 4 years ago

OK, it looks like there are multiple issues here.

The fact that the messages you see in your subscribe responses differ from the ones you are publishing looks like a server bug, or at least unexpected server behavior. Since the Go client is seeing it too, it's very unlikely to be a client bug.

The failure to process a map entry with an absent value looks like a protobuf bug. You should file an issue with the official protobuf repository. We just republish protoc. You should also go there with questions about the generated _pb.js files.

paambaati commented 4 years ago

@murgatroid99 Thanks for the pointers, I've further narrowed down the issue to a missing argument in the generated _pb.js file for deserializing map fields - see protocolbuffers/protobuf-javascript#43.

If you're familiar with the protobuf codebase, would you mind helping me find the codemod that generates the _pb.js file? I'd love to send a PR to fix this.

murgatroid99 commented 4 years ago

Sorry, I'm not that familiar with the protobuf codebase.

paambaati commented 4 years ago

https://github.com/protobufjs/protobuf.js/pull/1348 should fix this.

Josh-Rosen commented 1 year ago

> That error message essentially indicates that the incoming binary data could not be parsed as a valid message of the expected response message type. Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

This response helped me.

I was using the BloomRPC tool as a UI to send gRPC requests to my server. I had updated my protos but hadn't updated the client proto in the BloomRPC app. After updating it, my requests are working again.