scalecube / scalecube-services

Microservices library - scalecube-services is a high throughput, low latency reactive microservices library built to scale. it features: API-Gateways, service-discovery, service-load-balancing, the architecture supports plug-and-play service communication modules and features. built to provide performance and low-latency real-time stream-processing
http://scalecube.github.io/
Apache License 2.0
610 stars 105 forks source link

Use an unpooled buffer for setup payload #816

Closed Sm0keySa1m0n closed 2 years ago

Sm0keySa1m0n commented 3 years ago

This fixes data corruption issues when reusing event loops for multiple service calls.

A demo of the bug can be found here: https://gist.github.com/Sm0keySa1m0n/f3396a8aa3d5e093d24571ffd5da9c69

You'll notice the byte read from the setup payload is not correct in the second service call. This was causing ClosedChannelExceptions on the client because the server was silently closing the connection after a StreamCorruptionException was thrown when trying to decode the connection setup payload. Therefore this PR will probably fix https://github.com/scalecube/scalecube-services/issues/743 and https://github.com/scalecube/scalecube-services/issues/697.

segabriel commented 3 years ago

It's not reproducible by the provided gist on my local environment (the last master commit + macOS). Could you provide versions of scalecube, netty, and rsocket for the provided example and OS? Also, it doesn't look like a solution. Need to clarify the reason and a possible spot where need to create an issue (netty, rsocket).

Sm0keySa1m0n commented 3 years ago

I'm using Java 15, Windows 10 and these dependencies:

implementation group: 'io.scalecube', name: 'scalecube-transport-netty', version: '2.6.9'
implementation group: 'io.scalecube', name: 'scalecube-services-transport-rsocket', version: '2.10.18'
implementation group: 'io.scalecube', name: 'scalecube-services-discovery', version: '2.10.18' 
implementation group: 'io.scalecube', name: 'scalecube-services', version: '2.10.18'
implementation group: 'io.scalecube', name: 'scalecube-services-bytebuf-codec', version: '2.10.18'

implementation group: 'io.rsocket', name: 'rsocket-core', version: '1.1.0'
implementation group: 'io.rsocket', name: 'rsocket-transport-netty', version: '1.1.0'
Sm0keySa1m0n commented 3 years ago

I'm going to assume it's an issue with rsocket version 1.1.0 because I've just noticed that ScaleCube is still on 1.0.4.

Sm0keySa1m0n commented 3 years ago

I have updated my example to remove ScaleCube completely, it's now only using RSocket 1.1.0. https://gist.github.com/Sm0keySa1m0n/bb33abcb32adf0ae9c26d6f640420905

segabriel commented 3 years ago

@Sm0keySa1m0n Finally, when I tried to use the old version of the rsocket 1.1.0 I see the difference. So my suggestion is to use the latest rsocket version 1.0.4 (https://github.com/rsocket/rsocket-java/releases, 1.0.4 > 1.1.0), due to it's compatible with the last reactor-core and netty versions that scalecube uses (https://projectreactor.io/docs, 2020.0.6). Or wait when the rsocket version 1.1.1 will be released.

Sm0keySa1m0n commented 3 years ago

That's a strange versioning scheme... I was under the impression that 1.1.0 was newer than 1.0.4.

Sm0keySa1m0n commented 3 years ago

I can confirm that switching to RSocket 1.0.4 fixes the problem, I'll leave this PR open just in case you want to keep it for supporting the potential use of 1.1.0.