Closed Sm0keySa1m0n closed 2 years ago
It's not reproducible by the provided gist on my local environment (the last master commit + macOS). Could you provide versions of scalecube, netty, and rsocket for the provided example and OS? Also, it doesn't look like a solution. Need to clarify the reason and a possible spot where need to create an issue (netty, rsocket).
I'm using Java 15, Windows 10 and these dependencies:
implementation group: 'io.scalecube', name: 'scalecube-transport-netty', version: '2.6.9'
implementation group: 'io.scalecube', name: 'scalecube-services-transport-rsocket', version: '2.10.18'
implementation group: 'io.scalecube', name: 'scalecube-services-discovery', version: '2.10.18'
implementation group: 'io.scalecube', name: 'scalecube-services', version: '2.10.18'
implementation group: 'io.scalecube', name: 'scalecube-services-bytebuf-codec', version: '2.10.18'
implementation group: 'io.rsocket', name: 'rsocket-core', version: '1.1.0'
implementation group: 'io.rsocket', name: 'rsocket-transport-netty', version: '1.1.0'
I'm going to assume it's an issue with rsocket version 1.1.0 because I've just noticed that ScaleCube is still on 1.0.4.
I have updated my example to remove ScaleCube completely, it's now only using RSocket 1.1.0. https://gist.github.com/Sm0keySa1m0n/bb33abcb32adf0ae9c26d6f640420905
@Sm0keySa1m0n Finally, when I tried to use the old version of the rsocket 1.1.0 I see the difference. So my suggestion is to use the latest rsocket version 1.0.4 (https://github.com/rsocket/rsocket-java/releases, 1.0.4 > 1.1.0), due to it's compatible with the last reactor-core and netty versions that scalecube uses (https://projectreactor.io/docs, 2020.0.6). Or wait when the rsocket version 1.1.1 will be released.
That's a strange versioning scheme... I was under the impression that 1.1.0 was newer than 1.0.4.
I can confirm that switching to RSocket 1.0.4 fixes the problem, I'll leave this PR open just in case you want to keep it for supporting the potential use of 1.1.0.
This fixes data corruption issues when reusing event loops for multiple service calls.
A demo of the bug can be found here: https://gist.github.com/Sm0keySa1m0n/f3396a8aa3d5e093d24571ffd5da9c69
You'll notice the byte read from the setup payload is not correct in the second service call. This was causing
ClosedChannelException
s on the client because the server was silently closing the connection after aStreamCorruptionException
was thrown when trying to decode the connection setup payload. Therefore this PR will probably fix https://github.com/scalecube/scalecube-services/issues/743 and https://github.com/scalecube/scalecube-services/issues/697.