@greg007r Instead of wss://stream.binance.com:9443/ws, can you try with wss://echo.websocket.org or a similar server that just echoes the incoming data, e.g. https://github.com/reactor/reactor-netty/blob/7608571fb944a173481828b077d25776e08dd0eb/reactor-netty-examples/src/main/java/reactor/netty/examples/documentation/http/server/routing/Application.java#L34-L35
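For reference, a minimal local echo server in the spirit of that example (port and path are arbitrary):

DisposableServer server =
        HttpServer.create()
                  .port(8080)
                  .route(routes -> routes.ws("/echo",
                          // echoes every received frame back to the client
                          (in, out) -> out.send(in.receive().retain())))
                  .bindNow();
server.onDispose().block();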
Also, can you take a TCP dump (for example with Wireshark) in order to see which peer closes the connection?
@greg007r According to the Binance WebSocket API docs, you have to send a PONG message back for every incoming PING. If that is not done, Binance closes the connection prematurely.
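With plain Reactor Netty, replying manually would look roughly like this (a sketch; handlePing(true) disables the automatic handling so that PING frames reach the handler, and the frame classes come from io.netty.handler.codec.http.websocketx):

HttpClient.create()
        .websocket(WebsocketClientSpec.builder().handlePing(true).build())
        .uri("wss://stream.binance.com:9443/ws")
        .handle((in, out) -> out.sendObject(
                in.receiveFrames()
                  // answer each incoming PING with a PONG carrying the same payload
                  .filter(f -> f instanceof PingWebSocketFrame)
                  .map(f -> new PongWebSocketFrame(f.content().retain()))))
        .blockLast();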
Also, it feels like this is a question that would be better suited to Gitter or Stack Overflow. As mentioned in the README, we prefer to use GitHub issues only for bugs and enhancements. Feel free to update this issue with a link to the re-posted question (so that other people can find it), or add some more details if you feel this is a genuine bug.
Thank you for your feedback. I already investigated the PING/PONG frames; they should be managed transparently by the ReactorNettyWebSocketClient, since I have wsclient.setHandlePing(Boolean.FALSE) configured.
So, as far as I know, I should not have to take care of this myself, and I don't think it is the root cause. If I choose another service with a smaller payload over time, the connection is kept open for 10 minutes instead of 3.
I will give the echo WebSocket a try, but it does not deliver messages continuously the way the price ticker does, without any further message sent by the client.
@greg007r I would suggest doing a TCP dump to see who is closing the connection.
Also, I will try to replace EmitterProcessor with DirectProcessor, just to ensure that the problem is not related to backpressure.
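i.e. a one-line swap (sketch):

// DirectProcessor does no buffering of its own, so if the disconnect persists,
// EmitterProcessor's backpressure buffering can be ruled out as the cause
DirectProcessor<String> output = DirectProcessor.create();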
I replaced EmitterProcessor with DirectProcessor as explained and did a Wireshark TCP dump. I don't understand why I'm not facing the same issue using the JavaScript WebSocket or the plain Java 14 java.net.http.WebSocket; those two WebSocket client implementations run perfectly for hours ...
Also, I adapted the way I initialize the ReactorNettyWebSocketClient, using the new WebsocketClientSpec builder, to be sure the ping handling is properly initialized:
WebsocketClientSpec.Builder builder = WebsocketClientSpec.builder()
        .handlePing(false)
        .maxFramePayloadLength(Integer.MAX_VALUE);
ReactorNettyWebSocketClient wsclient = new ReactorNettyWebSocketClient(httpClient, () -> builder);
@greg007r Can you try the code below? I tried to write the example with Reactor Netty only.
public static void main(String[] args) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(1);
    EmitterProcessor<String> output = EmitterProcessor.create();
    Mono<Void> execMono =
            HttpClient.create()
                      .websocket(WebsocketClientSpec.builder().maxFramePayloadLength(Integer.MAX_VALUE).build())
                      .uri(URI.create("wss://stream.binance.com:9443/ws"))
                      .handle((in, out) ->
                              out.sendObject(Flux.just(new TextWebSocketFrame(
                                      "{\"method\": \"SUBSCRIBE\",\"params\":[\"!ticker@arr\"],\"id\": 1}")))
                                 .then(in.receive()
                                         .doOnCancel(() -> System.out.println("A cancelled"))
                                         .doOnComplete(() -> System.out.println("A completed"))
                                         .doOnTerminate(() -> System.out.println("A terminated"))
                                         .map(x -> "evt")
                                         .log("TRACE")
                                         .subscribeWith(output)
                                         .then()))
                      .then();

    output.doOnCancel(() -> System.out.println("B cancelled"))
          .doOnComplete(() -> System.out.println("B completed"))
          .doOnTerminate(() -> System.out.println("B terminated"))
          .doOnSubscribe(s -> execMono
                  .doOnCancel(() -> System.out.println("C cancelled"))
                  .doOnSuccess(x -> System.out.println("C success"))
                  .doOnTerminate(() -> System.out.println("C terminated"))
                  .subscribe())
          .subscribe();

    latch.await();
}
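Independently of the TCP dump, Netty's wiretap logging can also show which peer sends the Close frame; a sketch (output goes to the DEBUG log):

// wiretap(true) logs the raw inbound/outbound traffic of the connection
HttpClient client = HttpClient.create().wiretap(true);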
@violetagg I tested it, and the channel got closed after 2 min:
11:44:06.217 [reactor-http-epoll-2] DEBUG reactor.netty.resources.PooledConnectionProvider - [id:17cab690-1, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443] Channel closed, now: 0 active connections, 0 inactive connections and 0 pending acquire requests.
11:44:06.218 [reactor-http-epoll-2] DEBUG reactor.netty.ReactorNetty - [id:17cab690-1, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443] Non Removed handler: reactor.left.httpAggregator, context: null, pipeline: DefaultChannelPipeline{(reactor.left.sslHandler = io.netty.handler.ssl.SslHandler), (ws-decoder = io.netty.handler.codec.http.websocketx.WebSocket13FrameDecoder), (ws-encoder = io.netty.handler.codec.http.websocketx.WebSocket13FrameEncoder), (reactor.right.reactiveBridge = reactor.netty.channel.ChannelOperationsHandler)}
A completed
A terminated
11:44:06.218 [reactor-http-epoll-2] INFO TRACE - onComplete()
B completed
B terminated
C success
C terminated
11:44:06.219 [reactor-http-epoll-2] DEBUG reactor.netty.resources.DefaultPooledConnectionProvider - [id:17cab690, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443] onStateChange(ws{uri=/ws, connection=PooledConnection{channel=[id: 0x17cab690, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443]}}, [response_completed])
11:44:06.219 [reactor-http-epoll-2] DEBUG reactor.netty.resources.DefaultPooledConnectionProvider - [id:17cab690, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443] onStateChange(ws{uri=/ws, connection=PooledConnection{channel=[id: 0x17cab690, L:/192.168.1.5:50178 ! R:stream.binance.com/52.193.213.21:9443]}}, [disconnecting])
I also tested a third WebSocket implementation, in Python 3 with python-binance; it is also working fine for several hours ... Here is the sample code:
import asyncio
from binance import AsyncClient, BinanceSocketManager
from datetime import datetime

async def main():
    client = await AsyncClient.create()
    bm = BinanceSocketManager(client)
    # start any sockets here, i.e. a trade socket
    ts = bm.multiplex_socket(['!ticker@arr'])
    # then start receiving messages
    async with ts as tscm:
        while True:
            res = await tscm.recv()
            print("evt ", datetime.now())

    await client.close_connection()

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
I'm really disappointed, because I implemented my project in an async way end to end with WebFlux. As I have three different implementations (JS, native Java, Python) working for several hours, I'm more and more convinced that I'm facing a bug in Reactor Netty.
I am trying to isolate the issue as much as possible ... all the code samples I provided are easily replayable. I will also test with a different provider such as Jetty and then start debugging from the sources ... I have no other option.
@simonbasle Can you recommend the best way to avoid the EmitterProcessor here:

EmitterProcessor<String> output = EmitterProcessor.create();
Mono<Void> execMono = wsclient.execute(URI.create("wss://stream.binance.com:9443/ws"),
        session -> session.send(Flux.just(session.textMessage(
                        "{\"method\": \"SUBSCRIBE\",\"params\":[\"!ticker@arr\"],\"id\": 1}")))
                .thenMany(session.receive()
                        .doOnCancel(() -> System.out.println("A cancelled"))
                        .doOnComplete(() -> System.out.println("A completed"))
                        .doOnTerminate(() -> System.out.println("A terminated"))
                        .map(x -> "evt")
                        .log("TRACE")
                        .subscribeWith(output)
                        .then())
                .then());
@violetagg @greg007r I don't understand the intent behind this EmitterProcessor, other than complicating the code... @greg007r you have it subscribe inside execMono, then you subscribe to the processor, THEN when said processor is subscribed to you subscribe to execMono again in a doOnSubscribe?? This makes absolutely no sense to me, except for seeking complication for the sake of complexity...
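As an aside, EmitterProcessor is deprecated since Reactor 3.4; if the goal is to hand the received messages to another consumer, a Sinks-based sketch (names are illustrative) would be:

// Sinks replace the deprecated Processors in Reactor 3.4+
Sinks.Many<String> output = Sinks.many().multicast().onBackpressureBuffer();
output.asFlux().subscribe(evt -> System.out.println("received " + evt));
// inside the websocket handler, replace subscribeWith(output) with:
//     .doOnNext(output::tryEmitNext)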
My bad ... you are right. So now the issue is reproducible with this simple code:
public class SocketTest {

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        new ReactorNettyWebSocketClient(HttpClient.create(),
                () -> WebsocketClientSpec.builder()
                        .handlePing(false)
                        .maxFramePayloadLength(Integer.MAX_VALUE))
                .execute(URI.create("wss://stream.binance.com:9443/ws"),
                        session -> session.send(Flux.just(session.textMessage(
                                        "{\"method\": \"SUBSCRIBE\",\"params\":[\"!ticker@arr\"],\"id\": 1}")))
                                .thenMany(session.receive()
                                        .doOnCancel(() -> System.out.println("A cancelled"))
                                        .doOnComplete(() -> System.out.println("A completed"))
                                        .doOnTerminate(() -> System.out.println("A terminated"))
                                        .map(x -> "evt")
                                        .log("TRACE")
                                        .then())
                                .then())
                .subscribe();
        latch.await();
    }
}
@greg007r Can you also specify your Java version and vendor? Somehow I cannot reproduce this ... neither on macOS nor on Ubuntu.
@violetagg Do you mean that the WebSocket keeps running after 10 min for you? This is my Linux Mint setup:
Linux 4.15.0-153-generic #160-Ubuntu x86_64 x86_64 x86_64 GNU/Linux
openjdk version "14" 2020-03-17
OpenJDK Runtime Environment AdoptOpenJDK (build 14+36)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14+36, mixed mode, sharing)
@violetagg I recompiled and ran on JDK 11 ... the issue is still there.
@greg007r I need to find a way to reproduce this ...
Currently I cannot reproduce it on
Ubuntu: Linux 5.8.0-55-generic #62~20.04.1-Ubuntu SMP Wed Jun 2 08:55:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Java: adopt-openjdk-14.0.2
and adopt-openjdk-11.0.11
15:52:35.768 [reactor-http-epoll-2] INFO TRACE - onSubscribe(FluxMap.MapSubscriber)
15:52:35.770 [reactor-http-epoll-2] INFO TRACE - request(unbounded)
15:52:36.089 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
15:52:38.343 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
15:52:39.142 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
...
16:05:46.213 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
16:05:47.234 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
16:05:48.257 [reactor-http-epoll-2] INFO TRACE - onNext(evt)
@violetagg Thank you so much for your support. Indeed it seems to work for you, so I switched to Ubuntu and Java 11 to stick to your happy scenario, but it is still not working for me :-( Now having:
Linux greg-P7xxDM2-G 5.11.0-31-generic #33-Ubuntu SMP Wed Aug 11 13:19:04 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2, mixed mode, sharing)
Lastly, may I ask which versions of the following components you used?
spring-boot-starter-parent 2.5.3
netty-transport-native-epoll-4.1.66.Final-linux-x86_64.jar
reactor-netty-http-1.0.9.jar
reactor-core-3.4.8.jar
reactive-streams-1.0.3.jar
I would like to thank you for your premium support, Violeta, that was really kind of you. Anyway, I will continue to investigate on my side.
Gregory
I finally managed to find the root cause. After some investigation, I got close code 1006, meaning the connection was closed abnormally by the client, as documented in RFC 6455: https://datatracker.ietf.org/doc/html/rfc6455#section-7.4.1
1006 is a reserved value and MUST NOT be set as a status code in a
Close control frame by an endpoint. It is designated for use in
applications expecting a status code to indicate that the
connection was closed abnormally, e.g., without sending or
receiving a Close control frame.
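For reference, this status can be observed directly in the Reactor Netty handler; a minimal sketch on top of the earlier example (exactly how an abnormal close surfaces may vary):

.handle((in, out) -> {
    // receiveCloseStatus() completes with the close status of the session;
    // 1006 is never sent on the wire, it is only reported locally
    in.receiveCloseStatus()
      .subscribe(status -> System.out.println("Close status: " + status));
    return in.receive().then();
})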
At that time, I switched from a Wi-Fi connection to a LAN connection, and the issue vanished immediately. My Wi-Fi router was not able to handle the huge payload correctly. You can close the issue, and once again I would like to warmly thank you for your time @violetagg. Kind regards, Gregory
@greg007r nice that you found the issue!
I have a long-running WebSocket client implemented in Java Spring Reactor with Netty (spring-boot-starter-parent 2.5.3) targeting the Binance WebSocket API. According to the specs, the WebSocket channel is kept open for 24 hours.
The WebSocket is unexpectedly and prematurely closed after around 3 minutes:
I tried to reproduce the issue using another technology like JavaScript, but everything runs fine there. It seems that the channel is being closed, so I tried to tune the ChannelOptions at the TcpClient level, for example as sketched below ... still no luck!
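A sketch of the kind of tuning attempted (the options shown are illustrative, not the exact ones):

// TCP-level tuning on the underlying client (illustrative values)
HttpClient client = HttpClient.create()
        .option(ChannelOption.SO_KEEPALIVE, true)
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 10_000);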
I provided a Java code sample to reproduce the issue:
I don't understand why I get completed/terminated events from the ReactorNettyWebSocketClient WebSocketHandler.
I also posted my issue on Stack Overflow: https://stackoverflow.com/questions/68792765/reactor-netty-websocket-channel-closed-prematurely
Thank you for your help,