1iveowl / WebsocketClientLite.PCL

Websocket client lite
MIT License
38 stars 14 forks source link

ConnectionTimeout - Slack #17

Closed Workshop2 closed 7 years ago

Workshop2 commented 7 years ago

Hello,

I am trying to find a .Net Core compatible replacement for the websocket client for my Slack library http://github.com/noobot/slackconnector

When trying to connect to the Slack RTM I get a connection timeout (e.g. wss://mpmulti-q2ip.slack-msgs.com/websocket/xxxxxxxxyyyyyy)

Code: https://github.com/noobot/SlackConnector/blob/porting-websockets-take-3/src/SlackConnector/Connections/Sockets/WebSocketClientLite.cs

Any idea what I am doing wrong?

System.TimeoutException : Connection request to server timed out
   at WebsocketClientLite.PCL.MessageWebSocketRx.<ConnectAsync>d__25.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at SlackConnector.Connections.Sockets.WebSocketClientLite.<Connect>d__7.MoveNext() in C:\code\slackconnector\src\SlackConnector\Connections\Sockets\WebSocketClientLite.cs:line 32
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at SlackConnector.SlackConnector.<GetWebSocket>d__6.MoveNext() in C:\code\slackconnector\src\SlackConnector\SlackConnector.cs:line 64
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at SlackConnector.SlackConnector.<Connect>d__5.MoveNext() in C:\code\slackconnector\src\SlackConnector\SlackConnector.cs:line 48
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
   at SlackConnector.Tests.Integration.SlackConnectorTests.<should_connect_and_stuff>d__1.MoveNext() in C:\code\slackconnector\src\SlackConnector.Tests.Integration\SlackConnectorTests.cs:line 24
   at NUnit.Framework.AsyncInvocationRegion.AsyncTaskInvocationRegion.WaitForPendingOperationsToComplete(Object invocationResult)
   at NUnit.Core.NUnitAsyncTestMethod.RunTestMethod()
1iveowl commented 7 years ago

No sure. I don't know Slack very well.

I assume that you have the authentication etc. covered? rtm.start

Workshop2 commented 7 years ago

@1iveowl yes, we do a handshake to get the WSS url (https://github.com/noobot/SlackConnector/blob/porting-websockets-take-3/src/SlackConnector/Connections/Clients/Handshake/FlurlHandshakeClient.cs#L11)

It works with WebSocketSharp, but I love the simplicity of this client but I can't get it to work 😢

1iveowl commented 7 years ago

Ok. Thanks.

I'd love to try and debug it, but I don't feel like trying figure out how set up the hand-shake up etc. Would it be possible to share a WSS URL that I can try and connect too, or does it time-out before I would be able too?

Alternatively, maybe you could add a simple .NET Core console project to your solution where you do the hand-shake stuff all the way to the _websocket.ConnectAsync(_uri) and I could take it from there to try and sort out why it times out?

Or something like that?

One question. Does Slack expect a certain version of TLS? The default for WSS I use in WebsocketClientLite is TLS 1.2.

1iveowl commented 7 years ago

Apologies closed the issue by mistake.

One more thought. You need to adskilte the Socket lite.PCL Nuget to you project. It works with Bait and Switch. Have you added this?

Workshop2 commented 7 years ago

I will try creating a command prompt app for you.

What is adskilte? Not heard of that

1iveowl commented 7 years ago

Thank you.

"Adskilte" is the danish work for "separated". In this context however is it an auto-correct mistake made by my iPhone ;-)

The right sentence should have been: "You need to add the SocketLite.PCL...".

Anyhow, I believe after thinking about it that SocketLite.PCL is already added as the WebsocketClientLite Nuget takes care of this. Anyhow, you might want to double check that it is there.

Workshop2 commented 7 years ago

@1iveowl I am just hacking together a demo console app - how can I give you an api key to test with? Email? Etc

1iveowl commented 7 years ago

Thank you. Please send it to [Removed email address]

How long will the api key last? Will it timeout?

Workshop2 commented 7 years ago

API Key has been sent, here is a branch of SlackConnector with a console app you can build quickly: https://github.com/noobot/SlackConnector/tree/demo-console-for-debugging

Note, you will need to look at WebSocketClientLite and you should see it timeout trying to connect

1iveowl commented 7 years ago

I believe I found the bug.

Try using NuGet: WebsocketClientLite.PCL 3.6.4-beta2.

Let me know if it works?

Workshop2 commented 7 years ago

It connects - brilliant 😄. I can see messages coming through which is amazing (will do some more testing now).

I did just have the following unhandled exception bubble up:

System.Exception was unhandled
Message: An unhandled exception of type 'System.Exception' occurred in System.Reactive.PlatformServices.dll
Additional information: Websocket connection aborted unexpectantly. Check connection and socket security version/TLS version)

I am on a mobile WiFi, however, I didn't use to get aborted connections on other clients and unfortunately it didn't trigger the ObserveConnectionStatus event - any idea?

1iveowl commented 7 years ago

I can see what is wrong.

Websockets keeps a connection alive by having the server sometimes sending ping's to the client. The client is then suppose to send a pong back, which it also does.

For reasons I do not understand yet, the Slack RTM server decides to cut the connection immidiantly after the client sends the pong. I suppose it does not like how the pong looks. I will investigate. I have not had this problem before with any other websocket server.

Workshop2 commented 7 years ago

Wow - thanks, I really appreciate the help.

Let me know if I can do anything to help you :thumbsup:

1iveowl commented 7 years ago

It seems that Slack has it's own ping/pong implementation.

Try and look here under the heading "Ping and Pong".

Others have run in to the same issue it seems from this Stack Overflow post.

Could you try and implement this way of doing ping/pong and see if it fixes things?

I've looked at the WebsocketClientLite Websocket layer implementation and I've not found any deviations from the spec: RFC 6544. Ping/Pong is quite simple:

5.5.2. Ping

The Ping frame contains an opcode of 0x9.

A Ping frame MAY include "Application data".

Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in response, unless it already received a Close frame. It SHOULD respond with Pong frame as soon as is practical. Pong frames are discussed in Section 5.5.3.

An endpoint MAY send a Ping frame any time after the connection is established and before the connection is closed.

NOTE: A Ping frame may serve either as a keepalive or as a means to verify that the remote endpoint is still responsive.

5.5.3. Pong

The Pong frame contains an opcode of 0xA.

Section 5.5.2 details requirements that apply to both Ping and Pong frames.

A Pong frame sent in response to a Ping frame must have identical "Application data" as found in the message body of the Ping frame being replied to.

If an endpoint receives a Ping frame and has not yet sent Pong frame(s) in response to previous Ping frame(s), the endpoint MAY elect to send a Pong frame for only the most recently processed Ping frame.

1iveowl commented 7 years ago

I can confirm that the above is what is needed. I just tried the following hack and now the connection no longer time out.

I'm sure that there is a more elegant way to implement this ;-)

await _webSocket.ConnectAsync(_uri);

while (true)
{
    await Task.Delay(TimeSpan.FromSeconds(25));
    await _webSocket.SendTextAsync("{\r\n    \"id\": 1234, // ID, see \"sending messages\" above\r\n    \"type\": \"ping\",\r\n    ...\r\n}");
}
1iveowl commented 7 years ago

Correction.

It will fail if a websocket ping is send from the server before the Slack ping is send.

To work around this I will implement a new parameter where the native websocket ping/pong can be disabled.

1iveowl commented 7 years ago

Slack seems to be much more sensitive than Websocket.org.

I believe I found a bug on the pong anyway.

Try using NuGet: WebsocketClientLite.PCL 3.6.4-beta3.

1iveowl commented 7 years ago

Version 3.6.5 have been released.

In my tests everything seem to Work now with Slack.

Please confirm?

1iveowl commented 7 years ago

Argh. This is tricky.

Apparently Slack fails if the pong includes the byte specifying the length of "Application Data" if the lenght is zero. This behavior does not seem to be inline with RFC 6455, however I'd admit the RFC is a bit ambigious on this.

On the other hand. If I obmit the byte defining the zero length of the "Application Data". Then the websocket.org server will fail!

To top this the slack connection will eventually fail after approx. 2-3 minutes unless a "slack ping" (see my earlier commment about this above) is also being send with regular intervals. So you should probably take care of this too in your part of the code.

In NuGet v3.6.6 I have implemented a optional connect parameter to override the Websocket.org "default" RFC 6455 behavior:

await _webSocket.ConnectAsync(_uri, excludeZeroApplicationDataInPong:true);

Please let me know how this works

Workshop2 commented 7 years ago

Wow - thanks for all the amazing support. I have just tried it and things seem much more stable.

Been busy at work but will try and look into Ping/Pong and see how it handles while not on a mobile internet connection.

Thanks again 👍

1iveowl commented 7 years ago

Looking forward to hear if it works now and if I can close this issue.

Workshop2 commented 7 years ago

Thanks for everything @1iveowl - I will post back once I have more feedback

1iveowl commented 7 years ago

Fixed another bug related to the size of the Websocket payload. If the payload is bigger than 126 bytes the connection would be closed due to malformed message.

Likely something that might also be influencing your scenario

Workshop2 commented 7 years ago

Thanks @1iveowl