Closed davefoxy closed 2 years ago
Hi @davefoxy,
thanks for the report. We will try to reproduce the issue you're experiencing and will come back with a follow-up.
Best, Boris
Hey, @davefoxy
could you follow this doc that discusses when & how to pass in the tokens to our client? E.g.:
- Pass tokenProvider during ChatClient's initialization.
- Pass Token to the connectUser API. NOTE: the tokenProvider closure passed in the init will be called every time the given token expires.
- Hooking on the connection status / mutating the ConnectionStatus for your own purposes is not recommended and can lead to more bugs. Our client leverages reconnection strategy under the hood that will try to keep your user connected to the chat.
Please come back to us and tell us if the above helped you to resolve your issue.
Thanks.
Hi @bielikb thanks for your reply 🙇
So the code path should be the same between me giving the token provider through the tokenProvider property (as in my code above) and passing it in whilst instantiating ChatClient. However, I've tried your suggestion just in case but there's still no change.
The error message is actually coming from here: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/WebSocketClient/Engine/URLSessionWebSocketEngine.swift#L78
So there's no reconnection happening at this point. The code just logs an error and does nothing.
Is it possible maybe it's not token expiration that's causing this?
Also, about this point:
Hooking on the connection status / mutating the ConnectionStatus for your own purposes is not recommended and can lead to more bugs. Our client leverages reconnection strategy under the hood that will try to keep your user connected to the chat.
The connectionStatus part of my original code is not manipulating Stream's connectionStatus. This is our own enumerable for keeping track of overall setup. It's not touching Stream. However, I would be curious how we can monitor Stream's internal connectionStatus properly. I mentioned this in the original post under the "Update" section.
One last thing; idling on an open channel, I've just received this error for the first time:
[ERROR] [com.apple.NSURLSession-delegate] [RequestDecoder.swift:64] [decodeRequestResponse(data:response:error:)] > API request failed with status code: 400, code: 4 response:
{
"code" : 4,
"message" : "GetOrCreateChannel failed with error: \"Watch or Presence requires an active websocket connection, please make sure to include your websocket connection_id\"",
"more_info" : "https:\/\/getstream.io\/chat\/docs\/api_errors_response",
"StatusCode" : 400,
"duration" : "0.00ms"
})
Hi @bielikb thanks for your reply 🙇
So the code path should be the same between me giving the token provider through the tokenProvider property (as in my code above) and passing it in whilst instantiating ChatClient. However, I've tried your suggestion just in case but there's still no change.
The error message is actually coming from here: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/WebSocketClient/Engine/URLSessionWebSocketEngine.swift#L78
So there's no reconnection happening at this point. The code just logs an error and does nothing.
Is it possible maybe it's not token expiration that's causing this?
When leveraging tokenProvider are you able to connect and eg show the list of existing channels in your app? Does it successfully connect? Could you try to reproduce your issue leveraging our DemoApp on our main repo?
Also, about this point:
Hooking on the connection status / mutating the ConnectionStatus for your own purposes is not recommended and can lead to more bugs. Our client leverages reconnection strategy under the hood that will try to keep your user connected to the chat.
The connectionStatus part of my original code is not manipulating Stream's connectionStatus. This is our own enumerable for keeping track of overall setup. It's not touching Stream. However, I would be curious how we can monitor Stream's internal connectionStatus properly. I mentioned this in the original post under the "Update" section.
You can observe connectionStatus changes by adding ChatConnectionControllerDelegate conformance to your class and setting your instance as the delegate of ChatConnectionController.
import StreamChat

class YourClass: ChatConnectionControllerDelegate {
    var connectionController: ChatConnectionController?

    /// Your setup code
    func setupClient() {
        // ... create and configure `chatClient` here ...

        // Once the client is initialised
        connectionController = chatClient.connectionController()
        // The property is optional, so use optional chaining
        connectionController?.delegate = self
    }

    /// ChatConnectionControllerDelegate conformance
    func connectionController(_ controller: ChatConnectionController,
                              didUpdateConnectionStatus status: ConnectionStatus) {
        // Observe changes
    }
}
as shown here in these docs.
When leveraging tokenProvider are you able to connect and eg show the list of existing channels in your app? Does it successfully connect? Could you try to reproduce your issue leveraging our DemoApp on our main repo?
Yes, I can retrieve all my channels and navigate to them, send messages etc. It just stops working after my token expiration time (15 minutes). I'll try out the demo app once again and double-check it works ok for me.
However, shouldn't there be something here to handle this error? It's just logging right now: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/WebSocketClient/Engine/URLSessionWebSocketEngine.swift#L78
Update:
I checked out the demo app again... It doesn't seem to use a tokenProvider at all: https://github.com/GetStream/stream-chat-swift/blob/develop/DemoApp/DemoAppCoordinator.swift#L92
However, shouldn't there be something here to handle this error? It's just logging right now: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/WebSocketClient/Engine/URLSessionWebSocketEngine.swift#L78
I'll create an internal task for us. Thanks for pointing that out ;)
Update: I checked out the demo app again... It doesn't seem to use a tokenProvider at all: https://github.com/GetStream/stream-chat-swift/blob/develop/DemoApp/DemoAppCoordinator.swift#L92
Yes, that's correct. Feel free to adjust the init call to match your integration/initialization.
@davefoxy trying to reproduce this on my end but token renewal seems to work fine. Do you mind sharing the latest version of the code that you are using?
To make things a bit simpler to debug, I create the token in the app directly and simulate a delay before returning it.
The following code is absolutely not suitable for a production app, but it might help you reduce the scope of the problem. The JWT token is created directly in iOS and expires after 15 seconds (plus we fake a 3-second delay to observe the token renewal state):
ChatClient.shared.tokenProvider = { completion in
    // Fake a 3-second delay so the token renewal state can be observed
    DispatchQueue.main.asyncAfter(deadline: .now() + 3) {
        // The generated token expires 15 seconds from now
        let token = generateUserToken(secret: apiKeySecretString, userID: userID, exp: Int(Date().timeIntervalSince1970) + 15)
        completion(.success(token))
    }
}
and this is the code I use to generate token with a short expiration time:
import Foundation
import CryptoKit
import StreamChat

extension Data {
    /// Base64URL encoding as used by JWT: URL-safe alphabet, no padding
    func urlSafeBase64EncodedString() -> String {
        return base64EncodedString()
            .replacingOccurrences(of: "+", with: "-")
            .replacingOccurrences(of: "/", with: "_")
            .replacingOccurrences(of: "=", with: "")
    }
}

struct Header: Encodable {
    let alg = "HS256"
    let typ = "JWT"
}

struct JWTPayload: Encodable {
    let user_id: String
    let exp: Int
}

// DO NOT USE THIS FOR REAL APPS! This function is only here to make it easier to
// have expired token renewal while using the standalone demo application
func generateUserToken(secret: String, userID: String, exp: Int) -> Token {
    let privateKey = SymmetricKey(data: secret.data(using: .utf8)!)

    // Encode the header and the payload as Base64URL JSON segments
    let headerJSONData = try! JSONEncoder().encode(Header())
    let headerBase64String = headerJSONData.urlSafeBase64EncodedString()
    let payloadJSONData = try! JSONEncoder().encode(JWTPayload(user_id: userID, exp: exp))
    let payloadBase64String = payloadJSONData.urlSafeBase64EncodedString()

    // Sign "<header>.<payload>" with HMAC-SHA256
    let toSign = (headerBase64String + "." + payloadBase64String).data(using: .utf8)!
    let signature = HMAC<SHA256>.authenticationCode(for: toSign, using: privateKey)
    let signatureBase64String = Data(signature).urlSafeBase64EncodedString()

    let token = [headerBase64String, payloadBase64String, signatureBase64String].joined(separator: ".")
    return try! Token(rawValue: token)
}
@tbarbugli Thanks! That's super-useful for debugging. Let me play with it a little bit and see. My code hasn't changed since what's in the initial message of this issue but at least this code you've provided will allow me to remove the variable that is our own token fetching code.
I'll get back to you once I experiment a bit more. It's approaching the weekend here so it might be some time before I get back to you 🙇
@davefoxy how is it going with this? happy to move this to a quick call if that helps
@tbarbugli Sorry I'm a little late getting back to you on this. After the first message in this thread, I implemented our own token refreshing code but obviously being able to use tokenProvider is preferable. I'd like to just check a few other socket-related things in our codebase that might be causing an issue and I'll get back to you a little later this week on it. If I'm still struggling then yes, a call would be fantastic. Thanks for offering that as an option 🙇
@davefoxy how is this going for you? AFAICT implementing your own token refresh + chat reconnect is very tricky. The SDK does some smart things around token expiration, such as queuing requests and replaying them when a fresh token is available.
What does your implementation look like?
Hi @tbarbugli so the process I have in place is:
1. Store the token as activeToken and make the initial connectUser call.
2. Create an EventsController and listen for its delegate's didReceiveEvent method. Note here that I tried to use ConnectionController but its delegate methods weren't being fired.
3. When the event is ConnectionStatusUpdated and it's of the disconnected type, I check to see if the stored activeToken is expired or not and if it is, fetch a new one and call ChatClient's setToken method.
It seems to work ok but yes, as you said, there's a lot of potential edge cases that might be missed.
Maybe a call would be best but yes, I'd like to just draw your attention one more time to this message: https://github.com/GetStream/stream-chat-swift/issues/1920#issuecomment-1099251936.
The error we're receiving is this specific line and it doesn't refresh the token or do anything but print an error: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/WebSocketClient/Engine/URLSessionWebSocketEngine.swift#L85
bielikb has said he's logged an issue on your end but I'm not sure of the progress.
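For readers following the manual workaround described above, the "is the stored token expired?" check can be sketched with Foundation alone. This is only an illustrative sketch, not code from the SDK or from the poster: the names jwtExpiration(of:), isExpired(_:now:leeway:) and JWTClaims are hypothetical, and it assumes the token is a standard three-segment JWT with a numeric exp claim, as in the generator posted earlier in the thread.

```swift
import Foundation

/// Hypothetical helper type: only the `exp` claim is read.
struct JWTClaims: Decodable {
    let exp: TimeInterval
}

/// Returns the expiration date stored in the `exp` claim of a JWT,
/// or nil if the token is malformed or has no numeric `exp` claim.
func jwtExpiration(of rawToken: String) -> Date? {
    let segments = rawToken.split(separator: ".")
    guard segments.count == 3 else { return nil }

    // JWT segments are Base64URL without padding; restore the standard
    // alphabet and padding before decoding.
    var base64 = String(segments[1])
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    while base64.count % 4 != 0 { base64.append("=") }

    guard let data = Data(base64Encoded: base64),
          let claims = try? JSONDecoder().decode(JWTClaims.self, from: data) else {
        return nil
    }
    return Date(timeIntervalSince1970: claims.exp)
}

/// True when the token is unreadable, already expired, or expires within
/// `leeway` seconds. ASSUMPTION: a token without a readable `exp` is treated
/// as expired; adjust this if you issue non-expiring tokens.
func isExpired(_ rawToken: String, now: Date = Date(), leeway: TimeInterval = 0) -> Bool {
    guard let expiration = jwtExpiration(of: rawToken) else { return true }
    return expiration <= now.addingTimeInterval(leeway)
}
```

A small leeway (for example 30 seconds) lets the app fetch a new token slightly before the old one lapses. The check only reads the claim; it does not verify the signature, which stays the backend's job.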
@tbarbugli Any progress on the above message and refreshing tokens within URLSessionWebSocketEngine?
Hi @davefoxy, Sorry to keep you waiting. We're working on a fix for this but cannot give you an ETA yet.
We'll keep you posted. Thank you!
Hi @davefoxy ,
First things first. The line in URLSessionWebSocketEngine that you pointed out should have nothing to do with this issue.
An error coming from the websockets comes with so little information that those issues are actioned through messages instead.
Whenever there is a problem with the token, we first receive a success event that contains a message similar to the following one:
{\"error\":{\"code\":40,\"message\":\"JWTAuth error: token is expired (exp)\",\"StatusCode\":401,\"duration\":\"\",\"more_info\":\"\"}}"
This basically tells us that there is an issue with the token, and that we should refresh it.
If you follow the path, this would end up calling ChatClient.webSocketClient(_: didUpdateConnectionState:). In here, as you can see in the following chunk, we refresh the token.
case let .disconnected(source):
    if let error = source.serverError,
       error.isInvalidTokenError {
        refreshToken(completion: nil)
        shouldNotifyConnectionIdWaiters = false
    } else {
        shouldNotifyConnectionIdWaiters = true
    }
    connectionId = nil
Whenever the token is refreshed we recreate the websocket connection, which leads to a call to ChatClientUpdater.connect(userInfo:completion:).
I hope this helps you visualize the flow and spot any differences from what you are seeing.
That said, while investigating this case, I found one issue: the refreshToken function in ChatClient might be executed twice, the first time because of the disconnection of the websocket, and the second because of a failure coming from the APIClient. But in any case, this has never been an issue during my tests.
After verifying it further, it is only happening as an edge case, and should not be the root cause of your issue.
Please let us know if the flow stated above is the same as the one you have.
@polqf Thanks for the update. So yes, when the socket connection disconnects, I am falling into the disconnected state as you mentioned above. However, the refresh never happens because error.isInvalidTokenError is false (Ref: https://github.com/GetStream/stream-chat-swift/blob/develop/Sources/StreamChat/ChatClient.swift#L642)
Here is the result when I po source.serverError:
WebSocketEngineError(reason: "The operation couldn’t be completed. Socket is not connected", code: 57, engineError: Optional(Error Domain=NSPOSIXErrorDomain Code=57 "Socket is not connected" UserInfo={NSErrorFailingURLStringKey=wss://chat-proxy-us-east.stream-io-api.com/connect?api_key=[redacted]&json=%7B%22user_details%22:[redacted],%22server_determines_connection_id%22:true,%22user_id%22:[redacted], NSErrorFailingURLKey=wss://chat-proxy-us-east.stream-io-api.com/connect?api_key=[redacted]&json=%7B%22user_details%22:[redacted],%22server_determines_connection_id%22:true,%22user_id%22:[redacted], _NSURLErrorRelatedURLSessionTaskErrorKey=(
"LocalWebSocketTask <70F87855-A1EA-409E-94DD-36862C08EC03>.<1>"
), _NSURLErrorFailingURLSessionTaskErrorKey=LocalWebSocketTask <70F87855-A1EA-409E-94DD-36862C08EC03>.<1>}))
Perhaps this range is incorrect? Sorry, I'm not so familiar with web socket error codes but this range won't catch my error above.
EDIT:
Actually, digging into this more with breakpoints, this line is always false because underlyingError as? ErrorPayload always fails to cast.
Hi @davefoxy ! This looks interesting 🤔
underlyingError should be castable to ErrorPayload. If that does not happen, that's why isInvalidTokenError is false.
One thing that is important here is that you don't follow the trace starting from the error on the Websocket client, but instead start following from the last successful message you receive, which should have a format like this:
{\"error\":{\"code\":40,\"message\":\"JWTAuth error: token is expired (exp)\",\"StatusCode\":401,\"duration\":\"\",\"more_info\":\"\"}}"
Please let me know if you get that message, and try to follow the execution from there 🙏 . As I shared before, starting to follow the trace from the moment you receive an error is not what we want, as those errors don't provide information.
Perhaps this range is incorrect? Sorry, I'm not so familiar with web socket error codes but this range won't catch my error above.
The codes we have in that range are our own codes, sent by our backend in the last successful message (see above), not the ones Apple uses. And WebSocketEngineError is just a wrapper around Apple's error, for which we are not looking at the codes.
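To make the distinction above concrete, here is a minimal, self-contained sketch of classifying a backend error message by its own code field. It is not the SDK's implementation: the type names, the isTokenError(json:) helper and, in particular, the 40...43 range are assumptions made for illustration. The thread itself only confirms code 40 ("token is expired").

```swift
import Foundation

/// Hypothetical mirror of the error object quoted above.
struct ServerErrorPayload: Decodable {
    let code: Int
    let message: String
    let statusCode: Int

    enum CodingKeys: String, CodingKey {
        case code, message
        case statusCode = "StatusCode"
    }
}

/// Hypothetical envelope for `{"error": {...}}` messages.
struct ServerErrorEnvelope: Decodable {
    let error: ServerErrorPayload
}

/// ASSUMPTION: this range is illustrative only; the thread confirms code 40.
let assumedTokenErrorCodes: ClosedRange<Int> = 40...43

/// Returns true when `json` is a backend error message whose code falls
/// inside the assumed token-error range. Anything that is not such a
/// message (for example a transport-level error description) yields false.
func isTokenError(json: String) -> Bool {
    guard let envelope = try? JSONDecoder().decode(ServerErrorEnvelope.self,
                                                   from: Data(json.utf8)) else {
        return false
    }
    return assumedTokenErrorCodes.contains(envelope.error.code)
}

let expiredTokenMessage = #"{"error":{"code":40,"message":"JWTAuth error: token is expired (exp)","StatusCode":401,"duration":"","more_info":""}}"#
print(isTokenError(json: expiredTokenMessage)) // prints "true"
```

Under these assumptions, the POSIX code 57 from the socket error earlier in the thread is never compared against the range, because it does not arrive as a backend JSON message at all. That matches the point that only the last successful message carries a usable code.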
@polqf Please give me file and line numbers where you'd like me to breakpoint on and trace from. Just to make sure I'm properly aiming in the direction you need.
Just as soon as you receive the payload I shared above, please check where the trace leads you.
Hi @davefoxy,
we kicked the new 4.17.0 release out the door. In 4.16.0 we added a new tokenProvider parameter that can be passed directly to the connectUser API.
Could you grab the latest version of our SDK and see if leveraging the new tokenProvider API resolves your issue(s)?
Thanks!
Best, Boris
Hi @davefoxy, there's been some inactivity here. We are closing this issue for now, let us know if there's anything we can help you with
Hi @polqf, I am currently facing the same issue in the latest SDK. I have also checked the demo app; it gives the same error and never reconnects, as mentioned above. I have attached a screenshot of the log.
Hi @SSaleemSSI , the issue you are exposing seems different than the one outlined in this issue. Could you list the steps to reproduce it?
Hi @polqf, thanks for the reply. I disconnected the network on the mobile device and reconnected it, and the banner for connected was never triggered. The error only appears in the logs when I try to sync, or when the app auto-syncs after some time.
Hi @SSaleemSSI,
This issue usually means you forgot to call synchronize() in a controller. It is not related to this issue.
Best, Nuno
Hi @nuno-vieira, how come your demo app produces the same result? I also double-checked the sync call, and I am calling it. Have you tried to reproduce this issue by disconnecting the network and connecting it again? Everything works fine, sending messages as well, but when the app tries to sync, this socket error comes up. I can show you a video if you want me to reproduce it with the demo app. I also downloaded the latest SDK and demo app code from main, and it shows the same error.
Hi @nuno-vieira @polqf, here is the recording with the error produced after the demo app calls sync when I navigate around the chats. It took a little time, but when the app tried to sync, the message is clear: error connecting the socket while trying to sync. I have also attached the logs after disconnecting and reconnecting the Wi-Fi. https://www.loom.com/share/f3feaa15827247ea839f4b168280586d Logs.zip
Hi @SSaleemSSI!
Thank you for the videos. We will investigate this Next Monday morning.
Best, Nuno
Hi @SSaleemSSI, after investigating this for a while, I can confirm that, when using the simulator, the reachability components are not working properly, and thus we are not always reconnecting properly. We are using Apple's NWPathMonitor, so it is not an issue on our side as far as we've investigated.
When using a physical device, these issues don't appear anymore for me. Could you please confirm that on your side?
PS. There are many posts like this: https://developer.apple.com/forums/thread/713330
@polqf Thanks for the reply, Yes it works fine on the mobile device.
Hello, I am using the SPM for SwiftUI and developing a chat. I was getting the same issue as davefoxy. After chatting for a while in the app, the chat list freezes and user is not able to navigate inside a channel nor able to connect user. If I call the ChatClient.shared.connectUser(), there is no callback happening and hence unable to handle the error. I am using tokenProvider for refreshing the token. While debugging into the SDK found that on calling connectUser(), the flow reaches AuthenticationRepository -> private func scheduleTokenFetch() and returns completion call but the callback is never fired at my end. Could you please help me in this regard ?
Thanks Arun
What did you do?
Opening a connection to ChatClient using connectUser and providing a tokenProvider. After a time, I started to get the following in the terminal and a chat is no longer updated:
What did you expect to happen?
If this is due to the token expiring, I expected my tokenProvider to be called but it doesn't look like it is. The whole "expiring token" thing might be a red herring though.
What happened instead?
The above error is output and any open channels stop receiving real-time messages. Calling synchronize on its channel controller will reload it but we don't get real-time messages coming back.
GetStream Environment
GetStream Chat version: 4.13.1
GetStream Chat frameworks: StreamChat, StreamChatSwiftUI
iOS version: 15.4
Swift version: 5
Xcode version: 13.3
Device: Simulator and iPhone 13 Pro
Additional context
Here's my connection code:
Our initial connection code:
One more thing; we are using Apollo iOS (GraphQL client) version 0.51.0. I'm not sure it makes a difference but this library has its own instance of StarScream included.
The initial connection seems totally ok, it's just reconnecting. Hoping to get a solution soon. Thanks.
Update
Trying to find a workaround, I was wondering if maybe I can just observe the connection status and refresh the token manually when it disconnects. I see this in ChatClient:
So I did as the comment says but there doesn't seem to be an observable for the connection status on CurrentChatUserController. Just ones for currentUserChangePublisher and unreadCountPublisher.