Open fbarbat opened 2 years ago
We have a similar issue since iOS15. It seems to work 99% of the time, but once in a while, the reconnection does not happen.
This keeps happening to us. It's been like 3 times I hit this myself since I reported the issue. Our users are hitting it a lot too.
Hey all, apologies for the slow response. Do you have any reliable repro steps or are you able to enable logging and share the logs when this bug occurs?
This is happening to me as well, running locally at the moment, haven't tested this issue in prod, but every time my macbook sleeps, the realtime connection is ended and I have to refresh the browser to reconnect. I am using for user presence as well so it becomes an issue when any mac users in my app encounter this bug expecting to see others online to communicate.
Hey @morganchen12, sorry about the delayed response on my side this time. I couldn't reproduce it myself but we have reports that it is still happening to users. Unfortunately we don't have any reliable repro steps. When you say enable logging is just doing Database.setLoggingEnabled(true)
? I will try that. Let me know if there's anything else we can try. Thanks.
Yep, or by calling this method.
Ok, I could reproduce it and collect logs and other information!
In my experience, it is more likely to hit this issue on Monday mornings, that is, after having my Mac sleeping during the weekend with the app running. So last Friday at the end of the day I set up debug mode for Realtime Database, started recording logs left the app running attached to Xcode and put my mac to sleep.
Yesterday morning, while the mac was sleeping the app tried reconnecting to the Realtime Database. This behavior is normal as far as I know: even if the computer is "sleeping" the app would wake app every now and then for some seconds and try to reconnect to servers because it doesn't know the mac "is actually sleeping". Some seconds later, it actually goes to sleep so the app stop running.
2022-06-05 08:44:21.249940-0300 0xdc7174 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082014] Realtime connection established
2022-06-05 08:44:21.251504-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034003] On ready
2022-06-05 08:44:21.251590-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034041] Handling timestamp: 1654429461722
2022-06-05 08:44:21.300121-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034033] Calling restore state
2022-06-05 08:44:21.300225-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034035] Restoring auth
2022-06-05 08:44:21.300324-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082004] Sending data (contents hidden)
2022-06-05 08:44:21.300432-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB065002] Raising value event on connected
2022-06-05 08:44:21.463667-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083006] (wsc:29) handleNewFrameCount: 1
2022-06-05 08:44:21.463730-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083007] (wsc:29) handleIncomingFrame sending complete frame: 0
2022-06-05 08:44:21.463794-0300 0xdc7c99 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082010] Got data message: {
[SOME STUFF]
}
2022-06-05 10:07:56.067777-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082004] Sending data (contents hidden)
2022-06-05 10:07:56.069026-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083014] (wsc:29) resetting keepalive, to 2022-06-05 13:08:41 +0000 ; old: 2022-06-05 11:45:06 +0000
The problem is that, during these spontaneous wake ups, there are high chances that we are not connected to the network, so connection issues might happen:
2022-06-05 10:07:56.108435-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083010] (wsc:29) didFailWithError didFailWithError: Error Domain=NSPOSIXErrorDomain Code=54 "Connection reset by peer" UserInfo={_kCFStreamErrorCodeKey=54, _kCFStreamErrorDomainKey=1}
2022-06-05 10:07:56.108510-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083013] Websocket is closing itself
2022-06-05 10:07:56.108573-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082007] Realtime connection lost.
2022-06-05 10:07:56.108632-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082002] Closing realtime connection.
2022-06-05 10:07:56.109244-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034004] Got on disconnect due to other
2022-06-05 10:07:56.109370-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034008] Scheduling connection attempt
2022-06-05 10:07:56.109444-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB065002] Raising value event on connected
2022-06-05 10:07:56.109504-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB038019] Running onDisconnectEvents
2022-06-05 10:07:56.109564-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034009] Trying to fetch auth token
2022-06-05 10:07:56.109706-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034011] Successfully fetched token, opening connection
2022-06-05 10:07:56.109764-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083001] (wsc:30) Connecting to: wss://s-usc1f-nss-2565.firebaseio.com/.ws?somestuff
2022-06-05 10:07:56.109829-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082001] Calling open in FConnection
2022-06-05 10:07:56.109893-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083002] (wsc:30) FWebSocketConnection open.
As there were network problems, opening this connection timed out. And the Realtime Database client hanged here.
2022-06-05 11:47:00.281060-0300 0xdc79c6 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083012] (wsc:30) Websocket timed out on connect
Today I started my workday, opened my mac and this got logged.
2022-06-06 11:17:09.814642-0300 0xdcd704 Debug 0x285b42c 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB065002] Raising value event on connected
I checked server side state and confirmed this issue was happening: I was offline for the rest of the world but my app was running and had a good Internet connection.
Using an internal debugging feature I tried to interact with the Realtime Database, and this is what I got:
2022-06-06 11:24:46.647367-0300 0xdce6c2 Debug 0x285cb8a 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB038003] Setting: /some_document with {
value = 1;
} pri: <null>
2022-06-06 11:24:46.655124-0300 0xdce6c2 Debug 0x285cb8a 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034025] Wasn't connected or writes paused, so added to outstanding puts only. Path: /some_document
This confirmed previous observations on this issue. The Realtime Database client was "stuck". Using Xcode, I checked two Realtime Database client internal bits of state:
empty
ConnectionStateConnecting
So it looks like the app hanged in this ConnectionStateConnecting
state after the network timeout. I checked the client code and, from what I could see, the Realtime Database client doesn't have any retry logic in place to recover from these errors.
In the past, I had seen the app recovering itself after several hours so I'm still waiting to collect logs in case that could be helpful. I still have Xcode attached so if you want to debug it with me now, let me know. :)
I have more logs if that might be useful to you but they have some (slightly) sensitive data so I am fine with sending them to you directly instead of attaching them here publicly if you need them.
Thank you!
Ok, after around 2 hours, the app reconnected. These are the logs:
2022-06-06 13:08:25.944123-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083010] (wsc:30) didFailWithError didFailWithError: Error Domain=NSOSStatusErrorDomain Code=-9806 "errSSLClosedAbort: connection closed via error " UserInfo={_kCFStreamErrorCodeKey=-9806, _kCFStreamErrorDomainKey=3}
2022-06-06 13:08:25.944279-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083013] Websocket is closing itself
2022-06-06 13:08:25.944348-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082006] Realtime connection failed.
2022-06-06 13:08:25.944701-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082002] Closing realtime connection.
2022-06-06 13:08:25.944767-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034004] Got on disconnect due to other
2022-06-06 13:08:25.944840-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034008] Scheduling connection attempt
2022-06-06 13:08:25.944899-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB054002] Scheduling retry in 0.618143s
2022-06-06 13:08:26.098224-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB038019] Running onDisconnectEvents
2022-06-06 13:08:26.620248-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034009] Trying to fetch auth token
2022-06-06 13:08:26.621511-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB034011] Successfully fetched token, opening connection
2022-06-06 13:08:26.622945-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083001] (wsc:32) Connecting to: wss://x.firebaseio.com/.ws?somestuff
2022-06-06 13:08:26.625894-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082001] Calling open in FConnection
2022-06-06 13:08:26.626043-0300 0xdd7096 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083002] (wsc:32) FWebSocketConnection open.
2022-06-06 13:08:27.733870-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083008] (wsc:32) webSocketDidOpen
2022-06-06 13:08:36.556371-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083009] (wsc:32) nop timer kicked off
2022-06-06 13:08:36.556481-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083006] (wsc:32) handleNewFrameCount: 1
2022-06-06 13:08:36.556550-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB083007] (wsc:32) handleIncomingFrame sending complete frame: 0
2022-06-06 13:08:36.556614-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082011] Got control message: { some stuff}
2022-06-06 13:08:36.557386-0300 0xdd7085 Debug 0x0 16384 0 MyAppName: (GoogleUtilities) 8.15.0 - [Firebase/Database][I-RDB082014] Realtime connection established
Then it properly connected again.
Note I had a breakpoint so that might explain the delay in 13:08:36.
Let me know if you need more information. Thanks!
Thanks, it sounds like the right way to fix this is to just retry on socket connection failures, respecting the NSWorkspace
state. I'll take a crack at implementing this when I have the bandwidth.
Thanks!
You might also consider the timeout case. I think the socket error in this case was 2 hours late so it's not safe to wait for it unless the SDK can force it to happen earlier somehow.
Alternatively, could the SDK discard the timed out web socket (and future notifications) and try a new one?
Maybe it'd be better to deliberately terminate the socket connection on sleep and restore it on wakeup then? @IanWyszynski what do you think?
Thanks! In my opinion, terminating the connection on sleep and restore it on wake up might reduce the chances of this from happening, but it doesn't fix the root problem. The main problem is that the SDK doesn't retry after the timeout (it just logs an error), it waits for the socket error and that might take a lot of time (if it actually happens).
I think terminating the connection on sleep and restore it on wake up might mitigate it though so it's a good workaround we can do on our side in the meantime. We are going to implement this behavior to try to reduce the chances of it from happening by invoking goOffline
when the computer goes to sleep, and invoking goOnline
when the computer wakes up. This way, network problems happening while the computer is sleeping shouldn't affect us.
Having said that, this won't fix the root problem since network issues might happen anyway when the computer is awake, specifically when starting the app and the SDK wants to setup the connection, so it would be great if the SDK can handle connection timeouts and retries more gracefully.
Hi, this issue is more widespread than just macos, it definitely happens on iOS as well in ionic cordova apps (environment info below).
A couple observations and research items of note that may help:
It's quite a bit of effort to setup a new 'minimal' repo, ionic, angularfire and a subscription to a listval with detection events on resume/background. If someone else has that handy then go for it, I hope this post helps someone else so they don't spend days searching and experimenting!
Note: using modular angularfire.
ENVIRONMENT
Angular:
Angular CLI: 13.3.10
Node: 14.20.0
Package Manager: yarn 1.22.15
OS: darwin x64
Angular: 13.3.12
... animations, common, compiler, compiler-cli, core, forms
... language-service, platform-browser, platform-browser-dynamic
... router, service-worker
Package Version
---------------------------------------------------------
@angular-devkit/architect 0.1303.6
@angular-devkit/build-angular 13.3.10
@angular-devkit/core 13.3.10
@angular-devkit/schematics 13.3.10
@angular/cli 13.3.10
@angular/fire 7.5.0
@schematics/angular 13.3.10
rxjs 6.6.7
typescript 4.6.2
webpack 5.70.0
Ionic:
Ionic CLI : 6.20.3 (/Users/markterrill/.config/yarn/global/node_modules/@ionic/cli)
Ionic Framework : @ionic/angular 6.2.4
@angular-devkit/build-angular : 13.3.10
@angular-devkit/schematics : 13.3.10
@angular/cli : 13.3.10
@ionic/angular-toolkit : 3.1.1
Capacitor:
Capacitor CLI : 4.5.0
@capacitor/android : 4.5.0
@capacitor/core : 4.1.0
@capacitor/ios : 4.5.0
Cordova:
Cordova CLI : not installed
Cordova Platforms : not available
Cordova Plugins : not available
Utility:
cordova-res (update available: 0.15.4) : 0.15.3
native-run : 1.7.1
System:
Android SDK Tools : 26.1.1 (/Users/markterrill/Library/Android/sdk)
NodeJS : v14.20.0 (/Users/markterrill/.nvm/versions/node/v14.20.0/bin/node)
npm : 6.14.17
OS : macOS Monterey
Xcode : Xcode 14.2 Build version 14C18
How is this still an issue? Seems to be a pretty critical bug imo...
Hello, this is still an issue on both Firebase and Firestore. Not only that, but, it's found in the web & Android versions, not just iOS.
Do we have to implement a watchdog or an app lifecycle event ourselves for something this critical? There have been countless times where I get a push notification while the app is on the background / sleep, I open it and the data from Firestore never arrives without restarting the app.
this cost me a lot of negative reviews on my app, I guess we need to remove rtdb. I tried @markterrill solution listening the lifecycle of the app but it doesn't work, rtdb never recovers and it doesn't show an error or something in debug window.
In my case, when the app goes offline then online the listener to ".info/connected" is never called again. This means when the app reconnects to WiFi again, my app is not able to update its online status again in the realtime database.
Hello, I had a similar problem while using Firebase web modular API with Capacitor on iOS. Each time a tried a write operation on the database (with set()), and a network change occurred meanwhile, the database API got stuck forever. My listener to ".info/connected" was never called. I think a found a workaround:
import { Injectable } from '@angular/core';
import { getApp } from 'firebase/app';
import { getDatabase, goOffline, goOnline, onValue, ref } from 'firebase/database';
import { Network } from '@capacitor/network';
@Injectable({
providedIn: 'root'
})
export class DatabaseService {
public connectedToDatabase: boolean;
private app = getApp();
private database = getDatabase(this.app, "...firebasedatabase.app/");
constructor() {
// Get network status changes
// Network status changes when switching between wifi, cellular and offline
Network.addListener('networkStatusChange', status => {
console.log('Network status changed', status);
if (status.connected == true) {
// We need to go offline first, because realtime database can get stuck when switching between wifi and cellular
goOffline(this.database);
goOnline(this.database);
}
if (status.connected == false) {
goOffline(this.database);
}
});
// Now the callback on connectedRef is triggered at each network change
// And write operations like (set()) don't get stuck anymore
const connectedRef = ref(this.database, ".info/connected");
onValue(connectedRef, (snap) => {
if (snap.val() === true) {
this.connectedToDatabase = true;
} else {
this.connectedToDatabase = false;
}
});
}
}
@UZ8VLqifK36Z Thanks for sharing. I'm going to give this a try at some point with my React app. I just discovered this window event, which hopefully works the same.
// https://developer.mozilla.org/en-US/docs/Web/API/Navigator/onLine
window.addEventListener("offline", () => {});
window.addEventListener("online", () => {});
Hey there, having the same issue here, react native firebase though. I believe it's related.
Does Firebase 11.x make a difference?
A new socket implementation was introduced in 10.27.0
This issue may be related to #13877
I'll try downgrade my version before 10.27.0 and see if that improves things :)
Issue does seem related!
Hey @fbarbat. We need more information to resolve this issue but there hasn't been an update in 5 weekdays. I'm marking the issue as stale and if there are no new updates in the next 5 days I will close it automatically.
If you have more information that will help us get to the bottom of this, just add a comment!
[REQUIRED] Step 1: Describe your environment
[REQUIRED] Step 2: Describe the problem
Steps to reproduce: Have the mac going to sleep. After waking up, sometimes (with really low frequency), the Realtime Database client would stay offline indefinitely, even if network conditions are good.
We have an app that uses Realtime Database for implementing Presence in Firestore. https://firebase.google.com/docs/firestore/solutions/presence. If we have the app running and the mac goes to sleep, sometimes, after waking up, the Realtime Database client doesn't reconnect.
We have strong evidence that
database.reference(withPath: ".info/connected").observe
reported afalse
value while the mac was sleeping but it never reportedtrue
again, even after waking up and having good network conditions.If we use any operation while Realtime database is offline, like statusReference.setValue to write to the database, the operation never runs the completion callback.
Firestore is able to reconnect properly without problems but Realtime Database client is not working properly.
We saw one case that recovered by itself after a some hours (12 hours after disconnected, 2 hours after the mac woke up). But as we need the Realtime Database to implement Presence, that delay is not acceptable.
Restarting the app makes it work again.