Open WalrusSoup opened 3 months ago
@WalrusSoup thank you for sharing information about this issue. It looks like the WKWebView way of handling JS context is the reason for what is happening. There are two ways to solve it:
window.onbeforeunload
- this can be done differently:
a. The user handles this case and destroys the current SDK client / Chat SDK instance
b. If the web version of JS SDK used, it can add onbeforeunload
handler and destroy itself.For 2.b.
I will need to speak with the team about whether it will affect anyone if we will destroy by default.
WKWebView does not seem to reliably trigger onbeforeunload
. Should this be a bug report to webkit then since it's so easily and reliably recreated?
I managed to get it functional by hitting it from all sides in our app - but jeeze. This doesn't feel great.
if (import.meta.hot && !window.importHook) {
import.meta.hot.on('vite:beforeFullReload', () => {
if (window.chat) {
// @ts-ignore
window.chat.sdk.destroy();
}
});
window.importHook = true;
}
reload via hotkey:
if (!window.refreshHook) {
document.addEventListener('keydown', async (e) => {
e = e || window.event;
if (e.keyCode == 116) {
window.chat.sdk.destroy();
window.location.reload();
}
});
}
context menu needs to be replaced with one that can hook reload, so we disabled it.
document.addEventListener('contextmenu', function (e) {
e.preventDefault();
});
Is there any progress on this?
Our Customers also run into this Issue. The Regression was introduced in Version 8.2.7. When I look at the Diff there is not much that changed. How could that cause the Error?
We had to rollback because of Customer Complaints.
@parfeon Any updates on this?
@WalrusSoup as I mentioned on #396 that because of suggested fix (when browser forcefully closes connection) Web Fetch API doesn't provide much information about error, and we can only assume using error message (in most of the cases). To properly handle this case, subscription timeout should be configured for PubNub instance, so it will recreate idle
request on its own before it will be forcefully terminated by the browser.
Fixes from #396 have been reverted in 8.2.9 and there is even newer version available already: 8.3.0
Our Fix:
We've patched this in our own project by listening for a Tauri navigation event. For browsers, right now, i would suggest listening to both pagehide
and beforeunload
and calling chat.sdk.destroy(true)
or pubnub.destroy(true)
window.addEventListener('beforeunload', () => {
// yeet the sdk
});
window.addEventListener('pagehide', (event) => {
// yeet the sdk
});
The runaway GC still remains an issue as soon as any channel connection is opened. I suspect that we're looking at an upstream node-fetch issue.
What's even crazier is you can open the connection and refresh the page without connecting to pubnub and the page still goes into a GC spiral.
I loaded it the first time, commented out the pubnub items for the second run, then gave it a refresh. Safari still confidently holding on to a ton of functions from within the sdk.
And again it's still pretty trivial to reproduce:
import { Chat } from "../js-chat/lib/src/"
Chat.init({
publishKey: import.meta.env.VITE_PUBNUB_PUB_KEY,
subscribeKey: import.meta.env.VITE_PUBNUB_SUB_KEY,
userId: "user_1111",
subscribeRequestTimeout: 59,
}).then(async (chat) => {
console.log('chat is init. im gonna refresh now i guess');
const currentTimestamp = new Date().getTime();
const secondUser = await chat.createUser(`user_${currentTimestamp}`, {
name: `User ${currentTimestamp}`,
profileUrl: "https://www.example.com/profile.png"
});
const { channel } = await chat.createDirectConversation({
user: secondUser,
channelId: `direct-user-${secondUser.id}-${currentTimestamp}`
});
channel.join((msg) => {
console.log(msg);
})
});
Also would like to point out - this bug will trip also from any navigation event away from the page when the back button is used.
Running into an issue where a simple pubnub chat causes Safari to eat up 90% CPU when any channels have been joined. I've isolated it to opening a connection to channels. As soon as I call connect, it will have this issue after refreshing. It wont happen every refresh, but it happens 90% of the time.
Safari Version:
Version 17.5 (19618.2.12.11.6)
Prior to refreshing, a lot of errors pop up in the console (~10 times):
Fetch API cannot load [x] due to access control checks.
This occurs even in a simple boilerplate page with nothing on it. This is from a vanilla vite project compilation, with an empty page:
I do not see the errors Fetch API cannot load [x] due to access control checks when reloading via the developer console.
https://github.com/user-attachments/assets/c53f290e-e7d7-4a19-94db-1d357d112b4f
If you force the browser to disconnect before reloading, we don't see the same behavior:
On the development side, since we're in a Webkit view, i cannot seem to fix it for a reload. But i can at least prevent it from happening during hot module reloads:
However the webkit environment does not fire the events
onbeforeunload
etc when refreshing which means this bug is basically stuck.On reload, it appears that the timers that were scheduled are simply running forever. I set up a setTimeout() to simply disconnect Pubnub and set the object to undefined and the timers still hang out forever.
If I load, then reload the window and call destroy() on the current instance:
When we compare snapshots we see that Safari seems to be hanging onto the SDK and throwing partial GCs hundreds of times:
Likewise, when calling
destroy()
on the SDK underneath, we still hang onto memory because there is no destroy() function on the ChatSDK. I can manually clean up both and THAT appears to fix the issue. However, this does not work when reloading. This must be called before we reload the window.Perhaps I am missing something here?