dxos-deprecated / halo


Errors with party (un)subscribing #16

Closed rzadp closed 4 years ago

rzadp commented 4 years ago

The errors described here were recorded in dxos/teamwork on the rzadp-unsubscrive branch, using @dxos/party-manager@1.0.1-beta.18.

Features used:

https://github.com/dxos/teamwork/blob/961f99efb1ad6f274cd0733ca072a41a693f0b1b/apps/teamwork-app/src/components/PartyGroup.js#L99-L107

Video reproduction

3 different kinds of errors are happening, as seen in the vid:

https://drive.google.com/file/d/1yFIGw2yaxe9r2rpg9jjeTWR0VgrwMyjS/view?usp=sharing

First error

Happens randomly when trying to unsubscribe from a party. At 0:10 in the vid.

AssertionError: undefined == true
    at PartyManager.unsubscribe (webpack-internal:///../../node_modules/@dxos/party-manager/dist/es/party-manager.js:434:25)
    at onUnsubscribe (webpack-internal:///./src/components/PartyGroup.js:138:31)
    at HTMLUnknownElement.callCallback (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:188:14)
    at Object.invokeGuardedCallbackDev (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:237:16)
    at invokeGuardedCallback (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:292:31)
    at invokeGuardedCallbackAndCatchFirstError (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:306:25)
    at executeDispatch (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:389:3)
    at executeDispatchesInOrder (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:414:5)
    at executeDispatchesAndRelease (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:3278:5)
    at executeDispatchesAndReleaseTopLevel (webpack-internal:///../../node_modules/react-dom/cjs/react-dom.development.js:3287:10)

Second error

Happens when rejoining a party. 2:10 in the vid.

Error: Already joined swarm: 29fda5ad6ddb722c8fc71e3d5185f8ffa987ed7413aee22ef8060f6ccc770def
    at NetworkManager.joinProtocolSwarm (webpack-internal:///../../node_modules/@dxos/network-manager/dist/es/network-manager.js:95:13)
    at PartyManager.openParty (webpack-internal:///../../node_modules/@dxos/party-manager/dist/es/party-manager.js:317:32)
    at async PartyInfo.eval (webpack-internal:///../../node_modules/@dxos/party-manager/dist/es/party-manager.js:903:11)

Third error

This one seems to happen more randomly; it took quite a while to reproduce in the recorded video. Fast forward to 5:25 to see it.

It sometimes happens when refreshing the page and loading everything from scratch, provided that there are some unsubscribed parties.

Uncaught Error: Unhandled error. (undefined)
    at ErrorHandler.emit (events.js?71a2:141)
    at ErrorHandler._this._listener (error-handler.js?5508:44)
party-manager.js?5fbe:878 Uncaught (in promise) TypeError: Cannot read property 'setSettings' of undefined

telackey commented 4 years ago

Can you clean out the storage on both ends and report back?

The first assertion error is from the lack of a Party 'properties' item, which would at least suggest that the party data was old.

The second could easily be caused by the first call failing: since we never finished unsubscribing from the Party and never left the swarm, we cannot rejoin it.
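To make that chain of cause and effect concrete, here is a minimal toy model. It is only a sketch: `ToyNetwork`, `ToyPartyManager`, and the `properties` map are hypothetical stand-ins, not the real @dxos/party-manager or @dxos/network-manager code. It assumes, as suggested above, that the unsubscribe check fails before the swarm is left.

```javascript
// Toy model only: these classes are hypothetical stand-ins, not the real
// @dxos/party-manager or @dxos/network-manager implementations.
class ToyNetwork {
  constructor() {
    this.joined = new Set();
  }

  join(topic) {
    if (this.joined.has(topic)) {
      throw new Error(`Already joined swarm: ${topic}`);
    }
    this.joined.add(topic);
  }

  leave(topic) {
    this.joined.delete(topic);
  }
}

class ToyPartyManager {
  constructor(network) {
    this.network = network;
    this.properties = new Map(); // topic -> properties item (may be missing)
  }

  open(topic) {
    this.network.join(topic);
  }

  unsubscribe(topic) {
    // Assumed ordering: the check throws before the swarm is left.
    if (!this.properties.get(topic)) {
      throw new Error('AssertionError: undefined == true');
    }
    this.network.leave(topic);
  }
}

function reproduce() {
  const manager = new ToyPartyManager(new ToyNetwork());
  const errors = [];
  manager.open('party-a'); // no properties item was ever stored for this party
  try {
    manager.unsubscribe('party-a'); // first error: the swarm is never left
  } catch (err) {
    errors.push(err.message);
  }
  try {
    manager.open('party-a'); // second error: resubscribing joins the swarm again
  } catch (err) {
    errors.push(err.message);
  }
  return errors;
}
```

Calling `reproduce()` returns both error messages in order, the assertion failure followed by the "Already joined swarm" error, which matches the first two errors reported in this issue.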

telackey commented 4 years ago

From slack:

@rzadp I just published @dxos/party-manager@1.0.1-beta.20. It has a fix for a race condition. So far so good for me; can you try again as well?

rzadp commented 4 years ago

@telackey I have tried out the newest version, which is 1.0.1-beta.21.

Yeah, most of the issues seem to be gone, except one. I have recorded it; it's at 1:50.

https://drive.google.com/file/d/1rVzOSAEusMMqqk24RPMmcaIlDGrpflmU/view?usp=sharing

rzadp commented 4 years ago

@telackey Also, I have merged the branch into master, so please look at the current master of teamwork for this.

telackey commented 4 years ago

From slack:


So it looked like the test [that failed] was:

  1. reloading
  2. toggling unsubscribe/resubscribe as quickly as possible
  3. reloading again
  4. toggling again as quickly as possible
  5. repeating the above to see if anything breaks

Is that correct?

Yes, that was basically the test. But not really toggling "as quickly as possible": I believe I click resub only after the party displays as gray, and only click unsub if the party is not gray (party being gray means we read its state as unsubscribed).

telackey commented 4 years ago

I think the question here is how carefully we want to prevent an error like this, and at what level.

As things stand, we begin loading and opening parties from a multi-feed based stream, and it is quite possible to initialize a party before reading that its subscription state is now false, at which point, if it happens to have been opened, we will close it. But closing is triggered by an event independent of the UI, and it also takes a variable amount of time.
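The ordering problem described above can be sketched as a replay over messages in arrival order. This is only an illustration: the message shapes (`type`, `topic`, `subscribed`) and the `replay` helper are hypothetical, since the real feed messages are not shown in this thread.

```javascript
// Hypothetical message shapes; the real feed messages are not shown in this thread.
function replay(messages) {
  const open = new Set();
  const log = [];
  for (const message of messages) {
    if (message.type === 'party') {
      // The party is initialized and opened as soon as it is seen.
      open.add(message.topic);
      log.push(`open ${message.topic}`);
    } else if (message.type === 'settings' && message.subscribed === false) {
      // The unsubscribed state arrives later, so an already-open party is closed.
      if (open.delete(message.topic)) {
        log.push(`close ${message.topic}`);
      }
    }
  }
  return log;
}

const log = replay([
  { type: 'party', topic: 'party-a' },
  { type: 'settings', topic: 'party-a', subscribed: false }
]);
console.log(log); // [ 'open party-a', 'close party-a' ]
```

Because the settings message is read after the party message, the party is briefly open even though it is unsubscribed, and the close that follows is what overlaps with any later resubscribe.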

What I think is happening in this last scenario is a case where:

  1. The party was initialized and opened.
  2. The subscription=false state was read, and the party closed.
  3. The party was marked closed, but the network-manager was still leaving/closing the associated swarm.
  4. 'Resubscribe' was clicked, requesting that network-manager join the swarm it was still closing.

There are a few ways we might fix this, all of which basically involve deciding when is the right time to trigger an event (e.g., in this case, should the update event on the PartyInfo object have been triggered only after obtaining some status from network-manager that it had finished closing?).
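The four steps above, plus one possible way of waiting for the close to finish, can be sketched as follows. Everything here is an assumption for illustration: `ToySwarmNetwork`, `joinWhenReady`, and the 20 ms delay are hypothetical and are not the real network-manager API or the fix that was eventually shipped.

```javascript
// Hypothetical sketch: ToySwarmNetwork is not the real network-manager API.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

class ToySwarmNetwork {
  constructor() {
    this.joined = new Set();
    this.leaving = new Map(); // topic -> promise that settles once the leave finishes
  }

  // Naive join: fails while a leave for the same topic is still in flight.
  join(topic) {
    if (this.joined.has(topic)) {
      throw new Error(`Already joined swarm: ${topic}`);
    }
    this.joined.add(topic);
  }

  // Leaving takes time; the topic stays joined until the leave completes.
  leave(topic) {
    const pending = delay(20).then(() => {
      this.joined.delete(topic);
      this.leaving.delete(topic);
    });
    this.leaving.set(topic, pending);
    return pending;
  }

  // One possible fix: wait for any in-flight leave before joining again.
  async joinWhenReady(topic) {
    await this.leaving.get(topic); // awaiting undefined is a no-op
    this.join(topic);
  }
}

async function demo() {
  const network = new ToySwarmNetwork();
  const topic = 'party-a';

  network.join(topic);  // step 1: party initialized and opened
  network.leave(topic); // steps 2-3: party marked closed, swarm still closing

  let naiveError = null;
  try {
    network.join(topic); // step 4: 'resubscribe' clicked too early
  } catch (err) {
    naiveError = err.message;
  }

  await network.joinWhenReady(topic); // resolves after the leave completes
  return { naiveError, rejoined: network.joined.has(topic) };
}
```

In this sketch the naive `join` fails with "Already joined swarm" while the leave is pending, whereas `joinWhenReady` succeeds once it has completed. The alternative raised above, delaying the PartyInfo update event until the close has finished, would move the same wait to the event side instead of the join side.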

I am also not sure of the priority of controlling for all such cases right now. @dboreham ?

dboreham commented 4 years ago

Question: can this issue be entirely worked around by "just not clicking quickly"?

telackey commented 4 years ago

I was about to say, "I think so", but I finally managed to reproduce it, and though I'm not sure why, it doesn't seem just about timing.

In my case I had three parties. I unsubscribed all three. Of those three, if I refresh, I can click 'subscribe' on two of them ASAP with no issue. The third fails with this error even after waiting a few seconds. It doesn't matter which order I click them in. The parties that work always work; the party that gives this "already joined" error always gives it.

telackey commented 4 years ago

Always may be too strong a term, because if I refresh and try it several times, it mostly works. But if it happens to give the error, it is consistently that party.

telackey commented 4 years ago

https://github.com/dxos/halo/pull/22

telackey commented 4 years ago

This should be fixed in v1.0.1-beta.24.

telackey commented 4 years ago

@rzadp ^