okTurtles / group-income

A decentralized and private (end-to-end encrypted) financial safety net for you and your friends.
https://groupincome.org
GNU Affero General Public License v3.0

Problems joining group over slow mobile connection #2368

Closed taoeffect closed 1 month ago

taoeffect commented 1 month ago

Problem

While trying to test PR #2367 using the ssh -R 80:localhost:8000 nokey@localhost.run command (plus grunt dev), I wasn't able to join a group on my phone.

I encountered two different errors each time I tried joining. The first time I encountered an error similar to the screenshot below, but it said something about there being an error trying to get the server time. The second time it said this:

[screenshot: a]

After this error, I was stuck on the Pending Joining group page.

[screenshot: d]

Attempting to re-join using the link didn't fix anything. In fact it told me I was already part of the group:

[screenshot: c]

I didn't get the logs for the first server time error, but I think I managed to get them for the second (JSON-related) error: [screenshot: b]

📎 gi_logs.json.txt

Solution

Investigate what the bug is, and try to make sure that the joining process is robust to error.

I'm guessing that the problem is related to the much slower connection.

It should be robust to both these errors:

BTW, to test this you will need to use that localhost.run service. Just run that command and then follow the instructions to sign up for an account. You just need an SSH key and maybe an email (it's free). Then re-run the command, I think.

Note that this could be an issue also with the connection dropping sometimes in the middle of the join, as sometimes I experienced the tunnel re-creating a new random endpoint.

corrideat commented 1 month ago

BTW, to test this you will need to use that localhost.run service.

This issue is real, but I wonder if it could be related to the service. For example, it could randomly drop connections or terminate responses early. I'd test this using a proper rate-limited connection (e.g., using the web browser tools or a virtual NIC with a slow network).

corrideat commented 1 month ago

Adding to this issue:

  1. The error mentioned ("Unexpected end of JSON input") happens when JSON.parse is called with an empty string (JSON.parse('')). I don't know why this happens, but it could be either a bug in the app (or the server), or something with the tunnel itself (like it prematurely closing connections).
  2. I haven't yet been able to reproduce this particular issue (using the setup suggested).
  3. I was, however, able to reproduce the error related to fetching server time. While maybe we should fix this somehow (e.g., some backoff), it's not a critical error.
  4. I also got various instances of the 'reconnecting' banner, which is an open issue. I don't know why (again, this could be the existing known issue with the app, or something the tunnel is doing).
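
To illustrate point 1, here is a minimal sketch of a parsing guard that tells an empty body apart from a malformed one, so the caller can report something more useful than a bare SyntaxError. `parseJsonBody` is a hypothetical helper written for this example; it is not taken from the Group Income codebase.

```javascript
// Hypothetical helper (an assumption for illustration, not the app's code).
// Returns a tagged result instead of throwing, so callers can decide
// whether an empty body should be retried or reported.
function parseJsonBody (text) {
  if (typeof text !== 'string' || text.trim() === '') {
    return { ok: false, reason: 'empty' }
  }
  try {
    return { ok: true, value: JSON.parse(text) }
  } catch (e) {
    return { ok: false, reason: 'malformed', error: e }
  }
}

// JSON.parse('') throws a SyntaxError rather than returning undefined.
let thrown = null
try {
  JSON.parse('')
} catch (e) {
  thrown = e
}
console.log(thrown instanceof SyntaxError) // true

console.log(parseJsonBody('').reason) // 'empty'
console.log(parseJsonBody('{"a":').reason) // 'malformed'
console.log(parseJsonBody('{"a":1}').value.a) // 1
```

The exact error message text varies between JavaScript engines, so the sketch checks the error type rather than the message.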

taoeffect commented 1 month ago

@corrideat yeah, it might be something the tunnel is doing, but ultimately nothing that the tunnel does should result in a permanently broken state in the app. The app should be able to recover upon page refresh, but it doesn't. While testing this myself I was eventually able to join the group after trying enough times in a fresh context (i.e. a fresh random URL -> fresh context). But the broken ones never recovered.

corrideat commented 1 month ago

The app should be able to recover upon page refresh, but it doesn't

I think we've discussed this in another similar issue (see https://github.com/okTurtles/group-income/issues/2183). While there may be issues to fix here (unsure, unable to know exactly without reproducing), some issues we can't recover from without re-syncing contracts.

Now, in this case, what appears to have happened is that some call or calls to gi.actions/xxx failed. Those are difficult to recover from since we have no mechanism for doing that at the moment. Those either happen in side-effects (which are run once) or in actions after some user input (which are also run once).

When it comes to joining a group, it's a fairly complex process with many actions. I've already tried to reduce the number of outgoing messages, but maybe more can be done. However, the issue is that fixing errors sending these would require either storing actions and re-trying later (a significant refactor) or maybe discarding the group if the process fails.
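
As a rough idea of what "storing actions and re-trying later" could look like, here is a minimal sketch of an outbox that keeps unsent actions until they are confirmed. Everything in it is an assumption for illustration: `ActionOutbox`, the `send` callback, and the `Map` standing in for persistent storage are hypothetical and do not reflect how gi.actions calls are actually dispatched.

```javascript
// Hypothetical sketch, not the app's design. `send` is any async function
// that rejects when delivery fails; `storage` stands in for a persistent
// store (a plain Map here, so nothing survives a page reload).
class ActionOutbox {
  constructor (send, storage = new Map()) {
    this.send = send
    this.storage = storage
    // Continue numbering after any entries already in storage.
    this.nextId = Math.max(0, ...storage.keys()) + 1
  }

  // Record the action before attempting to send it.
  enqueue (action) {
    const id = this.nextId++
    this.storage.set(id, action)
    return id
  }

  // Send pending actions in insertion order. Stops at the first failure
  // so later actions are never delivered ahead of earlier ones.
  async flush () {
    for (const [id, action] of [...this.storage]) {
      try {
        await this.send(action)
      } catch (e) {
        return { done: false, pending: this.storage.size, error: e }
      }
      this.storage.delete(id)
    }
    return { done: true, pending: 0 }
  }
}
```

A caller would `enqueue` each step of the join and call `flush` again after a reconnect or page refresh. One caveat with this approach: if a request succeeds but its response is lost, the action is sent twice, so the receiving side would need to tolerate duplicates.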

TL;DR: We need to fix this, but it's several things to fix, and some of them don't have an obvious solution.

but ultimately nothing that the tunnel does should result in a permanently broken state in the app

Not sure if it applies to this situation, but I disagree. The tunnel is essentially a MITM that can send invalid responses (or say, as may be the case here, truncate responses). As we rely on a server, not sure we can do much about this, other than failing more gracefully or getting rid of the server.

taoeffect commented 1 month ago

The tunnel is essentially a MITM that can send invalid responses (or say, as may be the case here, truncate responses).

The app must be coded in a way that handles connection drops. It's a bug otherwise.
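
One common way to tolerate transient drops, such as the server time fetch failure mentioned earlier in this thread, is to retry with exponential backoff. The following is a minimal sketch under that assumption; `retryWithBackoff` and its option names are hypothetical, and the delays are arbitrary defaults rather than values from the app.

```javascript
// Hypothetical helper (an assumption for illustration, not the app's code).
// Calls `fn` until it resolves or the retry budget is exhausted, waiting a
// randomized, exponentially growing delay between attempts.
async function retryWithBackoff (fn, options = {}) {
  const {
    retries = 4, // number of retries after the first attempt
    baseDelayMs = 250,
    maxDelayMs = 5000,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
  } = options
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt)
    } catch (e) {
      lastError = e
      if (attempt === retries) break
      // Random delay in [0, cap) so many clients do not retry in lockstep.
      const cap = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt)
      await sleep(Math.random() * cap)
    }
  }
  throw lastError
}
```

This only helps with requests that are safe to repeat. It does not by itself address actions that run once in a side effect, which is the harder case discussed above.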

As we rely on a server, not sure we can do much about this, other than failing more gracefully or getting rid of the server.

For the case of errors in sideEffects, #2110 is a solution.

corrideat commented 1 month ago

The app must be coded in a way that handles connection drops. It's a bug otherwise.

Again, difficult to know without reproducing, but this doesn't look like a connection drop, rather like an invalid response. E.g., you fetch https://example.com/test.json and the response is `` (empty). If that's what's happening, the app is correctly handling it by saying it got an invalid response (it's a bug but on the server). Now, if there was a connection drop and the app interpreted that as an empty response, then that's a bug in the app.

(However, as you say, even if the server gives an invalid response, we should avoid the app being in a limbo state if we can help it.)
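
The distinction drawn above, between a dropped connection and a response that arrived but was invalid, can be sketched as a small wrapper around a fetch-style call. `fetchJsonClassified` and its result shape are hypothetical names for this example. Whether a tunnel that truncates a body shows up as a rejected promise or as a short body is an assumption that would need checking against the real setup.

```javascript
// Hypothetical helper (an assumption for illustration, not the app's code).
// `fetchImpl` is injectable so the function can be exercised without a
// network; by default it uses the global fetch.
async function fetchJsonClassified (url, fetchImpl = globalThis.fetch) {
  let response
  let body
  try {
    response = await fetchImpl(url)
    // Reading the body can also reject if the connection fails mid-transfer.
    body = await response.text()
  } catch (e) {
    // The request or the body read failed: treat as a connection problem.
    return { kind: 'transport-error', error: e }
  }
  if (!response.ok) {
    return { kind: 'http-error', status: response.status }
  }
  try {
    return { kind: 'ok', value: JSON.parse(body) }
  } catch (e) {
    // The server (or something in between) answered, but not with JSON.
    return { kind: 'invalid-response', bodyLength: body.length, error: e }
  }
}
```

With a result tagged this way, the caller could retry on `transport-error` and surface `invalid-response` as a server-side problem, instead of treating both as the same failure.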

taoeffect commented 1 month ago

While retrying on PR #2365, after signing up u1 I got stuck on the Pending joining group screen. EDIT: note, this is on desktop connected to localhost, not mobile via tunnel

In the console there was an error message about failing to set up the service worker, some issue with sendMsg I think (forgot to take a screenshot). EDIT: I was just able to reproduce it, though this time I was able to join successfully, even though this same error appeared right after sign up of u1:

Screenshot 2024-10-04 at 11 23 50 AM Screenshot 2024-10-04 at 11 23 53 AM

Then I refreshed the page and saw these errors:

Screenshot 2024-10-04 at 11 10 25 AM Screenshot 2024-10-04 at 11 10 34 AM

taoeffect commented 1 month ago

Hopefully closed in #2382