bookwyrm-social / bookwyrm

Social reading and reviewing, decentralized with ActivityPub
http://joinbookwyrm.com/
Other
2.23k stars 263 forks source link

Following a user on another instance silently fails #1687

Closed Tak closed 1 year ago

Tak commented 2 years ago

Describe the bug I tried to follow a user on another instance. It seemed like the follow had succeeded, but I'm not receiving any of their activity, and they don't show up in my "following" list. Overall, the behavior seems very similar to #895 .

To Reproduce Steps to reproduce the behavior:

Expected behavior I begin seeing their activity in my feed, and they show up in my following list.

Instance My instance: https://reading.taks.garden (bookwyrm bdf617d005d2c8948579170ccf345bdb016511b3) Their instance: https://bookwyrm.cincodenada.com

Additional context I've tried unfollowing and refollowing I also tried a remote follow / Follow on Fediverse, same result

Tak commented 2 years ago

The flower log from a follow request:

bookwyrm.models.activitypub_mixin.broadcast_task | 0eb16a01-a182-4dfd-8d85-a0269f533278 | SUCCESS | (2,  '{"id": "https://reading.taks.garden/user/Tak#follows/33", "type":  "Follow", "actor": "https://reading.taks.garden/user/Tak", "object":  "http://bookwyrm.cincodenada.com/user/WelshPixie", "@context":  "https://www.w3.org/ns/activitystreams"}',  ['http://bookwyrm.cincodenada.com/user/WelshPixie/inbox']) | {} | None | 2021-12-17 08:30:27.985 | 2021-12-17 08:30:28.008 | 1.241 | celery@27e881e21b4b
-- | -- | -- | -- | -- | -- | -- | -- | -- | --
mouse-reeve commented 2 years ago

The fact that they don't show up in your follow list suggests to me that the follow request went out, but you either never received the acceptance, or there was a problem parsing the accept that you did receive. A few things to check:

  1. Have you had similar issues following users on other instances of bookwyrm or mastodon?
  2. Do you see an Accept activity in your flower logs that corresponds to that request? It would come in as an inbox.activity_task
  3. Does the user you tried to follow appear in an approval pending state, where the follow button says "Undo follow request"? If so, does undoing the request and re-trying help?
  4. Does the user you tried to follow see you in their list of followers?
Tak commented 2 years ago
  1. No - I'm only following a handful of people, but I haven't seen any other issues
  2. No, I don't see a corresponding accept activity
  3. No, it just shows Unfollow
  4. Checking…
Tak commented 2 years ago
  1. No, it appears not
cincodenada commented 2 years ago

Hello! As you might guess by my username I'm the admin of the target instance here - thanks for raising the issue!

I can't dig in too much until later today but I hopped on to grab some logs and do a little poking around and it looks like we responded with a 405 to what I assume is your follow request, which suggests we may have bungled the response somehow:

bookwyrm.access.log:<ip removed> - - [17/Dec/2021:08:30:29 +0000] "GET /user/WelshPixie/inbox HTTP/1.1" 405 0 "-" "python-requests/2.22.0 (BookWyrm/0.1.0; +https://reading.taks.garden/)"

I tried logging into my own test account on bookwyrm.social and following myself and WelshPixie on bookwyrm.cincodenada.com and they seem to be in the same state as above - it looks like I'm following, but they don't show up in my following list.

I tried watching the logs on my end while following and unfollowing a few times in succession, and didn't see anything relevant come through, oddly. I wasn't able to do a remote follow from bookwyrm.social but I was from my Mastodon account, and I saw activity for those (without errors), but still no activity. I'll check back in later tonight and do some more digging when I have more time, but that's what I've got so far!

cincodenada commented 2 years ago

Oh! One more thing I just noticed that in your log the requests against bookwyrm.cincodenada.com is HTTP instead of HTTPS, and I did just explicitly turn HTTPS on a couple days ago...I wonder if something else needs to be adjusted on my end still?

cincodenada commented 2 years ago

Okay so I really do have to come back to this later, BUT I checked one last thing and confirmed that WelshPixie's user account had an http:// prefix for all the URL columns in my user table, presumably because it was created after I upgraded but before I set USE_HTTPS to true - it looks like commit 247a7f74896e made it so that that setting actually affected the user in the database, where previously it unconditionally used https://.

I've updated the URLs to use HTTPS now, but I still can't follow from my bookwyrm.social account - I don't know if this is actually an issue or just a red herring, but it's possible that the remote databases still have the http:// URL for their copy of the user, which could possibly be causing issues. All conjecture at this point though!

This evening (UTC-8) I'll create a new test user and see if I can follow it from bookwyrm.social, to see if new users have any better luck. If y'all wanna try in the meantime, here's a invite URL you can use to create users on bookwrym.cincodenada.com: https://bookwyrm.cincodenada.com/invite/XXFTNVFY

Tak commented 2 years ago

That does make sense, given the http 405 response

cincodenada commented 2 years ago

A couple more datapoints from a little more poking over lunch: I think the 405 response is actually correct, because GET is not a valid method for /user/UserName/inbox - the only handler at that endpoint is for a POST. Also I don't think the http things should be an issue unless the requestor isn't following redirects - I tested requesting http://bookwyrm.cincodenada.com/user/WelshPixie/inbox and correctly got a 301 to the https version, which then returned a 405 (because it was a GET).

Separately: I need to read up on ActivityPub more, but I imagine there's likely to be some sort of announcement I can make from my end that will clue remote servers in about updating the user profile (and thus the URLs) - does anyone here know what that is off the top of their head?

mouse-reeve commented 2 years ago

Oh yes, the http/https situation does make sense! Thank you for looking into it, @cincodenada.

Separately: I need to read up on ActivityPub more, but I imagine there's likely to be some sort of announcement I can make from my end that will clue remote servers in about updating the user profile (and thus the URLs) - does anyone here know what that is off the top of their head?

When a user profile gets saved, it sends an Update activity, but I'm not sure how that would handle changing the urls

cincodenada commented 2 years ago

So I've been digging into this and got a little sidetracked by other things but here's where I landed:

Strangely, when I try to follow my account from reading.taks.garden, my server POST to my inbox as expected, but when I try to follow WelshPixie's account, I consistently get a GET to my inbox (which of course doesn't work and returns a 405).

@Tak would you mind checking what WelshPixie's user record looks like in your database? Either via the db shell, or the ./bw-dev shell python shell:

>>> from bookwyrm.models import User
>>> from django.core import serializers
>>> serializers.serialize('json', User.objects.filter(name="WelshPixie"))

Maybe email the output of a db query or that script to me at bookwyrm@cincodenada.com? It doesn't have anything sensitive I don't think (local users have a hashed password, but I don't think remote users have even that), but no reason to be splatting it into a Github issue either.

cincodenada commented 2 years ago

Okay, another update: @Tak sent me the user record and as I suspected, WelshPixie's account has http:// for all its endpoints, which are no longer valid. I've been doing some testing between my instance and a local instance, and it turns out that the GET request is because POST requests can't be redirected - so reading.taks.garden tries to hit my server at http://, gets redirected to https://, and then tries to GET the inbox endpoint (because POST can't follow the redirect), which fails.

I've been trying to figure out if I can just trigger a user update to update the URLs to https - I think I should be able to, I just haven't quite gotten my ducks in a row yet to see if that has the effect I want. Worst case I think if the remote instances delete their user record for WelshPixie it should reload correctly, but I'd like to clean things up from my end if I can, so don't do anything from y'all's end quite yet. Thanks!

cincodenada commented 2 years ago

Also, as noted in the title, the failure is one part of the issue, but another I think is the silence of the failure - maybe we could be louder about it if the POST to the followee's inbox fails (as it is in this case), for instance? I'm not sure what, like, the prescribed behavior there is.

cincodenada commented 2 years ago

Okay so I didn't have time to dig too deep tonight but with a little creative use of the interactive shell, I was able to send an update to reading.taks.garden for WelshPixie's user, which should have patched up the bad inbox URLs. I was then able to follow WelshPixie from my test account on reading.taks.garden, so hopefully I've resolved the immediate issue!

@Tak, can you try following WelshPixie from your instance again? Just unfollow if necessary and refollow, you shouldn't have to do anything else I don't think.

The root of the issue here was that my server was briefly misconfigured such that it broadcast http:// URLs as inbox URLs, but wasn't actually accessible via HTTP. I'd like to think a bit more about how we might be more robust about this, and if it's possible to recover from this kind of thing automatically.

So this is my best understanding of where the disconnect was:

To break this loop, in the console I hotpatched User.get_recipients to return just https://reading.taks.garden/inbox and then broadcast a user update, to force it to send an update to reading.taks.garden.

My very sketchy thoughts so far on how we could automatically recover from something like this: on startup, could we check for any local users whose inbox/outbox URLs are out of sync with the instance URL, and if so, update the user and then broadcast a user update to all the servers we've seen? I imagine we'd run into scaling problems perhaps, updating all users at once like that, but for small instances (where these kinds of issues are more likely to crop up), it seems doable.

Also note this is just the inbox, shared_inbox, outbox, and maybe followers_url fields - I think we want to leave remote_id alone because it's the, well, id of the user. WelshPixie's remote_id still has http in it, and I think that's fine and as it should be, if I understand correctly? remote_id shouldn't actually be used as a URL anywhere, right, it's just an identifier?

Tak commented 2 years ago

Unfollow/refollow does appear to have worked now. @mouse-reeve Should I close the issue, or is there something better bookwyrm could be doing to handle this edge case?

cincodenada commented 2 years ago

Great, thanks for checking on that! Hurrah!

I did a little more digging to answer my question about remote_id and found out I'm quite incorrect there, in that ActivityPub is apparently actually quite specific about id's being URLs. In order to avoid propagating the error further I went ahead and updated WelshPixie's user id and sent out an update which, to my surprise, just worked! I had feared it might slightly break things on your end, but upon reading the code more closely, it worked because when receiving updates, BookWyrm will update a user as long as any of the unique remote ID fields (inbox, outbox, follower_url) match - so even though remote_id had changed, the other URLs were still the same, so it happily updated the remote_id. Rad! I doubt this is the use case that was in mind when setting it up that way, but it came in handy. I was also able to update the PrivateKey id, and verified that updates are still being published, so I didn't break anything there. Phew!

As for lingering things: I think now that the immediate situation is resolved, this issue could move towards at least signaling to the user that something went wrong if a follow request fails, focusing on the "silent" side of things if we want?

The remote_id thing complicates my scheming about resolving this kind of situation automatically, because you'd have to do a two-step process, since I don't think we can update them all at once, or there won't be anything to match to the remote user...anyway, if we think that's worth pursuing I think it should be spun out into a separate ticket.

One last bit of cleanup here: I'm curious what bookwyrm.cincodenada.com's connector record looks like in your database, @Tak - I suspect it also has http URLs, which I guess wouldn't break anything because the GETs will follow the 301s, but may be worth updating them to https anyway. Here's the script to get that record, since I had it handy from my own digging:

>>> from bookwyrm.models import Connector
>>> from django.core import serializers
>>> serializers.serialize("json", models.Connector.objects.filter(base_url__contains="reading.taks.garden"))
Tak commented 2 years ago
[{
  "model": "bookwyrm.connector", 
  "pk": 17, 
  "fields": {
    "created_date": "2021-12-13T08:41:39.070Z", 
    "updated_date": "2021-12-13T08:41:39.073Z", 
    "remote_id": "https://reading.taks.garden/connector/17", 
    "identifier": "bookwyrm.cincodenada.com", 
    "priority": 2, 
    "name": null, 
    "connector_file": "bookwyrm_connector", 
    "api_key": null, 
    "active": true, 
    "deactivation_reason": null, 
    "base_url": "https://bookwyrm.cincodenada.com", 
    "books_url": "https://bookwyrm.cincodenada.com/book", 
    "covers_url": "https://bookwyrm.cincodenada.com/images/covers", 
    "search_url": "https://bookwyrm.cincodenada.com/search?q=", 
    "isbn_search_url": null
  }
}]
cincodenada commented 2 years ago

Okay, that actually looks fine, cool, thanks! I think that wraps up the last of the loose ends with this particular instance, with state of this ticket TBD

mouse-reeve commented 2 years ago

Thank you for debugging this! I haven't had a chance to read everything fully and figure out how to avoid this problem in the future, but I'll take a look after christmas

mouse-reeve commented 1 year ago

I'm going to close this because it's very old and we've done a lot of bug fixes since then. If there are further problems, they would warrant new issues