Closed Tak closed 1 year ago
The flower log from a follow request:
bookwyrm.models.activitypub_mixin.broadcast_task | 0eb16a01-a182-4dfd-8d85-a0269f533278 | SUCCESS | (2, '{"id": "https://reading.taks.garden/user/Tak#follows/33", "type": "Follow", "actor": "https://reading.taks.garden/user/Tak", "object": "http://bookwyrm.cincodenada.com/user/WelshPixie", "@context": "https://www.w3.org/ns/activitystreams"}', ['http://bookwyrm.cincodenada.com/user/WelshPixie/inbox']) | {} | None | 2021-12-17 08:30:27.985 | 2021-12-17 08:30:28.008 | 1.241 | celery@27e881e21b4b
-- | -- | -- | -- | -- | -- | -- | -- | -- | --
The fact that they don't show up in your follow list suggests to me that the follow request went out, but you either never received the acceptance, or there was a problem parsing the accept that you did receive. A few things to check:
inbox.activity_task
Unfollow
Hello! As you might guess by my username I'm the admin of the target instance here - thanks for raising the issue!
I can't dig in too much until later today but I hopped on to grab some logs and do a little poking around and it looks like we responded with a 405 to what I assume is your follow request, which suggests we may have bungled the response somehow:
bookwyrm.access.log:<ip removed> - - [17/Dec/2021:08:30:29 +0000] "GET /user/WelshPixie/inbox HTTP/1.1" 405 0 "-" "python-requests/2.22.0 (BookWyrm/0.1.0; +https://reading.taks.garden/)"
I tried logging into my own test account on bookwyrm.social
and following myself and WelshPixie on bookwyrm.cincodenada.com
and they seem to be in the same state as above - it looks like I'm following, but they don't show up in my following list.
I tried watching the logs on my end while following and unfollowing a few times in succession, and didn't see anything relevant come through, oddly. I wasn't able to do a remote follow from bookwyrm.social
but I was from my Mastodon account, and I saw activity for those (without errors), but still no activity. I'll check back in later tonight and do some more digging when I have more time, but that's what I've got so far!
Oh! One more thing I just noticed that in your log the requests against bookwyrm.cincodenada.com
is HTTP instead of HTTPS, and I did just explicitly turn HTTPS on a couple days ago...I wonder if something else needs to be adjusted on my end still?
Okay so I really do have to come back to this later, BUT I checked one last thing and confirmed that WelshPixie's user account had an http://
prefix for all the URL columns in my user table, presumably because it was created after I upgraded but before I set USE_HTTPS
to true
- it looks like commit 247a7f74896e made it so that that setting actually affected the user in the database, where previously it unconditionally used https://
.
I've updated the URLs to use HTTPS now, but I still can't follow from my bookwyrm.social
account - I don't know if this is actually an issue or just a red herring, but it's possible that the remote databases still have the http://
URL for their copy of the user, which could possibly be causing issues. All conjecture at this point though!
This evening (UTC-8) I'll create a new test user and see if I can follow it from bookwyrm.social
, to see if new users have any better luck. If y'all wanna try in the meantime, here's a invite URL you can use to create users on bookwrym.cincodenada.com: https://bookwyrm.cincodenada.com/invite/XXFTNVFY
That does make sense, given the http 405 response
A couple more datapoints from a little more poking over lunch: I think the 405 response is actually correct, because GET is not a valid method for /user/UserName/inbox
- the only handler at that endpoint is for a POST. Also I don't think the http things should be an issue unless the requestor isn't following redirects - I tested requesting http://bookwyrm.cincodenada.com/user/WelshPixie/inbox
and correctly got a 301 to the https version, which then returned a 405 (because it was a GET).
Separately: I need to read up on ActivityPub more, but I imagine there's likely to be some sort of announcement I can make from my end that will clue remote servers in about updating the user profile (and thus the URLs) - does anyone here know what that is off the top of their head?
Oh yes, the http/https situation does make sense! Thank you for looking into it, @cincodenada.
Separately: I need to read up on ActivityPub more, but I imagine there's likely to be some sort of announcement I can make from my end that will clue remote servers in about updating the user profile (and thus the URLs) - does anyone here know what that is off the top of their head?
When a user profile gets saved, it sends an Update
activity, but I'm not sure how that would handle changing the urls
So I've been digging into this and got a little sidetracked by other things but here's where I landed:
WelshPixie@cincodenada.bookwyrm.social
from that just fine. I have not been able to unfollow, howeverreading.taks.garden
. I am able to follow and unfollow my account (cincodenada@bookwyrm.cincodenada.com
) but have not been able to follow WelshPixie@bookwyrm.cincodenada.com
.Strangely, when I try to follow my account from reading.taks.garden
, my server POST to my inbox as expected, but when I try to follow WelshPixie's account, I consistently get a GET to my inbox (which of course doesn't work and returns a 405).
@Tak would you mind checking what WelshPixie's user record looks like in your database? Either via the db shell, or the ./bw-dev shell
python shell:
>>> from bookwyrm.models import User
>>> from django.core import serializers
>>> serializers.serialize('json', User.objects.filter(name="WelshPixie"))
Maybe email the output of a db query or that script to me at bookwyrm@cincodenada.com? It doesn't have anything sensitive I don't think (local users have a hashed password, but I don't think remote users have even that), but no reason to be splatting it into a Github issue either.
Okay, another update: @Tak sent me the user record and as I suspected, WelshPixie's account has http://
for all its endpoints, which are no longer valid. I've been doing some testing between my instance and a local instance, and it turns out that the GET
request is because POST
requests can't be redirected - so reading.taks.garden
tries to hit my server at http://
, gets redirected to https://
, and then tries to GET
the inbox endpoint (because POST
can't follow the redirect), which fails.
I've been trying to figure out if I can just trigger a user update to update the URLs to https - I think I should be able to, I just haven't quite gotten my ducks in a row yet to see if that has the effect I want. Worst case I think if the remote instances delete their user record for WelshPixie it should reload correctly, but I'd like to clean things up from my end if I can, so don't do anything from y'all's end quite yet. Thanks!
Also, as noted in the title, the failure is one part of the issue, but another I think is the silence of the failure - maybe we could be louder about it if the POST to the followee's inbox fails (as it is in this case), for instance? I'm not sure what, like, the prescribed behavior there is.
Okay so I didn't have time to dig too deep tonight but with a little creative use of the interactive shell, I was able to send an update to reading.taks.garden
for WelshPixie's user, which should have patched up the bad inbox URLs. I was then able to follow WelshPixie from my test account on reading.taks.garden
, so hopefully I've resolved the immediate issue!
@Tak, can you try following WelshPixie from your instance again? Just unfollow if necessary and refollow, you shouldn't have to do anything else I don't think.
The root of the issue here was that my server was briefly misconfigured such that it broadcast http://
URLs as inbox URLs, but wasn't actually accessible via HTTP. I'd like to think a bit more about how we might be more robust about this, and if it's possible to recover from this kind of thing automatically.
So this is my best understanding of where the disconnect was:
reading.taks.garden
had a user record for WelshPixie with unreachable http URLs, which it got during the period my server was misconfiguredreading.taks.garden
, because no one on reading.taks.garden
was following WelshPixie (because they couldn't!)To break this loop, in the console I hotpatched User.get_recipients
to return just https://reading.taks.garden/inbox
and then broadcast a user update, to force it to send an update to reading.taks.garden
.
My very sketchy thoughts so far on how we could automatically recover from something like this: on startup, could we check for any local users whose inbox/outbox URLs are out of sync with the instance URL, and if so, update the user and then broadcast a user update to all the servers we've seen? I imagine we'd run into scaling problems perhaps, updating all users at once like that, but for small instances (where these kinds of issues are more likely to crop up), it seems doable.
Also note this is just the inbox
, shared_inbox
, outbox
, and maybe followers_url
fields - I think we want to leave remote_id
alone because it's the, well, id of the user. WelshPixie
's remote_id
still has http
in it, and I think that's fine and as it should be, if I understand correctly? remote_id
shouldn't actually be used as a URL anywhere, right, it's just an identifier?
Unfollow/refollow does appear to have worked now. @mouse-reeve Should I close the issue, or is there something better bookwyrm could be doing to handle this edge case?
Great, thanks for checking on that! Hurrah!
I did a little more digging to answer my question about remote_id
and found out I'm quite incorrect there, in that ActivityPub is apparently actually quite specific about id's being URLs. In order to avoid propagating the error further I went ahead and updated WelshPixie's user id and sent out an update which, to my surprise, just worked! I had feared it might slightly break things on your end, but upon reading the code more closely, it worked because when receiving updates, BookWyrm will update a user as long as any of the unique remote ID fields (inbox, outbox, follower_url) match - so even though remote_id
had changed, the other URLs were still the same, so it happily updated the remote_id
. Rad! I doubt this is the use case that was in mind when setting it up that way, but it came in handy. I was also able to update the PrivateKey id, and verified that updates are still being published, so I didn't break anything there. Phew!
As for lingering things: I think now that the immediate situation is resolved, this issue could move towards at least signaling to the user that something went wrong if a follow request fails, focusing on the "silent" side of things if we want?
The remote_id thing complicates my scheming about resolving this kind of situation automatically, because you'd have to do a two-step process, since I don't think we can update them all at once, or there won't be anything to match to the remote user...anyway, if we think that's worth pursuing I think it should be spun out into a separate ticket.
One last bit of cleanup here: I'm curious what bookwyrm.cincodenada.com
's connector record looks like in your database, @Tak - I suspect it also has http URLs, which I guess wouldn't break anything because the GETs will follow the 301s, but may be worth updating them to https anyway. Here's the script to get that record, since I had it handy from my own digging:
>>> from bookwyrm.models import Connector
>>> from django.core import serializers
>>> serializers.serialize("json", models.Connector.objects.filter(base_url__contains="reading.taks.garden"))
[{
"model": "bookwyrm.connector",
"pk": 17,
"fields": {
"created_date": "2021-12-13T08:41:39.070Z",
"updated_date": "2021-12-13T08:41:39.073Z",
"remote_id": "https://reading.taks.garden/connector/17",
"identifier": "bookwyrm.cincodenada.com",
"priority": 2,
"name": null,
"connector_file": "bookwyrm_connector",
"api_key": null,
"active": true,
"deactivation_reason": null,
"base_url": "https://bookwyrm.cincodenada.com",
"books_url": "https://bookwyrm.cincodenada.com/book",
"covers_url": "https://bookwyrm.cincodenada.com/images/covers",
"search_url": "https://bookwyrm.cincodenada.com/search?q=",
"isbn_search_url": null
}
}]
Okay, that actually looks fine, cool, thanks! I think that wraps up the last of the loose ends with this particular instance, with state of this ticket TBD
Thank you for debugging this! I haven't had a chance to read everything fully and figure out how to avoid this problem in the future, but I'll take a look after christmas
I'm going to close this because it's very old and we've done a lot of bug fixes since then. If there are further problems, they would warrant new issues
Describe the bug I tried to follow a user on another instance. It seemed like the follow had succeeded, but I'm not receiving any of their activity, and they don't show up in my "following" list. Overall, the behavior seems very similar to #895 .
To Reproduce Steps to reproduce the behavior:
Expected behavior I begin seeing their activity in my feed, and they show up in my following list.
Instance My instance: https://reading.taks.garden (bookwyrm bdf617d005d2c8948579170ccf345bdb016511b3) Their instance: https://bookwyrm.cincodenada.com
Additional context I've tried unfollowing and refollowing I also tried a remote follow /
Follow on Fediverse
, same result