snarfed / bridgy-fed

🌉 A bridge between decentralized social network protocols
https://fed.brid.gy/
Creative Commons Zero v1.0 Universal

Redirect fed.brid.gy/r/ requests when id is on other subdomain #1160

Open silverpill opened 6 days ago

silverpill commented 6 days ago

Example of a request:

curl -H "Accept: application/activity+json" https://fed.brid.gy/r/https://bsky.app/profile/did:plc:3ljmtyyjqcjee2kpewgsifvb/post/3kvcdqj4vik2f  

The server returns an object with ID https://bsky.brid.gy/convert/ap/at://did:plc:3ljmtyyjqcjee2kpewgsifvb/app.bsky.feed.post/3kvcdqj4vik2f, whose domain differs from the domain the object is served from.

Normally, if an object is served from a different domain (origin) than the one in its ID, the client should raise an authentication error in order to prevent impersonation attacks. In this case only the subdomain is different, but maybe this should still be treated as a violation of the same-origin rule. What do you think?

In HTTP, subdomains are treated as different origins: https://en.wikipedia.org/wiki/Same-origin_policy
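
For illustration, here is a minimal Python sketch of the browser-style origin comparison, where an origin is the (scheme, host, port) tuple. The helper names `origin` and `same_origin` are made up for this example and do not come from any ActivityPub library; whether AP clients should apply exactly this rule is the open question in this issue.

```python
from urllib.parse import urlsplit

# Default ports, so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


served_from = ("https://fed.brid.gy/r/https://bsky.app/profile/"
               "did:plc:3ljmtyyjqcjee2kpewgsifvb/post/3kvcdqj4vik2f")
object_id = ("https://bsky.brid.gy/convert/ap/at://"
             "did:plc:3ljmtyyjqcjee2kpewgsifvb/app.bsky.feed.post/3kvcdqj4vik2f")

# The hosts differ (fed.brid.gy vs bsky.brid.gy), so this prints False.
print(same_origin(served_from, object_id))
```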

snarfed commented 6 days ago

Hmm, thanks for looking at this! AP authorization is important, underappreciated, and obviously underspecified. 😕

A couple thoughts. First, you're right, fed.brid.gy does serve that object even though its id is on bsky.brid.gy. It might be better to 301 or 302 redirect to the bsky.brid.gy URL instead. I can consider that!

AP implementations also can and should liberally re-fetch objects and activities from their ids. In this case, if you fetch a fed.brid.gy URL via AP and it serves an object with id on bsky.brid.gy, re-fetching it from that bsky.brid.gy URL instead of trusting it is probably a good idea.
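
A rough Python sketch of that re-fetch idea, for illustration only. The function name `fetch_verified`, the `max_hops` limit, and the caller-supplied `fetch` callable (which would do the actual HTTP GET with an `application/activity+json` Accept header) are all assumptions of this example, not anything Bridgy Fed or another implementation is known to ship.

```python
from urllib.parse import urlsplit


def fetch_verified(url, fetch, max_hops=3):
    """Fetch an AP object, re-fetching from its id when the id's host differs.

    `fetch` is a hypothetical caller-supplied function: it takes a URL and
    returns the parsed JSON object as a dict.
    """
    for _ in range(max_hops):
        obj = fetch(url)
        obj_id = obj.get("id")
        if not obj_id:
            raise ValueError(f"object fetched from {url} has no id")
        if urlsplit(obj_id).hostname == urlsplit(url).hostname:
            return obj  # served by the same host that its id names
        # The id is on another host: discard this copy and ask that host.
        url = obj_id
    raise ValueError("too many hops while resolving object id")


# Usage with an in-memory stand-in for the network (example URLs only).
store = {
    "https://fed.brid.gy/r/x": {
        "id": "https://bsky.brid.gy/convert/ap/x", "content": "wrapper copy"},
    "https://bsky.brid.gy/convert/ap/x": {
        "id": "https://bsky.brid.gy/convert/ap/x", "content": "canonical copy"},
}
result = fetch_verified("https://fed.brid.gy/r/x", store.__getitem__)
print(result["content"])
```

The object returned is always the copy served by the host named in its own id, at the cost of one extra request per mismatch.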

Otherwise, the naming overlap of browsers' same-origin policy and AP's "same origin" phrase is obviously unfortunate. Afaik the browser policy doesn't really apply to AP, and AP itself only barely mentions host/domain origins in passing, for Update and Delete, and even then only as "at minimum...may..."

Also, my understanding is that AP isn't really host- or domain-centric. It doesn't require or expect that actors and their inboxes, activities, objects, etc are on the same origins. Regardless, the fediverse itself has converged on expecting this in many cases. Notably, Mastodon requires both object id and url to be on the same host as their actor, or at least did back in 2018, which is why we originally added these fed.brid.gy/r/ wrapper URLs. 😕 Ideally, as a bridge service, the "correct" url would be the original post (object) on its original service! But no such luck. We had to wrap it for interop.

Anyway, back to authorization. Post working group and AP 1.0, the W3C (eg SWICG) has seemed to settle primarily on actor/object ownership for AP authorization. Specifically, from https://www.w3.org/wiki/ActivityPub/Primer/Authentication_Authorization#Authorization , Activities must be authenticated by their actor, via eg HTTP Sigs or LD Sigs, and Activities that create or modify an object - Create, Update, Delete, Undo, Move, etc - must come from the same actor as the original object's attributedTo or activity's actor. I've tried to implement those checks carefully in BF, background in #566. I've also tried to avoid any same-HTTP-origin heuristics, even though they are somewhat widespread in the fediverse, because AP itself doesn't require them and because they make bridge services like Bridgy Fed more difficult to run.
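
As a sketch of what an ownership check along those lines might look like in Python: this is a simplified illustration, not Bridgy Fed's actual code. The names `is_authorized`, `owners`, `as_id`, and `OWNERSHIP_CHECKED` are hypothetical, and `authenticated_actor` is assumed to be the actor id already verified elsewhere (eg from an HTTP Signature).

```python
# Activity types that create or modify an object (list taken from the comment
# above; the "etc" there means a real implementation may need more).
OWNERSHIP_CHECKED = {"Create", "Update", "Delete", "Undo", "Move"}


def as_id(value):
    """Return the id of a value that is either a string id or an embedded object."""
    if isinstance(value, dict):
        return value.get("id")
    return value


def owners(obj):
    """Ids that own an object: its attributedTo, or actor if it is an activity."""
    field = obj.get("attributedTo") or obj.get("actor") or []
    values = field if isinstance(field, list) else [field]
    return {as_id(v) for v in values if as_id(v)}


def is_authorized(activity, authenticated_actor, stored_obj=None):
    """Check actor authentication, then ownership for modifying activities."""
    actor = as_id(activity.get("actor"))
    if not actor or actor != authenticated_actor:
        return False
    if activity.get("type") in OWNERSHIP_CHECKED:
        # Prefer our stored copy; fall back to the embedded object (eg Create).
        target = stored_obj if stored_obj is not None else activity.get("object")
        if not isinstance(target, dict):
            return False  # cannot verify ownership without the object
        return actor in owners(target)
    return True


alice = "https://example.com/users/alice"      # example ids only
mallory = "https://example.net/users/mallory"
note = {"id": "https://example.com/notes/1", "type": "Note", "attributedTo": alice}

update = {"type": "Update", "actor": alice, "object": {"id": note["id"]}}
forged = {"type": "Delete", "actor": mallory, "object": note["id"]}
print(is_authorized(update, alice, stored_obj=note))
print(is_authorized(forged, mallory, stored_obj=note))
```

Note that nothing here compares hosts: the decision rests only on who signed the activity and who owns the object.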

(Sorry for the wall of text! I know you probably know all this as well or better than me. 😁)

snarfed commented 6 days ago

Oh, btw, maybe a more concrete answer here is that BF should generally always use its real ids, eg on bsky.brid.gy for Bluesky users, in id fields in activities that it sends and objects it serves. fed.brid.gy/r/ URLs should hopefully only be in url, content, summary, and not much else beyond those, so AP implementations shouldn't need to fetch them. If you see fed.brid.gy/r/ in ids anywhere, please do let me know!

silverpill commented 6 days ago

> Otherwise, the naming overlap of browsers' same-origin policy and AP's "same origin" phrase is obviously unfortunate. Afaik the browser policy doesn't really apply to AP, and AP itself only barely mentions host/domain origins in passing, for Update and Delete, and even then only as "at minimum...may..."

The same-origin policy does apply to AP; the question is to what extent. Objects that are served from an entirely different domain should never be accepted, otherwise the system becomes vulnerable to cache poisoning / impersonation attacks (I wrote a FEP about authentication/authorization that includes this guidance, and I'm currently trying to figure out what to do with subdomains).

The AP spec is incomplete in that regard (as well as the "Primer" page).

> AP implementations also can and should liberally re-fetch objects and activities from their ids. In this case, if you fetch a fed.brid.gy URL via AP and it serves an object with id on bsky.brid.gy, re-fetching it from that bsky.brid.gy URL instead of trusting it is probably a good idea.

Yes, clients can make an additional request after following the redirection chain, but that is not ideal. I think a redirect on the server side would be better.

snarfed commented 5 days ago

> Objects that are served from entirely different domain should never be accepted, otherwise system becomes vulnerable to cache poisoning

Ah, good point! "Doesn't really apply" was maybe a bit of an overstatement; clients definitely shouldn't trust a fetched object if its id is on a different host.

That burden is primarily on clients, and these fed.brid.gy/r/ URLs shouldn't be AP ids anywhere, so hopefully this isn't causing problems in practice. Still though, I agree, we should ideally redirect these instead of serving them on fed.brid.gy. I'll do that!
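
For reference, the server-side decision could be sketched in Python roughly like this. It is only an illustration of the idea discussed in this issue, not the change that landed in Bridgy Fed; `redirect_target` is a hypothetical helper, and the choice between 301 and 302 is left to the handler that calls it.

```python
from urllib.parse import urlsplit


def redirect_target(request_url, obj):
    """Return the URL to redirect to, or None to serve the object in place.

    Redirect when the object's id is an http(s) URL on a different host than
    the one the request came in on.
    """
    obj_id = obj.get("id")
    if not obj_id:
        return None
    id_parts = urlsplit(obj_id)
    if id_parts.scheme not in ("http", "https"):
        return None
    if id_parts.hostname != urlsplit(request_url).hostname:
        return obj_id
    return None


request_url = ("https://fed.brid.gy/r/https://bsky.app/profile/"
               "did:plc:3ljmtyyjqcjee2kpewgsifvb/post/3kvcdqj4vik2f")
obj = {"id": ("https://bsky.brid.gy/convert/ap/at://"
              "did:plc:3ljmtyyjqcjee2kpewgsifvb/app.bsky.feed.post/3kvcdqj4vik2f")}

# A web handler would answer with a 301 or 302 and this URL in Location.
print(redirect_target(request_url, obj))
```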