misskey-dev / misskey

🌎 An interplanetary microblogging platform 🚀
https://misskey-hub.net/
GNU Affero General Public License v3.0

Wrong account names of users of Mastodon instances where WEB_DOMAIN != LOCAL_DOMAIN #7922

Open orithena opened 2 years ago

orithena commented 2 years ago

💡 Summary

This is from the outside perspective of a user of a Mastodon server that has its LOCAL_DOMAIN (the "host" part of the account name) set differently from the WEB_DOMAIN (where the web interface and API of that instance resides).

When interacting with such a Mastodon instance, Misskey seems to take the WEB_DOMAIN as part of the account name, while the LOCAL_DOMAIN would be correct.

To explain the issue, I'll use the Mastodon instance I am on, with the following data:

LOCAL_DOMAIN: kif.rocks
WEB_DOMAIN: toot.kif.rocks

🙂 Expected Behavior (Example 1)

If someone on a Misskey instance replies to one of my posts, the reply is directed to @username@kif.rocks.

☹️ Actual Behavior (Example 1)

If someone on a Misskey instance replies to one of my posts, the reply is directed to @username@toot.kif.rocks.

🙂 Expected Behavior (Example 2)

If I view the thread from Example 1 in the Misskey web interface (not logged in), I see my own post marked as written by @username@kif.rocks.

☚ī¸ Actual Behavior (Example 2)

If I view the thread from Example 1 in the Misskey web interface (not logged in), I see my own post marked as written by @username@toot.kif.rocks. (Tested with the Misskey instance mk.absturztau.be -- not sure about the version, but it currently loads "app.12.94.1.js" in the html head section.)

📝 Steps to Reproduce

  1. Be a Misskey user
  2. Interact with a Mastodon user on an instance where WEB_DOMAIN does not equal the LOCAL_DOMAIN.
  3. Pay attention to the Mastodon username you interact with.

One example of such a Mastodon instance would be kif.rocks, with its web interface and API at toot.kif.rocks. You may interact with me (I speak English and German); trying to reply to one of my posts might suffice, but in case you need to follow me: @anathem@kif.rocks.

Additional Info

mei23 commented 2 years ago

This behavior is by design. There seems to be a problem with the design of this feature in Mastodon and it cannot be fixed.

Since the domain that appears in ActivityPub Object is toot.kif.rocks (WEB_DOMAIN), it seems unnatural to consider kif.rocks (LOCAL_DOMAIN) as Primary.

Mastodon may be hacking, but I don't want to hack or change the schema for features designed before Mastodon introduced ActivityPub.

mikekasprzak commented 2 years ago

I'll admit WEB_DOMAIN and LOCAL_DOMAIN aren't the best names. A different perspective would be to look at this like email. My email address might be mike@somedomain.com, but the actual mailserver my mail client uses is at mail.somedomain.com. Sending or receiving mail from mike@mail.somedomain.com would be undesirable.

If the issue is that ActivityPub lacks a means of name resolution, Matrix handles this in an elegant way using delegation.

https://github.com/matrix-org/synapse/blob/master/docs/delegate.md

I place a text file on my regular webserver (https://somedomain.com/.well-known/matrix/server). Matrix clients or federated servers can check that URL to find out the true location of the Matrix server (chat.somedomain.com).

Johann150 commented 2 years ago

I can only repeat what mei23 already said, maybe put a different way:

The documentation linked in #6724 shows that the way this works for Mastodon is that the webfinger acct URLs are under a different domain than the actor IDs. However, Misskey does not keep track of the webfinger URLs, only of the ActivityPub actor URLs, which use WEB_DOMAIN. Therefore, Misskey assumes the host of the user is WEB_DOMAIN.

Let's use a random user from toot.kif.rocks as an example to explain how Mastodon's behaviour triggers this issue. Let's say we want to discover @anathem@kif.rocks. Misskey does the following steps that I'll trace with cURL.

  1. Use WebFinger to discover from what URL the ActivityPub representation can be fetched.

     ```
     $ curl -s 'https://kif.rocks/.well-known/webfinger?resource=acct:anathem@kif.rocks' | python -m json.tool
     {
         "subject": "acct:anathem@kif.rocks",
         "aliases": [
             "https://toot.kif.rocks/@anathem",
             "https://toot.kif.rocks/users/anathem"
         ],
         "links": [
             {
                 "rel": "http://webfinger.net/rel/profile-page",
                 "type": "text/html",
                 "href": "https://toot.kif.rocks/@anathem"
             },
             {
                 "rel": "self",
                 "type": "application/activity+json",
                 "href": "https://toot.kif.rocks/users/anathem"
             },
             {
                 "rel": "http://ostatus.org/schema/1.0/subscribe",
                 "template": "https://toot.kif.rocks/authorize_interaction?uri={uri}"
             }
         ]
     }
     ```

  2. From this we learn that the ActivityPub representation (`application/activity+json`) is at `https://toot.kif.rocks/users/anathem`.

     ```
     $ curl -s -H "Accept: application/activity+json" https://toot.kif.rocks/users/anathem | python -m json.tool
     {
         // ...
         "id": "https://toot.kif.rocks/users/anathem",
         "type": "Person",
         // ...
         "url": "https://toot.kif.rocks/@anathem",
         "publicKey": {
             "id": "https://toot.kif.rocks/users/anathem#main-key",
             "owner": "https://toot.kif.rocks/users/anathem",
             // ...
         },
         // ...
     }
     ```

I've truncated the result a bit, but nothing in this ActivityPub representation suggests that the user's host might be `kif.rocks`. All URLs in it start with `toot.kif.rocks`. Since this is the ActivityPub representation and Misskey uses ActivityPub, it uses the host presented here.
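
To make the disagreement concrete, here is a minimal Python sketch that uses abbreviated copies of the two responses from the trace above as literal data. The function names are hypothetical and this is not Misskey's actual code; it only illustrates that the host taken from the actor `id` and the host taken from the WebFinger `subject` differ.

```python
from urllib.parse import urlparse

# Abbreviated copies of the two responses shown in the trace above.
webfinger_jrd = {
    "subject": "acct:anathem@kif.rocks",
    "links": [
        {
            "rel": "self",
            "type": "application/activity+json",
            "href": "https://toot.kif.rocks/users/anathem",
        },
    ],
}
actor = {
    "id": "https://toot.kif.rocks/users/anathem",
    "type": "Person",
    "url": "https://toot.kif.rocks/@anathem",
}


def actor_url_from_jrd(jrd: dict) -> str:
    """Return the href of the rel=self ActivityPub link in a WebFinger JRD."""
    for link in jrd.get("links", []):
        if link.get("rel") == "self" and link.get("type") == "application/activity+json":
            return link["href"]
    raise ValueError("JRD has no ActivityPub self link")


def host_from_actor(actor_doc: dict) -> str:
    """Host as derived from the actor id (the behaviour described for Misskey)."""
    return urlparse(actor_doc["id"]).hostname


def host_from_subject(jrd: dict) -> str:
    """Host as derived from the WebFinger subject (what the reporter expects)."""
    return jrd["subject"].rpartition("@")[2]
```

With this data, `host_from_actor(actor)` yields `toot.kif.rocks` while `host_from_subject(webfinger_jrd)` yields `kif.rocks`.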

I support mei23's opinion that this is not a Misskey issue and should not be worked on.


Since Matrix is not part of ActivityPub, I also don't understand how that is relevant here. Misskey is also not in a position to single-handedly change the way ActivityPub works, so this is not the right place to discuss how identity discovery should work.

mikekasprzak commented 2 years ago

Thank you, that helped. Webfinger was the part I missed. I mentioned Matrix only because I was aware of how they handled aliasing domains (which I could solve using a static file). I hadn't realized webfinger was a general service that could provide the aliasing functionality for ActivityPub.

ineffyble commented 1 year ago

The ability to host ActivityPub actors for a domain at a subdomain is a fairly useful and important one, and will become more so in future I suspect. If Misskey doesn't want to adopt the approach Mastodon has used, finding an alternative (maybe there needs to be a FEP?) that can be adopted broadly is a worthwhile endeavour.

trwnh commented 1 year ago

> nothing in this ActivityPub representation suggests that the user's host might be kif.rocks. All URLs in it start with toot.kif.rocks. Since this is the ActivityPub representation and Misskey uses ActivityPub, it uses the host presented here.

actually, there is something that suggests kif.rocks instead of toot.kif.rocks: the WebFinger subject does! let's try a slightly different request:

GET https://toot.kif.rocks/.well-known/webfinger?resource=acct:anathem@toot.kif.rocks HTTP/1.1

{
  "subject": "acct:anathem@kif.rocks",
  "aliases": [
    "https://toot.kif.rocks/@anathem",
    "https://toot.kif.rocks/users/anathem"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://toot.kif.rocks/@anathem"
    },
    {
      "rel": "self",
      "type": "application/activity+json",
      "href": "https://toot.kif.rocks/users/anathem"
    },
    {
      "rel": "http://ostatus.org/schema/1.0/subscribe",
      "template": "https://toot.kif.rocks/authorize_interaction?uri={uri}"
    }
  ]
}

note that we asked toot.kif.rocks and it instead returned a subject from kif.rocks.

so the expected flow is something like this on misskey's side:

  1. take username and host and construct an acct uri like acct:username@host (so take anathem and toot.kif.rocks to end up with acct:anathem@toot.kif.rocks)
  2. make a webfinger request to host for that acct uri you just constructed (so take acct:anathem@toot.kif.rocks and do lookup against toot.kif.rocks)
  3. get the subject and use that as canonical uri (so take subject = acct:anathem@kif.rocks and store the acct anathem@kif.rocks for display purposes)
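
The three steps above can be sketched in Python roughly as follows. This is an illustration under assumptions, not Misskey's implementation: `webfinger_url`, `canonical_acct`, and the injected `fetch_jrd` callable (which would perform the HTTP GET and return the parsed JRD) are hypothetical names. The sketch takes the returned subject at face value and does no verification of it.

```python
from typing import Callable
from urllib.parse import quote


def webfinger_url(host: str, resource: str) -> str:
    """Build the WebFinger query URL for a resource, percent-encoding the value."""
    return f"https://{host}/.well-known/webfinger?resource={quote(resource, safe='')}"


def canonical_acct(username: str, host: str, fetch_jrd: Callable[[str], dict]) -> str:
    """Return the acct (without the 'acct:' prefix) that the server reports as subject."""
    # Step 1: construct the acct URI from the username and host we already know.
    resource = f"acct:{username}@{host}"
    # Step 2: ask that host's WebFinger endpoint about the URI.
    jrd = fetch_jrd(webfinger_url(host, resource))
    # Step 3: use the returned subject as canonical; if it is missing or is not
    # an acct URI, fall back to the URI constructed in step 1.
    subject = jrd.get("subject")
    if not isinstance(subject, str) or not subject.startswith("acct:"):
        subject = resource
    return subject[len("acct:"):]
```

Called with `anathem`, `toot.kif.rocks`, and a fetcher that returns the JRD shown above, this returns `anathem@kif.rocks`.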

if i understood correctly, it seems misskey does not want to store this acct uri at all, and instead misskey assumes that it should always be equivalent to {username}@{host}... which it is not, so this assumption is technically wrong. there is no "hack" involved -- the canonical uri is considered to be the webfinger JRD subject as defined in RFC 7033. i suppose technically it would be better / more correct for mastodon to include acct:username@WEB_DOMAIN within the aliases array.

i think it would not be too weird for misskey to store acct on the User entity, and then use acct for rendering mentions in text and for auto-suggesting during text composition.
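
As a rough illustration of that idea, a user record could carry the canonical acct next to the existing fields. The `RemoteUser` class and its field names below are hypothetical and are not Misskey's actual schema; the point is only that rendering prefers `acct` and falls back to `username@host` when no acct is stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteUser:
    uri: str                    # ActivityPub actor id
    username: str
    host: str                   # host taken from the actor id
    acct: Optional[str] = None  # canonical WebFinger acct, if one is known

    def mention(self) -> str:
        """Text used when rendering or auto-suggesting a mention of this user."""
        return "@" + (self.acct or f"{self.username}@{self.host}")


user = RemoteUser(
    uri="https://toot.kif.rocks/users/anathem",
    username="anathem",
    host="toot.kif.rocks",
    acct="anathem@kif.rocks",
)
```

Here `user.mention()` gives `@anathem@kif.rocks`, while the same record without `acct` would give `@anathem@toot.kif.rocks`.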

hikari-no-yume commented 1 year ago

Hi, I have the same issue, but I'm using GoToSocial instead of Mastodon.

It looks like GoToSocial achieves this with exactly the same WebFinger trick that Mastodon uses: https://docs.gotosocial.org/en/latest/installation_guide/advanced/#can-i-host-my-instance-at-fediexampleorg-but-have-just-exampleorg-in-my-username

So it's not an exclusively Mastodon issue.

saschanaz commented 1 year ago

I think the mismatch of the domains is just confusing. From the example in https://github.com/misskey-dev/misskey/issues/7922#issuecomment-1350544671, let's say someone sees @foo@kif.rocks and thinks the server might be interesting. They type the domain in the browser (instead of clicking "View on the other instance") to try signing up, and what they see is some unexpectedly different page which doesn't directly show any signup button. They feel frustrated and give up.

So I don't think Misskey should do whatever Mastodon currently does, to prevent this kind of confusion.

hikari-no-yume commented 1 year ago

This issue isn't about whether Misskey should support configuring an instance that way, though, just whether it should respect other instances that do it.

And for what it's worth, most instances that do this have an obvious link to their instance if you visit their site, though with that said, many instances are private and don't need to accept public sign-ups anyway.

Johann150 commented 1 year ago

I want to make explicitly clear that I am no longer part of the Misskey project and the following is my own opinion.

> i think it would not be too weird for misskey to store acct on the User entity

Since Misskey already stores both the username and the host, the solution would more likely be to "just" store the other host in the user.host field in the database.

> 3. get the subject and use that as canonical uri

You forgot to mention (or maybe did not consider) that this requires additional checks/redirects to be made to verify that (in the example) kif.rocks agrees with this use of its authority.

Otherwise I could have...

`https://example.com/.well-known/webfinger?resource=acct:eve@example.com` result in

```json
{
  "subject": "acct:Gargron@mastodon.social",
  "aliases": [
    "https://example.com/@eve",
    "https://example.com/users/eve"
  ],
  "links": [
    {
      "rel": "http://webfinger.net/rel/profile-page",
      "type": "text/html",
      "href": "https://example.com/@eve"
    },
    {
      "rel": "self",
      "type": "application/activity+json",
      "href": "https://example.com/users/eve"
    }
  ]
}
```

and the server could now act on behalf of gargron@mastodon.social, or any arbitrary user on any arbitrary domain for that matter, even nonexistent ones.

The least that I think would have to be checked is that the "original" webfinger as well as the webfinger from the subject's host agree on the subject and href for the ActivityPub link.
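
A minimal Python sketch of that check, under assumptions: `subject_is_confirmed`, `self_href`, and the injected `fetch_jrd(host, resource)` callable (which would query the given host's WebFinger endpoint and return the parsed JRD) are hypothetical names, and a real implementation would likely need more care around redirects and error handling.

```python
from typing import Callable, Optional


def self_href(jrd: dict) -> Optional[str]:
    """Return the href of the rel=self ActivityPub link, or None if absent."""
    for link in jrd.get("links", []):
        if link.get("rel") == "self" and link.get("type") == "application/activity+json":
            return link.get("href")
    return None


def subject_is_confirmed(original_jrd: dict, fetch_jrd: Callable[[str, str], dict]) -> bool:
    """Check that the host named in the subject agrees with the original JRD."""
    subject = original_jrd.get("subject", "")
    if not subject.startswith("acct:") or "@" not in subject:
        return False
    subject_host = subject.rpartition("@")[2]
    try:
        # Ask the host that the subject claims to belong to about the same subject.
        confirming_jrd = fetch_jrd(subject_host, subject)
    except Exception:
        # Any lookup failure is treated as "not confirmed" in this sketch.
        return False
    original_href = self_href(original_jrd)
    return (
        original_href is not None
        and confirming_jrd.get("subject") == subject
        and self_href(confirming_jrd) == original_href
    )
```

With the spoofed response above, the lookup against mastodon.social would presumably point to a different actor URL than `https://example.com/users/eve`, so the check fails and the claimed subject is rejected.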

Or are you proposing that different users can have the same acct?

Further non-technical rant:

> there is no "hack" involved

In my opinion, webfinger, and even more this use of the `subject` field *is* the hack. I don't understand why it is necessary that a user should have a different domain name displayed than they are hosted on, since the web domain will have to exist anyway. Especially since there are several fedi servers that have names like `social.acme.example` and they apparently haven't died yet. As such I think this configuration option should not even exist in the first place and I doubt whether it should be accommodated. But of course there is the elephant in the room and as soon as they do something, it is considered correct.

trwnh commented 1 year ago

re: the opinion

> the solution would more likely be to "just" store the other host in the user.host field in the database.

this would be more incorrect, no? the host is still toot.kif.rocks; it's just that the address is canonical to kif.rocks. although i suppose for the purposes of webfinger, this might not matter and you might not care what the activitypub host is, as long as you can get to it via webfinger. but i'd be wary of depending on webfinger, so i think it still makes sense to store webfinger acct/subject separately from the host.

> The least that I think would have to be checked is that the "original" webfinger as well as the webfinger from the subject's host agree on the subject and href for the ActivityPub link.

sure

> Or are you proposing that different users can have the same acct?

this would be interesting but no. a given webfinger server should have only one canonical JRD per unique URI, so querying for the acct resource should return the canonical JRD for that resource. however, note that you may be querying for an alias. this is up to the webfinger server to respond with what it considers to be the authoritative response.

re: the rant

> In my opinion, webfinger, and even more this use of the subject field is the hack. I don't understand why it is necessary that a user should have a different domain name displayed than they are hosted on, since the web domain will have to exist anyway.

webfinger is only useful insofar as you care about having some user@domain representation and being able to find the activitypub id given only that. it's not a hack, it's just how webfinger works. it's not even weird; generally, resources have canonical URIs, that's normal. also, for prior art you can consider mail servers -- do you want to force everyone to strictly use @mail.domain.tld instead of the bare @domain.tld, just because the mailserver is running on a different host and has a different hostname? i'd think not. ultimately the way that this problem is solved in the email world is with MX records, and in xmpp land they use SRV records, but the fediverse has (for better or worse) decided to forego DNS and settle on webfinger instead (since, y'know, web.) i'm not entirely sure it would be better to use DNS in this regard, since the webfinger rfc explicitly supports redirects.

per https://www.rfc-editor.org/rfc/rfc7033#section-7 :

By way of example, a domain owner might control most aspects of their domain but use a third-party hosting service for email. In the case of email, mail exchange (MX) records identify mail servers for a domain. An MX record points to the mail server to which mail for the domain should be delivered. To the sending server, it does not matter whether those MX records point to a server in the destination domain or a different domain. [...] Just as a domain owner is required to insert MX records into DNS to allow for hosted email services, the domain owner is required to redirect HTTP queries to its domain to allow for hosted WebFinger services.

basically what is happening here is that toot.kif.rocks is saying "hey you asked for this resource, here's what i've got on it, this is the canonical subject and its links/properties/etc" and you can then check that the canonical subject leads back to the same resource. so it's kind of like a reverse redirect. like, you load a web page or document via some redirect or proxy, and there's a link tag with rel=canonical on it. do you trust it? you'd probably end up doing the same verification. you can also apply the same logic to how rel-me works, and establishing a two-way backlink. it's not unusual.

Johann150 commented 1 year ago

> > the solution would more likely be to "just" store the other host in the user.host field in the database.
>
> this would be more incorrect, no? the host is still toot.kif.rocks; it's just that the address is canonical to kif.rocks.

The user.uri, user.inbox and user.sharedInbox fields continue to contain full canonical IRIs that would still point to the id, inbox or sharedInbox || endpoints.sharedInbox from the ActivityPub representation respectively.

> resources have canonical URIs, that's normal.

sure, the part that i have issue with though is that the ActivityPub canonical IRI (i.e. id / @id) is "incompatible" with the webfinger IRI (does not have the same authority section/hostname). If you want to be "at kif.rocks" IMO your canonical id IRIs should use that authority. I'm wrong anyway, no need to discuss this further. (That's why I put it in a collapsible in the first place.)

tesaguri commented 1 month ago

The reverse discovery process of the (canonical) WebFinger username from an ActivityPub actor as practiced by Mastodon et al. has now been formalized as part of the ActivityPub and WebFinger community group report (shout-out to trwnh for authoring it, by the way!), so I think the convention is now more official-ish than before.