superseriousbusiness / gotosocial

Fast, fun, small ActivityPub server.
https://docs.gotosocial.org
GNU Affero General Public License v3.0

[chore] Allow custom emojis to be anonymous objects (without ID/URI) #3384

Open TheOneric opened 1 month ago

TheOneric commented 1 month ago

This concerns interoperability with a planned but not yet merged change in Akkoma (as well as some other setups), so it might be best to first explain why this change is being considered at all. If it’s too long, skip the first paragraph.

Currently *oma does not track remote emoji at all; moreover, even local emoji aren’t tracked in the database but rather in on-disk configuration files which are reloaded as necessary. As a consequence *oma can't provide a meaningful value for e.g. emoji change times (which in theory some software may use for caching purposes) and stubs it out. There’s also no real AP ID or object; emoji can’t be dereferenced on their own. Instead *oma always used to just put the image URL into the id field (blatantly breaking AP requirements, hint hint). In theory it would be possible to provide AP objects for currently loaded local emoji based on their shortcode, with things being stubbed out just like when federated as part of a post. But this doesn't work for remote emoji, and using remote emoji is allowed and appreciated by users, even though the lack of tracking means it’s only possible when the remote emoji already exists in the current context, i.e. adding an emoji reaction to a post which already has the desired remote emoji reaction. Furthermore, Pleroma allows users to use any image as an emoji with a shortcode of their own choosing via the C2S API; it’s not feasible to track and provide AP objects for those.

However, as mentioned before, the current mangling of id blatantly conflicts with AP semantics, which require all existing, non-null IDs to be publicly dereferenceable and to lead to the same object (or, in the case of fragment IDs as used in practice, to something containing this object). Predictably, although it took surprisingly long, there are now cases where using image URLs as IDs broke other software. But actually providing AP objects for everything is not feasible.

The AP spec includes the concept of “anonymous objects”, denoting objects which only exist within their parent context and whose id must exist but be explicitly null (setting them apart from transient objects, which do not have an id field at all). This perfectly matches what custom C2S emoji are and also goes well with how local and remote emoji are (not) organised in *oma atm, making it a natural fit for federation purposes.

Mastodon’s docs about its emoji extension also never mention id at all outside of it occurring in an illustrative example, much less specific requirements stricter than AP’s general ones.

Federating emoji as anonymous objects works well with other *oma instances, since id is never checked at all. It also works well with Mastodon and with Misskey, the latter of which already had code to explicitly handle the case of null-id emoji. (This may indicate that some other software I don’t know about also already federates anonymous-object emoji.) Furthermore, iceshrimp.net has since adopted the use of null-id emoji, but only when using remote emoji, since it properly tracks local emoji and doesn't allow for fully custom, user-controlled emoji.

However, in testing with GtS 0.16.0 all emoji ended up being stripped, instead only displaying as their shortcode text. Since I’m not operating any GtS instance I can’t say what exactly went wrong, though I noticed your DB schema for emoji requires URIs to be NOT NULL, which may be related; I don’t know if just dropping this requirement is sufficient or safe to do without breaking something else. Alternatively, if you don’t want to treat null-id emoji, which may be user-controlled and a one-time occurrence, as emoji at all for DB purposes, it may also be “good enough” to transform them into inline images with an appropriate alt and title text.
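The inline-image fallback mentioned above could look roughly like the following Go sketch. It is only an illustration under stated assumptions: the `anonEmoji` type and the `inlineEmoji` helper are hypothetical names made up here, not GoToSocial code, and the post content is assumed to be already-sanitized HTML in which the shortcode still appears as plain text.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// anonEmoji is a hypothetical, simplified view of an emoji tag that
// arrived without a usable id. It is not a GoToSocial type.
type anonEmoji struct {
	Shortcode string // shortcode as it appears in the post text, e.g. ":blobcat:"
	ImageURL  string // URL of the emoji image taken from the tag
}

// inlineEmoji replaces every occurrence of the shortcode in the given
// HTML content with an <img> element whose alt and title are the shortcode.
func inlineEmoji(content string, e anonEmoji) string {
	img := fmt.Sprintf(
		`<img src="%s" alt="%s" title="%s">`,
		html.EscapeString(e.ImageURL),
		html.EscapeString(e.Shortcode),
		html.EscapeString(e.Shortcode),
	)
	return strings.ReplaceAll(content, e.Shortcode, img)
}

func main() {
	e := anonEmoji{
		Shortcode: ":blobcat:",
		ImageURL:  "https://example.org/emoji/blobcat.png",
	}
	fmt.Println(inlineEmoji("<p>hello :blobcat:</p>", e))
}
```

With this approach nothing about the emoji needs to be stored in an emoji table; the trade-off is that such images would not take part in any emoji-specific handling like caching or moderation.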

I believe anonymous objects are the best option for making *oma emoji AP-compliant and fixing current interop issues, and the path introducing the fewest new breakages for recipients. It would be great if GtS could learn to understand emoji federated as anonymous objects. This might also fix not-yet-reported issues with iceshrimp.net and whatever brought Misskey’s null-id handling about.

tsmethurst commented 1 month ago

Thanks for opening. Could you link to some docs about anonymous objects, if you have them handy? Then we can take a look and see if we can make some changes (though it won't be for any time soon, probably, we've got a lot of other stuff on our plate rn).

tsmethurst commented 1 month ago

Oh nvm, I see it's mentioned here: https://www.w3.org/TR/activitypub/#obj-id

tsmethurst commented 1 month ago

Ah, so in essence this would be similar to how most Hashtags work currently, where the Hashtag type is inside the tag value on a note or whatever?

TheOneric commented 1 month ago

Yep, base AP spec is the right place.

In practice handling anonymous emoji will likely be similar to hashtags which commonly don’t have an id field at all, yeah. Both cannot (necessarily) be fetched on their own and all relevant data is directly accessible in the embedded form.

Nitpicking technical details, there’s a difference: hashtags are a subtype of Link, and Links are not Objects. Thus they are not required to ever provide an id field and commonly don't. Emoji are a subtype of Object and thus must follow the requirements from section 3.1 linked above. “Transient” doesn't seem right for embedded emoji, but “part of the post” seems fitting; thus they’ll have an id field, but explicitly set to null.
But when receiving them this difference only matters if accessing non-existent fields doesn’t automatically return a default null value anyway.
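To make the distinction concrete, here is a small Go sketch that tells the three cases apart (id missing, id explicitly null, id set). The JSON payloads are made-up illustrations rather than captured federation traffic, and `idState` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// idState reports whether the "id" member of a JSON object is absent
// ("missing"), explicitly null ("null"), or carries any other value ("set").
func idState(raw []byte) (string, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(raw, &fields); err != nil {
		return "", err
	}
	v, ok := fields["id"]
	switch {
	case !ok:
		return "missing", nil
	case bytes.Equal(bytes.TrimSpace(v), []byte("null")):
		return "null", nil
	default:
		return "set", nil
	}
}

func main() {
	// Illustrative payloads only: a hashtag without id, an emoji with null id.
	hashtag := []byte(`{"type":"Hashtag","href":"https://example.org/tags/cats","name":"#cats"}`)
	emoji := []byte(`{"type":"Emoji","id":null,"name":":blobcat:"}`)

	for _, doc := range [][]byte{hashtag, emoji} {
		state, err := idState(doc)
		if err != nil {
			panic(err)
		}
		fmt.Println(state)
	}
}
```

A receiver that decodes into a struct with a pointer or optional field never observes this difference, which is the situation described above.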

TheOneric commented 1 week ago

> “part of the post” seems fitting; thus they’ll have an id field but explicitly set to null.

As it turns out, while the ActivityPub spec requires an explicit null id for anonymous objects, the JSON-LD requirements for @id forbid an explicit null, making the two specs incompatible. This is being discussed in https://github.com/w3c/activitypub/issues/476, and atm the most likely outcome appears to be that the AP spec gets amended to fully omit id for anonymous objects. Akkoma’s implementation will be changed to match the outcome of this.

Also a correction: IceShrimp.NET actually already fully omits the id field instead of federating an explicit null (its framework removes null fields before publishing).

Considering the old wording was around for a long time, I’d recommend treating both explicit null and missing id fields the same.
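On the receiving side, treating both forms the same can be fairly cheap. The Go sketch below assumes a trimmed-down, hypothetical `emojiTag` struct (not GoToSocial's actual types) and relies on the fact that `encoding/json` leaves a pointer field nil both when the member is absent and when it is null.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emojiTag is a hypothetical, trimmed-down shape of an Emoji entry in a
// post's tag array; only the fields needed for this sketch are included.
type emojiTag struct {
	Type string  `json:"type"`
	ID   *string `json:"id"`
	Name string  `json:"name"`
}

// isAnonymous reports whether the tag has no usable id. The pointer is nil
// for a missing "id" member as well as for an explicit JSON null.
func (t emojiTag) isAnonymous() bool {
	return t.ID == nil
}

// parseEmojiTag decodes a single tag entry from its JSON form.
func parseEmojiTag(raw string) (emojiTag, error) {
	var t emojiTag
	err := json.Unmarshal([]byte(raw), &t)
	return t, err
}

func main() {
	// Illustrative payloads: explicit null, omitted id, and a regular id.
	inputs := []string{
		`{"type":"Emoji","id":null,"name":":blobcat:"}`,
		`{"type":"Emoji","name":":blobcat:"}`,
		`{"type":"Emoji","id":"https://example.org/emoji/1","name":":blobcat:"}`,
	}
	for _, in := range inputs {
		t, err := parseEmojiTag(in)
		if err != nil {
			panic(err)
		}
		fmt.Println(t.Name, "anonymous:", t.isAnonymous())
	}
}
```

How an anonymous emoji is then stored or displayed is a separate decision; this sketch only covers recognising it regardless of which wording of the spec the sender followed.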