mastodon / mastodon

Your self-hosted, globally interconnected microblogging community
https://joinmastodon.org
GNU Affero General Public License v3.0

Fetch link metadata on sender instance rather than receiver instances #12738

Open BenLubar opened 4 years ago

BenLubar commented 4 years ago

Pitch

Rather than fetching every link that comes in with a post, possibly causing a DDoS on a small target webserver, fetch the metadata on the sender side and attach it to the status object in some way.

Motivation

Even with the random delay, there's still a significant amount of load that can come from requests for page metadata for posts from users with lots of followers from different instances. Misleading metadata is a non-issue as it's already possible to create a redirect page with misleading information on it for Mastodon to consume.

trwnh commented 4 years ago

This would require federating preview information. Also it doesn't account for cases where the preview information changes -- and since Updates on statuses are discarded, this means that previews are effectively frozen at time of receipt.

Gargron commented 4 years ago

> This would require federating preview information

To elaborate, it's not just a question of trust, but a question of needs as well. Do Mastodon, Pleroma, and Misskey agree on a specific interpretation of how to preview a link? Does that cover all possible use cases? There's Open Graph, oEmbed, Twitter-specific properties, and who knows what else.

nightpool commented 4 years ago

i dunno, we could at least federate the properties mastodon wants as a Link attachment and then ask if anyone wants any additional properties. I see the og/oembed/twitter stuff as an implementation detail—I'm imagining high level properties like "image", "title" and "summary", not sending over all the low-level stuff we use to construct that.

i'm not too worried about trust—sites can make their previews whatever they want anyway.
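To make the idea of high-level properties concrete, here is a minimal sketch of what a sending instance might do after fetching a page. It is not Mastodon's implementation: the function name `build_link_attachment` and the flat object shape are hypothetical, the field names "title", "summary" and "image" are simply the ones mentioned above, and only Open Graph tags are read (oEmbed and Twitter-specific properties are ignored). It assumes the HTML has already been fetched.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:*"> values and the <title> text."""

    def __init__(self):
        super().__init__()
        self.og = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if tag == "meta" and prop.startswith("og:"):
            # Strip the "og:" prefix, e.g. "og:title" -> "title".
            self.og[prop[3:]] = attrs.get("content") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_link_attachment(href, html):
    """Reduce low-level page metadata to a few high-level properties.

    The returned shape is an assumption for illustration, not an agreed
    federation format.
    """
    parser = OpenGraphParser()
    parser.feed(html)

    attachment = {
        "type": "Link",
        "href": href,
        # Fall back to the <title> element when og:title is absent.
        "title": parser.og.get("title") or parser.title.strip(),
    }
    if parser.og.get("description"):
        attachment["summary"] = parser.og["description"]
    if parser.og.get("image"):
        attachment["image"] = parser.og["image"]
    return attachment
```

The point of the sketch is that receivers would only ever see the reduced properties, so how the sender derived them stays an implementation detail.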

trwnh commented 4 years ago

We could attach a Link with preview maybe? Per another example from some other discussion:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "id": "https://trwnh.com/objects/187639284398",
  "type": "Article",
  "content": "<p>Imagine I've written an entire article here.</p><p>You can find more at <a href=\"https://trwnh.com\" rel=\"me\">my site</a> if you're interested.</p>",
  "tag": [
    {
      "type": "Link",
      "name": "my site",
      "href": "https://trwnh.com",
      "rel": ["me"],
      "preview": {
        "type": "Page",
        "name": "$~trwnh",
        "summary": "Abdullah Tarawneh is a photographer, designer, and all-around creative. They are currently operating in Birmingham, AL. This is their personal landing page.",
        "icon": "https://trwnh.com/trwnh-192px.png"
      }
    }
  ]
}

raboof commented 2 years ago

I'm also running into this. I'm not too familiar with fediverse internals, but the proposal by @trwnh looks quite attractive to my untrained eye. (there's some more discussion in #4486 but I think this thread so far already sums it up fairly nicely :) )
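For illustration, a receiving instance could read the Link-with-preview shape from the example above roughly as follows, and fetch the page itself only when no preview was federated. This is a minimal sketch under stated assumptions, not Mastodon's implementation: the function name `extract_preview_cards` and the card field names ("url", "title", "description", "image") are hypothetical.

```python
def extract_preview_cards(status):
    """Return preview cards for Link tags that carry a federated preview.

    Assumes the shape from the example in this thread: a Link object in
    "tag" with an "href" and a nested "preview" object.
    """
    tags = status.get("tag", [])
    if isinstance(tags, dict):
        # A single tag may be serialized as an object instead of a list.
        tags = [tags]

    cards = []
    for tag in tags:
        if not isinstance(tag, dict) or tag.get("type") != "Link":
            continue
        href = tag.get("href")
        preview = tag.get("preview")
        if not href or not isinstance(preview, dict):
            # No usable preview; the caller would fall back to fetching.
            continue
        cards.append({
            "url": href,
            "title": preview.get("name", ""),
            "description": preview.get("summary", ""),
            "image": preview.get("icon"),
        })
    return cards


# Trimmed version of the example status from this thread.
status = {
    "type": "Article",
    "tag": [
        {
            "type": "Link",
            "name": "my site",
            "href": "https://trwnh.com",
            "rel": ["me"],
            "preview": {
                "type": "Page",
                "name": "$~trwnh",
                "icon": "https://trwnh.com/trwnh-192px.png",
            },
        }
    ],
}

for card in extract_preview_cards(status):
    print(card["url"], card["title"])
```

Links without a federated preview are skipped here, so an instance could keep its existing fetch path as a fallback for posts from servers that do not send previews.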