friendica / friendica

Friendica Communications Platform
https://friendi.ca
GNU Affero General Public License v3.0
1.45k stars 341 forks source link

Cleaning up OEmbed and OpenGraph stuff in the BBCode parser #4222

Open annando opened 6 years ago

annando commented 6 years ago

Currently the BBCode parser is doing some checks with OEmbed and OpenGraph of some links. But it doesn't seem to be really straight forward.

I would suggest some cleaning up: OEmbed should only be done with rich content on videos, [bookmark] and [attachment] elements. Every other link should be untouched.

Without active OEmbed, (or when OEmbed hadn't returned rich content) the OpenGraph check should only be done on videos and the [bookmark] element - but not the [attachment] - or any other link. The [attachment] should be parsed with the given data.

MrPetovan commented 6 years ago

Oembed provides more than a HTML snippet, and most of its data overlaps with what's provided by OpenGraph. Since we're trying OEmbed anyway, we might just use whatever metadata we get from the response if any.

Here's my idea:

We are doing all of the above, except in a very inorderly fashion. I just discovered that ParseUrl::getSiteinfo() does almost exactly that, including an OEmbed call, except it's also called when OEmbed fails anywhere else 🤯

My suggestions:

Why don't you want to stylize [attachment] tags when OEmbed isn't available? It's just one metadata provider among others?

annando commented 6 years ago

[attachment] already contains fetched metadata - so why should we fetch this again? This only wastes bandwith and time. The only exception I see there is when we get rich content. Additionally maybe the author had beautified the transmitted metadata (different preview picture, enhanced link description, ...) so why should we overwrite this?

For [url] I see it differently. If the author had wanted that metadata should be fetched, he (or she) could have used [attachment] or [bookmark]. The only exception I see here is when we receive content from other networks like Diaspora. Since there the authors can't decide which links should get metadata (and which not), we should fetch the metadata for the first link. But this we are already doing in the protocol implementation - this is no BBCode parser thing.

So - to not interfere with the author's intention - we only should fetch metadata in the parser when it is a clear enhancement (like rich data).

MrPetovan commented 6 years ago

I get it, and this seems like a huge departure from what we're doing now!

annando commented 6 years ago

I don't think that users will recognize the difference at all.

MrPetovan commented 6 years ago

I don’t think so either, I’m just amazed at how much the code base has diverged from your simple explanation.