Cleaning up OEmbed and OpenGraph stuff in the BBCode parser

annando commented 6 years ago

Currently the BBCode parser is doing some checks with OEmbed and OpenGraph of some links. But it doesn't seem to be really straight forward.

I would suggest some cleaning up: OEmbed should only be done with rich content on videos, [bookmark] and [attachment] elements. Every other link should be untouched.

Without active OEmbed, (or when OEmbed hadn't returned rich content) the OpenGraph check should only be done on videos and the [bookmark] element - but not the [attachment] - or any other link. The [attachment] should be parsed with the given data.

MrPetovan commented 6 years ago

Oembed provides more than a HTML snippet, and most of its data overlaps with what's provided by OpenGraph. Since we're trying OEmbed anyway, we might just use whatever metadata we get from the response if any.

Here's my idea:

For every [attachment] if any and for the first [url] if none, fetch the provided URL
If OEmbed is available and doesn't error out, use OEmbed metadata (including html if settings allow)\
Else use OpenGraph metadata if available
Else use the [attachment] attributes

We are doing all of the above, except in a very inorderly fashion. I just discovered that ParseUrl::getSiteinfo() does almost exactly that, including an OEmbed call, except it's also called when OEmbed fails anywhere else 🤯

My suggestions:

use ParseUrl::getSiteinfo() anywhere we currently use OEmbed::fetchURL()
use a single method for attachment HTML output that will use the result from ParseUrl::getSiteinfo()

Why don't you want to stylize [attachment] tags when OEmbed isn't available? It's just one metadata provider among others?

annando commented 6 years ago

[attachment] already contains fetched metadata - so why should we fetch this again? This only wastes bandwith and time. The only exception I see there is when we get rich content. Additionally maybe the author had beautified the transmitted metadata (different preview picture, enhanced link description, ...) so why should we overwrite this?

For [url] I see it differently. If the author had wanted that metadata should be fetched, he (or she) could have used [attachment] or [bookmark]. The only exception I see here is when we receive content from other networks like Diaspora. Since there the authors can't decide which links should get metadata (and which not), we should fetch the metadata for the first link. But this we are already doing in the protocol implementation - this is no BBCode parser thing.

So - to not interfere with the author's intention - we only should fetch metadata in the parser when it is a clear enhancement (like rich data).

MrPetovan commented 6 years ago

I get it, and this seems like a huge departure from what we're doing now!

annando commented 6 years ago

I don't think that users will recognize the difference at all.

MrPetovan commented 6 years ago

I don’t think so either, I’m just amazed at how much the code base has diverged from your simple explanation.

friendica / friendica

Cleaning up OEmbed and OpenGraph stuff in the BBCode parser #4222