Automattic / jetpack

Security, performance, marketing, and design tools — Jetpack is made by WordPress experts to make WP sites safer and faster, and help you grow your traffic.
https://jetpack.com/

Social Previews: Improve the heuristic for determining preview content #16876

Open marekhrabe opened 4 years ago

marekhrabe commented 4 years ago

Calypso's SEO preview uses several more advanced ways to gather possible preview content from the post. With the port to Jetpack under the name Social Previews, we started with a basic implementation that uses the title, the excerpt (perex), the post content, and the featured image.

We should see how we can improve this. To be sure the previews are as close as possible to the real outcome, we need to check how WP, together with our plugins, generates the meta tags that external services use as the source for their previews.

It's worth noting these meta tags could be generated and altered by different plugins (I imagine Jetpack and Yoast could both touch this).

I imagine a few ways to do this:

jeherve commented 4 years ago

We can see what Jetpack does under the hood to generate meta tags (or is it just a pure WP feature?) and match it perfectly, since our previews live in Jetpack too.

Here is where most of the logic lives:

That said, and as you mentioned, those tags can change a lot depending on your setup:

So that second approach may be better.

marekhrabe commented 4 years ago

Another thing that came up in testing: when we fall back to generating the preview text from the post content (which happens when the excerpt is missing), we should look for the More block and cut the text there. Addressed in: #16889

Jetpack's fallback description is "Visit the post for more."; let's implement it in the previews as well.

https://github.com/Automattic/jetpack/blob/f9b5cdb92ffc1b8753db7f99b1697090bd886aa4/functions.opengraph.php#L508
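A rough sketch of the fallback chain described above (excerpt first, then post content cut at the More tag, then the fixed fallback string). This is not Jetpack's actual implementation: `previewDescription`, `stripMarkup`, and the `MORE_TAG` pattern are hypothetical names for illustration, and the sketch assumes the More block is serialized as a `<!--more-->` comment in the post content.

```typescript
// Fallback string taken from this thread; everything else here is an assumption.
const FALLBACK_DESCRIPTION = "Visit the post for more.";

// Assumed marker: the <!--more--> comment, optionally with custom link text.
const MORE_TAG = /<!--\s*more\b[^>]*?-->/i;

// Crude markup removal for illustration; a real implementation would be more careful.
function stripMarkup(html: string): string {
  return html
    .replace(/<!--[\s\S]*?-->/g, " ") // drop HTML comments, including block delimiters
    .replace(/<[^>]*>/g, " ") // drop tags
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
}

function previewDescription(excerpt: string, content: string): string {
  // 1. Prefer the excerpt when there is one.
  const excerptText = stripMarkup(excerpt);
  if (excerptText !== "") {
    return excerptText;
  }

  // 2. Otherwise use the post content, cut at the More tag if present.
  const match = MORE_TAG.exec(content);
  const beforeMore = match ? content.slice(0, match.index) : content;
  const contentText = stripMarkup(beforeMore);

  // 3. Fall back to the fixed string when nothing is left.
  return contentText !== "" ? contentText : FALLBACK_DESCRIPTION;
}
```

For example, `previewDescription("", "<p>Intro</p><!--more--><p>Rest</p>")` would yield only the text before the More tag. Length truncation is left out on purpose, since the exact limits differ per service.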

marekhrabe commented 4 years ago

Any idea how we could get just the meta tags, @jeherve? We can't really add a new API endpoint, because plugins hook into normal WP actions like wp_head and expect to run in the global context of handling a singular post, for example.

Is there a way we could add a param when loading a post (requiring a logged-in user with the edit post capability) that only runs and returns wp_head and nothing else? That would likely give us all we need, and we would parse it from there.

Alternatively, I guess we could request the full post HTML and parse it from there. It's just a lot of extra data on the wire, and I'm not sure we could request the full page using XHR.
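To make the "parse it from there" step concrete, here is a minimal sketch of pulling the meta tags out of an HTML string, whichever way the markup ends up being fetched. `extractMetaTags` is a hypothetical helper, not an existing Jetpack or WP function. It uses regular expressions to stay self-contained, so it will miss edge cases (for instance a `>` inside an attribute value); in the browser, `DOMParser` would be the more robust choice. Keeping the first occurrence of a duplicated tag is also an assumption.

```typescript
// Hypothetical helper: collects <meta property="..."> / <meta name="..."> pairs.
function extractMetaTags(html: string): Record<string, string> {
  const tags: Record<string, string> = {};
  const metaPattern = /<meta\b[^>]*>/gi;

  let metaMatch: RegExpExecArray | null;
  while ((metaMatch = metaPattern.exec(html)) !== null) {
    // Read the quoted attributes of this single <meta> tag.
    const attrs: Record<string, string> = {};
    const attrPattern = /([a-zA-Z:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
    let attrMatch: RegExpExecArray | null;
    while ((attrMatch = attrPattern.exec(metaMatch[0])) !== null) {
      attrs[attrMatch[1].toLowerCase()] = attrMatch[2] ?? attrMatch[3] ?? "";
    }

    // Open Graph tags use "property"; Twitter and description tags use "name".
    const key = attrs["property"] ?? attrs["name"];
    const content = attrs["content"];
    const seen = key !== undefined && Object.prototype.hasOwnProperty.call(tags, key);
    if (key !== undefined && content !== undefined && !seen) {
      tags[key] = content; // assumption: the first occurrence wins
    }
  }
  return tags;
}
```

With something like this, the preview could read `og:title`, `og:description`, and `og:image` from whatever the active plugins actually printed, instead of re-implementing their logic.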

jeherve commented 4 years ago

Related discussion: p1597830589407000-slack-CJS75TX3R

jeherve commented 3 years ago

Also reported in p8oabR-DS-p2#comment-5184