Auto-generate page preview images based on page metadata

choldgraf commented 1 year ago

Many websites automatically generate a PNG "image preview" for social media sharing that contains the page title and some other page metadata. This uses the opengraph protocol for images, and so this functionality could be a useful part of this extension.

Proposal

Add the ability to optionally auto-generate an image preview for the page based on page metadata. It would display:

- The site title
- The site logo
- The page title
- Either a site tagline, a hand-written page description, or the first several paragraphs of text.

In the future, it might also be able to display things like reading time or author ID, but this isn't built into Sphinx and is probably out of scope for a first PR.

Implementation

There's a prototype of this functionality already working at this repository:

https://github.com/choldgraf/sphinx-social-previews/blob/main/sphinx_social_previews/__init__.py

It uses matplotlib to generate a little image for each page, and then links the og:image meta tag to that preview image.
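For anyone who hasn't opened the prototype, here is a minimal sketch of the approach described above. It is not the prototype's actual code: the function names `render_card` and `og_image_tag`, the 1200x630 canvas, the layout coordinates, and the example paths are all assumptions made for illustration.

```python
from html import escape
from pathlib import Path


def render_card(site_title: str, page_title: str, description: str, out_path: Path) -> Path:
    """Draw a simple preview card with matplotlib and save it as a PNG."""
    # Imported lazily so this module loads even where matplotlib is absent.
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display required
    import matplotlib.pyplot as plt

    fig = plt.figure(figsize=(12, 6.3), dpi=100)  # 1200 x 630 pixels (assumed size)
    ax = fig.add_axes([0, 0, 1, 1])  # one axes covering the whole canvas
    ax.set_axis_off()
    # Coordinates are in axes units (0..1); the layout is purely illustrative.
    ax.text(0.05, 0.85, site_title, fontsize=24, color="#555555", va="top")
    ax.text(0.05, 0.65, page_title, fontsize=40, weight="bold", va="top", wrap=True)
    ax.text(0.05, 0.35, description, fontsize=20, va="top", wrap=True)

    out_path.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(out_path, facecolor="white")
    plt.close(fig)  # free the figure; a docs build may render hundreds of pages
    return out_path


def og_image_tag(site_url: str, image_relpath: str) -> str:
    """Build the og:image meta tag pointing at the generated preview image."""
    url = site_url.rstrip("/") + "/" + image_relpath.lstrip("/")
    return f'<meta property="og:image" content="{escape(url, quote=True)}" />'
```

In a real extension, `render_card` would presumably run once per page during the build, and the tag from `og_image_tag` would be added to that page's HTML head.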

Design ideas

I don't think it'll be productive if we fall down a long design hole, but I wanted to propose this structure to see if anybody had any major concerns or preferences for a different style. It is heavily inspired by the GitHub image previews (see below):

In this case, most of the elements (including colors) could be configurable, and able to be turned off.

There's also a figma link here but I think I might be the only one that can edit?

Here are a few design ideas to use for inspiration:


**GitHub** ![](https://user-images.githubusercontent.com/1839645/141599399-df5adffa-e3ea-4d3e-a147-2096bf446b2b.png)

**MkDocs Material** ![](https://user-images.githubusercontent.com/1839645/189475311-95983743-a2ed-4a2b-a429-4931eb4daa82.png)

**A personal blog from Hugo** ![](https://user-images.githubusercontent.com/1839645/189475336-a67792ed-60ab-4202-9a8b-91576fce6e2f.png)

TheTripleV commented 1 year ago

Thanks for making the issue.

On the Proposal

All of the required data in the proposal are already parsed and available in this repo. I think read time and author ID should be saved for later.

On read time, Google says 200wpm is a good estimate. But, I don't buy that for documentation. On author ID, I think it's hard to offer attribution for multi-author documents.
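To make the 200 wpm figure concrete, a reading-time estimate would be roughly the following. This is only a sketch: the function name and the round-up-to-at-least-one-minute behavior are assumptions, and as noted above the rate is probably too optimistic for documentation.

```python
import math


def estimate_read_minutes(text: str, words_per_minute: int = 200) -> int:
    """Estimate reading time in whole minutes, rounding up, with a 1-minute floor."""
    words = len(text.split())  # whitespace-separated tokens as a rough word count
    return max(1, math.ceil(words / words_per_minute))
```

The rate is a parameter, so a docs site could pass a lower value if 200 wpm turns out to be unrealistic for technical content.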

On the Implementation

I think using matplotlib is fine. Ideally, an HTML framework like Tailwind would be used. But including Node or Chromium seems excessive, so matplotlib will do.

On Integration

This feature should be behind a feature flag. I'd be open to making it on by default. The reason for the flag is that there is already (off-by-default) functionality to set og:image to the first image found on each page.

choldgraf commented 1 year ago

Quick thought on implementation: I had also played around with using playwright.dev for this, along with some minimal HTML and CSS. But in the end I decided to use matplotlib in order to simplify the installation environment and build process (running a headless Chrome browser as part of Sphinx was creating some really weird error messages because of the async stuff they do). Maybe it'd be best to start with matplotlib and, if a better alternative emerges, switch to that in the future.

rkdarst commented 1 year ago

A little bit unrelated, but: as a middle ground for previews, the value of the Sphinx html_logo config option, if present, could be used as a less complex default, perhaps until this is ready, or as a simpler option if someone doesn't want the full features.
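A rough sketch of that fallback, under a couple of assumptions: the helper name `logo_og_image` is made up, and it assumes a local `html_logo` file ends up under `_static/` in the built site (which is where Sphinx copies it, as far as I know), while a logo given as a full URL is used as-is.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse


def logo_og_image(html_logo: Optional[str], site_url: str) -> Optional[str]:
    """Return an absolute URL for the site logo to use as og:image, or None."""
    if not html_logo:
        return None  # no logo configured, so nothing to fall back to
    if urlparse(html_logo).scheme in ("http", "https"):
        return html_logo  # logo is already a remote URL
    # Assumption: a local logo file is served from _static/<file name>.
    return f"{site_url.rstrip('/')}/_static/{PurePosixPath(html_logo).name}"
```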

choldgraf commented 1 year ago

Do others think that this image preview looks OK?

To make it easier to iterate, I've tested out this functionality on my personal blog. That let me change the layout/design more quickly before making a PR here. I think it is nearly ready to be upstreamed, provided that others think that this design looks OK. Let me know what folks think:

It mostly re-uses the OGP content: site title, page title, description, site_url.

I'd like to avoid a big design discussion after the PR has been opened, and would prefer to agree on the layout, make / merge the PR, and then folks can suggest iterative changes after that. This is why I wanted to open discussion up here first :-)

Here's the code that generates this:

https://github.com/choldgraf/choldgraf.github.io/blob/main/social_previews/__init__.py

rkdarst commented 1 year ago

Looks good to me, but I'm not involved in this project. I can't think of anything else I immediately would want from this (my biggest priority is zero-configuration usage. In this case that would only be ogp_social_previews=True since this probably shouldn't be enabled by default, right?).

choldgraf commented 1 year ago

I've decided to move that code into the Sphinx extension repository, to allow others to install and try it out themselves. I've also opened this issue over there to track the process of upstreaming here:

https://github.com/choldgraf/sphinx-social-previews/issues/6

humitos commented 1 year ago

@choldgraf

> I've tested out this functionality on my personal blog. That let me change the layout/design more quickly before making a PR here

I saw this the other day in your twitter account and I was impressed. It looks great to me and I'd love to give it a try in my docs.

choldgraf commented 1 year ago

@humitos - right now it's only installable via:

pip install git+https://github.com/choldgraf/sphinx-social-previews

Would it be useful to push it up to PyPI, or is that enough for giving feedback? I don't wanna create the expectation that the extension will be around forever, which is why I haven't published it yet, but if that makes it easier to test out I think it's fine.

humitos commented 1 year ago

That's enough, thanks!

Daltz333 commented 1 year ago

Hi, this is something we absolutely are interested in. The most important part is that we aren't making backward-breaking changes for users, and if we are, we should be extremely loud about it.

The second issue of importance is that, on RTD platforms, 0 setup should be expected of users.

I haven't looked at the code, but the preview does look quite nice. @TheTripleV might have the time to do a thorough code review.

choldgraf commented 1 year ago

Happy to have a code review for implementation as well 👍

re: setup, the only configuration needed out of the box would be a feature flag to turn this on. I assumed that to begin we'd want it off by default, but maybe others disagree?

Other than that, the configuration would be just for customization (e.g. if people didn't want the description to show etc)
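For illustration, the zero-configuration case discussed above might look like this in conf.py. The `ogp_social_previews` name is the one floated earlier in this thread and is hypothetical, not a released option; `ogp_site_url` is the extension's existing setting, and the URL is a placeholder.

```python
# conf.py (sketch)
extensions = ["sphinxext.opengraph"]

# Existing sphinxext-opengraph option: base URL used to build absolute og: links.
ogp_site_url = "https://example.org/docs/"  # placeholder URL

# Hypothetical feature flag, named as suggested in this thread; not a released option.
ogp_social_previews = True
```

Everything beyond the flag (colors, whether the description shows, and so on) would be optional customization layered on top.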

mrocklin commented 1 year ago

Just a general +1 on this functionality. I'd love to have it on by default. That would definitely encourage me to install sphinxext-opengraph into pretty much every sphinx project I controlled.

Daltz333 commented 1 year ago

Always on would probably be fine. Would you mind making a PR so review can begin?

choldgraf commented 1 year ago

OK, I've got enough positive feedback that I'll give this a shot. Though it may be a few days, as I need to do this between my toddler and my day job 😅

TheTripleV commented 1 year ago

Google appears to be showing images next to results if they're available.
