[blog][proposal] Adjust styling/markup for blog post images to better delineate figure and caption from surrounding text

genehack commented 3 weeks ago

Currently, blog post images are rendered with this HTML markup:

<p>
  <a href="">
    <img src="IMG_SRC" alt="Fig 1" />
  </a>
  <strong>FIGURE TITLE</strong>
  FIGURE CAPTION
</p>

which renders like so:

To my eye, this results in the caption visually "bleeding" into the following paragraph of the post content.

While porting the blog pages to App Router, I experimented with adding a border to the image to see if that would resolve the "bleeding" issue. My initial attempt was just adding border: 1px solid #ccc; padding: 1rem; to the CSS for img tags in blog posts. That looked like this:

This helps offset the figure image, but because the caption text is outside the box, the "bleeding" problem is still present.

Thinking about this a little bit more, I realized that, by using some more semantic markup, it would be possible to have the caption inside the box, giving a more distinct delineation of the "figure" content from the surrounding "article" content. The markup would look like:

<figure>
  <a href="">
    <img src="IMG_SRC" alt="Fig. 1" />
  </a>
  <figcaption>
    <strong>FIGURE TITLE</strong>
    FIGURE CAPTION
  </figcaption>
</figure>

and then applying the above CSS rule (with border and padding) to the <figure> tag rather than <img>, which looks like this:

Overall, I think this looks very nice, and is reminiscent of figure rendering in a journal article, which feels appropriate for our blog.

Implementing this in an ergonomic way would require one of three things: extending the marked Markdown parser to recognize some sort of custom Markdown syntax, changing how it interprets the normal ![] image/link syntax so that <figure> markup is emitted, or switching to a different Markdown parser with more direct support for emitting <figure> instead of <img>, such as markdown-it/markdown-it-figure.
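
Whichever parser route is taken, the output side is the same: something has to turn an image, a title, and a caption into the <figure> markup above. A minimal sketch of that step, assuming a hypothetical renderFigure helper and FigureInput shape (neither name comes from marked, markdown-it, or the existing codebase):

```typescript
// Hypothetical input shape; the field names are illustrative only.
interface FigureInput {
  src: string;         // image URL
  alt: string;         // alt text, e.g. "Fig. 1"
  href?: string;       // optional link wrapped around the image
  title: string;       // plain-text figure title, rendered in <strong>
  captionHtml: string; // caption that has already been rendered to HTML
}

// Escape the characters that are unsafe in HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the <figure> markup proposed in this issue.
function renderFigure({ src, alt, href, title, captionHtml }: FigureInput): string {
  const img = `<img src="${escapeHtml(src)}" alt="${escapeHtml(alt)}" />`;
  // Only wrap the image in a link when an href was supplied.
  const linked = href ? `<a href="${escapeHtml(href)}">${img}</a>` : img;
  return [
    "<figure>",
    `  ${linked}`,
    "  <figcaption>",
    `    <strong>${escapeHtml(title)}</strong>`,
    `    ${captionHtml}`,
    "  </figcaption>",
    "</figure>",
  ].join("\n");
}
```

A real integration would presumably call something like this from the parser's image handling, and the result would still need to pass through sanitize-html with figure and figcaption allowed.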

I'm soliciting feedback on all of:

- the general notion of trying to set the figures and captions off from the surrounding text with a border
- preferences between the "img border" versus "figure border" approaches
- extending marked versus switching to a different Markdown parser

tsibley commented 3 weeks ago

One of the primary and original features of Markdown is that you can inline HTML at-will to do stuff it doesn't have simplified syntax for. Does that not Just Work here?

genehack commented 3 weeks ago

> One of the primary and original features of Markdown is that you can inline HTML at-will to do stuff it doesn't have simplified syntax for. Does that not Just Work here?

That is in fact how I generated the version I took the screenshot of! (After updating the list of tags that sanitize-html allows...)

The reason I think we would want to handle this within the Markdown parser is that the ergonomics of !["Fig. 1"](/images/whatever), especially for less technical authors, are much better than <figure><a href="blah blah"><img src="foo" etc etc etc. (You also get into oddities with Markdown not being respected inside inline HTML elements, so you can't use ** for strong inside the <figure> markup; you have to use <strong>, etc.)

jameshadfield commented 2 weeks ago

> the general notion of trying to set the figures and captions off from the surrounding text with a border

Yes - I'm in favour of this. It's how a lot of journals present figures in HTML. Horizontal lines above/below (as papers are often rendered in PDF form) may also be an option:

> preferences between the "img border" versus "figure border" approaches

(for me) it has to include the caption.

> extending marked versus switching to a different Markdown parser

Seems like a lot of effort for not a huge reward if we can just use HTML? Maybe in the long run it'd pay off, but for now I'd suggest the opportunity cost could be too high. On the authorship side, I'd imagine many new blog posts could copy-paste HTML from previous blog posts.

tsibley commented 2 weeks ago

Hmm. I'm not convinced by the ergonomics argument. I think the modicum of HTML required for <figure> is eminently a) teachable and b) copy-and-paste-able. I agree with @jameshadfield's point.

While Markdown's inline syntax isn't parsed within HTML blocks, you can surround blocks of Markdown with blocks of HTML and it'll Just Work (intentionally, by design), e.g.

*hello*

<figure>
  <img src="image.png">
  <figcaption>

*hello*
**figure**

  </figcaption>
</figure>

**world**

genehack commented 2 weeks ago

> Hmm. I'm not convinced by the ergonomics argument.

You're not a grad student rotating through the lab who's been asked to write a blog post and never touched HTML before. Here's a concrete example from a recent blog post to illustrate the ergonomic issue I'm talking about:

Current:

[![fig1](img/oropouche_host_view.png)](https://nextstrain.org/oropouche/L?c=host)
**Figure 1.  Time-resolved phylogeny for the L segment colored by the host the 
sample was acquired from.** 98% of sequences in our sample are human, so we have 
very limited information into the host reservoir dynamics. Available as 
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).

Using `<figure>`

<figure>
  <a href="https://nextstrain.org/oropouche/L?c=host" target="_blank">
    <img src="img/oropouche_host_view.png" alt="fig1" />
  </a>
  <figcaption>

**Figure 1.  Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have 
very limited information into the host reservoir dynamics. Available as 
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).

  </figcaption>
</figure>

...and those two blank lines are load-bearing. I dunno, to me it seems like this is gonna be pretty frustrating; maybe I'm underestimating the general level of experience.

With a template-based system that allows the definition of shortcodes or macros, it would look something like:

{% figure "img/oropouche_host_view.png" "fig1" "https://nextstrain.org/oropouche/L?c=host" %}
**Figure 1.  Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have 
very limited information into the host reservoir dynamics. Available as 
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).
{% endfigure %}
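
For illustration only, a shortcode like this could be expanded by a small pre-processing pass that runs before the Markdown parser. The sketch below assumes exactly the hypothetical {% figure "SRC" "ALT" "HREF" %} ... {% endfigure %} form shown above; expandFigureShortcodes is an invented name, not an existing function in any templating library. It emits the HTML-block form of the figure, inserting the blank lines around the caption itself so authors don't have to remember them.

```typescript
// Matches the hypothetical shortcode:
//   {% figure "SRC" "ALT" "HREF" %}
//   ...caption Markdown...
//   {% endfigure %}
const FIGURE_SHORTCODE =
  /\{% figure "([^"]*)" "([^"]*)" "([^"]*)" %\}\n([\s\S]*?)\n\{% endfigure %\}/g;

// Replace each shortcode with <figure> HTML wrapped around the caption Markdown.
// Assumption: src/alt/href are trusted and inserted as-is; real code should
// escape them and sanitize the final HTML.
function expandFigureShortcodes(markdown: string): string {
  return markdown.replace(
    FIGURE_SHORTCODE,
    (_match: string, src: string, alt: string, href: string, caption: string) =>
      [
        `<figure><a href="${href}"><img src="${src}" alt="${alt}"></a><figcaption>`,
        "", // blank line ends the HTML block so the caption is parsed as Markdown
        caption.trim(),
        "", // blank line ends the caption paragraph before the closing tags
        "</figcaption></figure>",
      ].join("\n"),
  );
}
```

The caption is passed through untouched, so ** and link syntax inside it are left for the Markdown parser to handle.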

tsibley commented 2 weeks ago

> You're not a grad student rotating through the lab who's been asked to write a blog post and never touched HTML before.

Right, and I'm not thinking about the ergonomics of it given my experience. That's why I said I think it's teachable and copy-and-pasteable. We wouldn't abandon such a grad student to figure it out on their own, and they're already likely going to be copying-and-pasting the current Markdown version of it anyway, so doing it for the HTML version doesn't seem like much of a stretch.

tsibley commented 2 weeks ago

I'd also say the more apples-to-apples comparison is this <figure> version:

<figure><a href="https://nextstrain.org/oropouche/L?c=host"><img src="img/oropouche_host_view.png" alt="fig1"></a><figcaption>

**Figure 1.  Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have 
very limited information into the host reservoir dynamics. Available as 
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).

</figcaption></figure>

Is there more syntax? Yes, but not much, and it's spelling out things that are otherwise opaque (equivalent to named vs. unnamed function args).

Are the two blank lines load-bearing? Yes, and that's unfortunate, but one of the big benefits of using plain Markdown without templating is that it still renders ~correctly in many renderers, e.g. inside someone's editor. That will help make it clear something's wrong when those lines are omitted.
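
If relying on a preview to surface the problem isn't enough, the blank-line requirement could also be checked mechanically, e.g. in CI. A rough sketch, assuming captions are written in the form shown above (an opening <figcaption> at the end of a line, a closing </figcaption> at the start of a line); findMissingBlankLines is a hypothetical name, and the check is line-based rather than a real HTML or Markdown parse:

```typescript
// Returns 1-based line numbers of <figcaption> / </figcaption> tags that are
// missing the blank line separating them from the caption Markdown.
function findMissingBlankLines(markdown: string): number[] {
  const lines = markdown.split("\n");
  const problems: number[] = [];
  lines.forEach((line, i) => {
    const trimmed = line.trim();
    // An opening <figcaption> that ends a line must be followed by a blank line.
    if (trimmed.endsWith("<figcaption>")) {
      const next = lines[i + 1];
      if (next !== undefined && next.trim() !== "") problems.push(i + 1);
    }
    // A closing </figcaption> that starts a line must be preceded by a blank line.
    if (trimmed.startsWith("</figcaption>")) {
      const prev = lines[i - 1];
      if (prev !== undefined && prev.trim() !== "") problems.push(i + 1);
    }
  });
  return problems;
}
```

Captions written entirely in HTML on one line (<figcaption>text</figcaption>) are not flagged, since they don't depend on the blank lines.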

victorlin commented 2 weeks ago

I think it makes sense to handle on the parser side if:

We want this styling to be applied to all figures across blog posts
We want to standardize things that would be in the HTML such as <a href="…" target="_blank">
It's easy to implement this in the markdown parser (I haven't looked, maybe it's difficult with lots of edge cases to consider?)
The syntax is significantly simpler than the existing HTML alternative
- The !["Fig. 1"](/images/whatever) syntax seems better, but it's unclear where the caption is defined. Is it "Fig. 1"? Isn't that the alt text?
- The {% figure %} syntax seems simpler, but maybe not significantly as @tsibley mentions.

nextstrain / nextstrain.org