Open genehack opened 3 weeks ago
One of the primary and original features of Markdown is that you can inline HTML at-will to do stuff it doesn't have simplified syntax for. Does that not Just Work here?
> One of the primary and original features of Markdown is that you can inline HTML at-will to do stuff it doesn't have simplified syntax for. Does that not Just Work here?

That is in fact how I generated the version I took the screenshot of! (After updating the list of tags that sanitize-html allows...)
The reason why I think we would want to update to handling this within the Markdown parser is that the ergonomics of `!["Fig. 1"](/images/whatever)`, especially for less technical authors, are much better than `<figure><a href="blah blah"><img src="foo" etc etc etc`. (You also get into oddities with Markdown not being respected inside inline HTML elements, so you can't use `**` for strong inside the `<figure>` markup, you have to use `<strong>`, etc.)
> the general notion of trying to set the figures and captions off from the surrounding text with a border

Yes, I'm in favour of this. It's how a lot of journals present figures in HTML. Horizontal lines above/below (as papers are often rendered in PDF form) may also be an option:

> preferences between the "img border" versus "figure border" approaches

(for me) it has to include the caption.

> extending marked versus switching to a different Markdown parser

Seems like a lot of effort for not a huge reward if we can just use HTML? Maybe in the long run it'd pay off, but for now I'd suggest the opportunity cost could be too high. On the authorship side, I'd imagine many new blog posts could copy-paste HTML from previous blog posts.
Hmm. I'm not convinced by the ergonomics argument. I think the modicum of HTML required for `<figure>` is eminently a) teachable and b) copy-and-paste-able. I agree with @jameshadfield's point.
While Markdown's inline syntax isn't parsed within HTML blocks, you can surround blocks of Markdown with blocks of HTML and it'll Just Work (intentionally so by design), e.g.
```html
*hello*

<figure>
<img src="image.png">
<figcaption>

*hello*
**figure**

</figcaption>
</figure>

**world**
```
> Hmm. I'm not convinced by the ergonomics argument.

You're not a grad student rotating through the lab who's been asked to write a blog post and never touched HTML before. Here's a concrete example from a recent blog post to illustrate the ergonomic issue I'm talking about:
```markdown
[![fig1](img/oropouche_host_view.png)](https://nextstrain.org/oropouche/L?c=host)
**Figure 1. Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have
very limited information into the host reservoir dynamics. Available as
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).
```
```html
<figure>
<a href="https://nextstrain.org/oropouche/L?c=host" target="_blank">
<img src="img/oropouche_host_view.png" alt="fig1" />
</a>
<figcaption>

**Figure 1. Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have
very limited information into the host reservoir dynamics. Available as
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).

</figcaption>
</figure>
```
...and those two blank lines are load-bearing. I dunno, to me it seems like this is gonna be pretty frustrating; maybe I'm underestimating the general level of experience.
Using a template-based system that allows the definition of shortcodes or macros, it would look something like:
```
{% figure "img/oropouche_host_view.png" "fig1" "https://nextstrain.org/oropouche/L?c=host" %}
**Figure 1. Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have
very limited information into the host reservoir dynamics. Available as
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).
{% endfigure %}
```
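For a sense of how much code a shortcode like this implies, here is a minimal sketch in TypeScript. It assumes the three quoted arguments are always the image path, alt text, and link URL, in that order, and that the expansion runs over the post source before the Markdown parser sees it. `expandFigureShortcodes` and `escapeAttr` are hypothetical names for illustration, not part of any existing templating library.

```typescript
// Hypothetical pattern for: {% figure "src" "alt" "href" %} caption {% endfigure %}
const FIGURE_SHORTCODE =
  /\{% figure "([^"]*)" "([^"]*)" "([^"]*)" %\}\n([\s\S]*?)\n\{% endfigure %\}/g;

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Replace every figure shortcode with <figure> markup, leaving other text as-is.
function expandFigureShortcodes(source: string): string {
  return source.replace(
    FIGURE_SHORTCODE,
    (_match: string, src: string, alt: string, href: string, caption: string): string =>
      [
        "<figure>",
        `<a href="${escapeAttr(href)}">`,
        `<img src="${escapeAttr(src)}" alt="${escapeAttr(alt)}" />`,
        "</a>",
        "<figcaption>",
        "", // blank line so the caption is parsed as Markdown, not raw HTML
        caption.trim(),
        "", // blank line before the closing HTML block
        "</figcaption>",
        "</figure>",
      ].join("\n"),
  );
}
```

The point of the sketch is that the generated markup always includes the blank lines around the caption, so authors would not have to remember them.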
> You're not a grad student rotating through the lab who's been asked to write a blog post and never touched HTML before.

Right, and I'm not thinking about the ergonomics of it given my experience. That's why I said I think it's teachable and copy-and-pasteable. We wouldn't abandon such a grad student to figure it out on their own, and they're already likely going to be copying-and-pasting the current Markdown version of it anyway, so doing it for the HTML version doesn't seem like much of a stretch.
I'd also say the more apples-to-apples comparison is this `<figure>` version:
```html
<figure><a href="https://nextstrain.org/oropouche/L?c=host"><img src="img/oropouche_host_view.png" alt="fig1"></a><figcaption>

**Figure 1. Time-resolved phylogeny for the L segment colored by the host the
sample was acquired from.** 98% of sequences in our sample are human, so we have
very limited information into the host reservoir dynamics. Available as
[nextstrain.org/oropouche/L?c=host](https://nextstrain.org/oropouche/L?c=host).

</figcaption></figure>
```
Is there more syntax? Yes, but not much, and it's spelling out things that are otherwise opaque (equivalent to named vs. unnamed function args).
Are the two blank lines load-bearing? Yes, and that's unfortunate, but one of the big benefits of using plain Markdown without templating is that it still renders ~correctly in many renderers, e.g. inside someone's editor. That will help make it clear something's wrong when those lines are omitted.
I think it makes sense to handle on the parser side if:
`<a href="…" target="_blank">`
`!["Fig. 1"](/images/whatever)` syntax seems better, but unclear where the caption is defined. Is it "Fig. "? Isn't that the alt text?
`{% figure %}` syntax seems simpler, but maybe not significantly as @tsibley mentions.
Currently, blog post images are rendered with this HTML markup:
which renders like so:
To my eye, this results in the caption visually "bleeding" into the following paragraph of the post content.
In porting the blog pages to App Router, I experimented with adding a border to the image, to see if that would resolve the "bleeding" issue. My initial attempt was just adding `border: 1px solid #ccc; padding: 1rem;` to the CSS for `img` tags in blog posts. That looked like this:
This helps offset the figure image, but because the caption text is outside the box, the "bleeding" problem is still present.
Thinking about this a little bit more, I realized that, by using some more semantic markup, it would be possible to have the caption inside the box, giving a more distinct delineation of the "figure" content from the surrounding "article" content. The markup would look like:
and then applying the above CSS rule (with `border` and `padding`) to the `<figure>` tag rather than `<img>`, which looks like this:
Overall, I think this looks very nice, and is reminiscent of figure rendering in a journal article, which feels appropriate for our blog.
Actually implementing this in an ergonomic way would require either extending the `marked` Markdown parser (to recognize some sort of custom Markdown syntax, or to interpret the normal `![]` image/link syntax so that `<figure>` markup is emitted), or switching to a different Markdown parser with more direct support for emitting `<figure>` instead of `<img>`, such as markdown-it/markdown-it-figure.
I'm soliciting feedback on all of:
- the general notion of trying to set the figures and captions off from the surrounding text with a border
- preferences between the "img border" versus "figure border" approaches
- extending `marked` versus switching to a different Markdown parser
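
To make the "interpret the normal `![]` image syntax" option a bit more concrete, here is a rough TypeScript sketch of a preprocessing pass that could run before any Markdown parser. It is not the `marked` or markdown-it-figure API. It rests on one loud assumption that this thread has not settled: the caption is taken from the image's title slot (`![alt](src "caption")`), and only images that sit alone on a line are rewritten. `imagesToFigures` and `escapeAttribute` are hypothetical names.

```typescript
// Matches an image alone on a line with a title: ![alt](src "caption")
// Assumption: the title slot carries the caption text.
const TITLED_IMAGE = /^!\[([^\]]*)\]\((\S+)\s+"([^"]*)"\)$/gm;

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Rewrite titled, standalone images into <figure> markup.
// Images without a title are left untouched for the parser to handle.
function imagesToFigures(source: string): string {
  return source.replace(
    TITLED_IMAGE,
    (_match: string, alt: string, src: string, caption: string): string =>
      [
        "<figure>",
        `<img src="${escapeAttribute(src)}" alt="${escapeAttribute(alt)}" />`,
        "<figcaption>",
        "", // blank line so the caption is still parsed as Markdown
        caption,
        "", // blank line before the closing HTML block
        "</figcaption>",
        "</figure>",
      ].join("\n"),
  );
}
```

Whether the title slot is the right home for the caption is exactly the open question above; the sketch only shows that the transformation itself is small once that is decided.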