gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0
73.49k stars 7.37k forks source link

Accommodate recent change to Goldmark's strikethrough extension #12597

Closed jmooring closed 1 week ago

jmooring commented 2 weeks ago

https://github.com/yuin/goldmark/pull/455

Prior to this change in v1.7.2, Goldmark's strikethrough extension was triggered by wrapping text within a pair of double-tildes:

~~deleted~~

This behavior was compatible with Hugo's extras extension, so you could use strikethrough and subscripts at the same time:

~~deleted~~
H~2~O

Goldmark's strikethrough extension is now triggered by either single- or double-tildes:

~~deleted~~
~deleted~

That means you can no longer use use strikethrough and subscripts at the same time.

Possible solutions:

  1. Look at prioritization, though initial review by @bowman2001 was not promising
  2. Implement a double-tilde strikethrough feature in the extras extension, and use this feature instead of Goldmark's strikethrough extension when subscripts are enabled (or some similar mechanism)

Regardless of the subscript conflict, the Goldmark change is a breaking change.

bowman2001 commented 2 weeks ago

About the suggested prioritization: There already was a unit test case to check the compatibility and coexistence of subscript and strike-through for an expression with mixed usage. And I just added another one with a simple strike-through. For now, they are both fine.

When I raise the priority of subscript above strike-through (locally), they both fail (as expected).

jmooring commented 2 weeks ago

For now, they are both fine.

Please clarify what this means.

bowman2001 commented 2 weeks ago

With the current prioritization, both tests pass.

They will fail if the priority of subscript is raised above strike-through. (Tested locally)

bowman2001 commented 2 weeks ago

As far as I can see, the only option is our own strike-through implementation. I think, this extension should be able to process both kinds of usages of the ~ like Goldmark handles both types of emphasis. This would be the most efficient way.

And because you already implemented a detailed configuration structure, it would also be possible to accommodate different needs with an additional config parameter. This one could decide, if the new strike-through works as in the GFM spec or treats single ~ as subscripts.

The default configuration could be compatible with GFM and something like subscript: true could enable the current behavior when both extensions are enabled.

jmooring commented 2 weeks ago

Putting aside the configuration settings for a moment, this is how I think it should work:

  1. With a fresh Hugo install, both single- and double-tilde strikethrough should be enabled. This is consistent with the current default and provides GFM compatibility.
  2. As soon as you enable subscripts, anything wrapped in single-tildes is a subscript

Assuming we have to add strikethrough to the extras extension, the default config should be:

[markup.goldmark.extensions.extras.strikethrough]
enable = true

Which leaves us with this to figure out:

[markup.goldmark.extensions]
strikethrough = true/false

I suspect that very few users have disabled strikethrough... I can't remember seeing it. So the easiest approach would be to completely remove the Goldmark strikethrough extension/config, and note the breaking change in the release notes, applicable only if they currently have strikethrough disabled.

bowman2001 commented 2 weeks ago

I could implement a new strike-through extension which includes the subscript without the configuration switch for a start. I am currently occupied and won't be able to finish this before the middle of the next month. But this is important to me and I would be happy to contribute.

jmooring commented 1 week ago

Just in case there are other near-term upstream bug fixes, I'd rather not wait, so perhaps we can do this in phases.

Phase 1

Remove the Goldmark "strikethrough" extension and create an extras "delete" (different name to avoid confusion) extension that only handles double-tildes. This would allow us to bump to Goldmark v1.7.2 without losing current capabilities.

There's some added complexity here because we also use the Goldmark "strikethrough" extension when rendering TOC entries, but TOC entries don't currently honor any of the extras features anyway. So maybe this is a separate issue.

We need to make sure we handle this test case too:

https://github.com/yuin/goldmark/blob/master/extension/_test/strikethrough.txt#L25-L30

Phase 2

Make the "delete" extension handle single-tildes if "subscript" is not enabled. This would give us full GFM compatibility, but in my view there's no rush for this. The primary objective should be to maintain current behavior when we bump to Goldmark v1.7.2.

bowman2001 commented 1 week ago

I already have an implementation of the new delete extension which mimics the behavior of strike-through in the last version 1.7.1 of Goldmark.

But the new test case in 1.7.2 does complicate things a little, because there is a corresponding change in the original strike-through parser to exclude the rendering of more than 2 markup characters like ~~~. Because we use the same parser for all extra inline tags, this change would also affect edge cases like +++, ===, ++++, and ====.

I think, that we can adopt this change, because I can't imagine that anyone wants to use +++insertion++ to produce <ins>+insertion</ins> or similar cases with ===. And nested inline tags like <del><del>doubly deleted</del></del> which are possible at the moment don't make any sense.

There is one reasonable edge case which can not be implemented with the current approach in phase 1: To use something like ~~CO~2~~~ to produce <del>CO<sub>2</sub></del> is reasonable but won't work.

What's your opinion on this?

jmooring commented 1 week ago

First, thank you very much for the rapid response.

Second, please submit a PR to the hugo-goldmark-extensions repository. Although I think I understand what you've described above, I'd like to take it for a spin to be sure.

Third, how we handle this on the Hugo side is probably the tricky part, and I've been giving this more thought. If we assume that only 5% of all sites enable the subscript feature, it would probably be best for the other 95% to continue to use the official Goldmark strikethrough extension instead of the extras extension delete feature. Options:

1) Handle this with documentation. For example:

If you enable the extras extension "subscript" feature you must disable the Goldmark strikethrough extension. To support both subscript and strikethrough, enable the extras extension "delete" feature.

2) Make the above happen automatically. While this sounds good, I'm afraid we'll end up with some spaghetti code. The additional complexity may not make sense given my 5% usage estimate above.

I'm inclined to handle this with documentation, at least for now. We can always revisit this in the future.

bep commented 1 week ago

I'm inclined to handle this with documentation, at least for now. We can always revisit this in the future.

Agree.