jgm / djot.js

JavaScript implementation of djot

Shared HTML id in multiple references to the same footnote #65

Open faelys opened 10 months ago

faelys commented 10 months ago

Hello,

when using several references to the same footnote, multiple a elements are generated with the same id attribute. Isn't that invalid HTML?

The simplest example I found is foo[^a] bar[^a], which currently generates in the playground:

<p>foo<a id="fnref1" href="#fn1" role="doc-noteref"><sup>1</sup></a> bar<a id="fnref1" href="#fn1" role="doc-noteref"><sup>1</sup></a></p>
<section role="doc-endnotes">
<hr>
<ol>
<li id="fn1">
<p><a href="#fnref1" role="doc-backlink">↩︎︎</a></p>
</li>
</ol>
</section>

I'm not sure what the appropriate solution would be. I'm not even completely sure whether it really is a problem, since my HTML knowledge is a bit outdated. Still, I would have expected one backlink per reference, as is usual on Wikipedia.
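
For what it's worth, the HTML standard does require `id` values to be unique within a document, so the duplicate is worth catching in a test. Below is a minimal sketch of such a check. `duplicateIds` is a hypothetical helper, not part of djot.js, and the regex scan assumes double-quoted attributes as in the output above; it is not a general HTML parser.

```typescript
// Hypothetical helper (not part of djot.js): returns the id values that
// appear more than once in an HTML string.
// Assumes attributes are written as id="..." with double quotes.
function duplicateIds(html: string): string[] {
  const counts: Record<string, number> = {};
  const re = /\sid="([^"]*)"/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    counts[m[1]] = (counts[m[1]] || 0) + 1;
  }
  return Object.keys(counts).filter((id) => counts[id] > 1);
}

// Abbreviated version of the playground output quoted above.
const output =
  '<p>foo<a id="fnref1" href="#fn1" role="doc-noteref"><sup>1</sup></a>' +
  ' bar<a id="fnref1" href="#fn1" role="doc-noteref"><sup>1</sup></a></p>' +
  '<ol><li id="fn1"></li></ol>';

console.log(JSON.stringify(duplicateIds(output))); // ["fnref1"]
```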

While there, I noticed that the backlink text is U+21A9, U+FE0E, U+FE0E. My Unicode knowledge is even worse than my HTML knowledge, but is the double variation selector-15 intended or useful?
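
Since the variation selectors are invisible, one way to see what a string really contains is to list its code points. This is a small sketch; `codePoints` is a hypothetical helper, and the sample string is spelled with escapes to match the sequence described above.

```typescript
// Hypothetical helper: lists the code points of a string as U+XXXX labels.
function codePoints(s: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < s.length; ) {
    const cp = s.codePointAt(i) as number;
    const hex = cp.toString(16).toUpperCase();
    // Pad to at least four hex digits, as is conventional for U+ notation.
    out.push("U+" + (hex.length < 4 ? "0000".slice(hex.length) + hex : hex));
    // Code points above U+FFFF occupy two UTF-16 code units.
    i += cp > 0xffff ? 2 : 1;
  }
  return out;
}

// The backlink text as described: arrow plus two variation selector-15.
const backlinkText = "\u21A9\uFE0E\uFE0E";
console.log(codePoints(backlinkText).join(" ")); // U+21A9 U+FE0E U+FE0E
```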

jgm commented 10 months ago

Yes, it's invalid I think. We could simply avoid assigning an id to subsequent references to the same note. Or we could assign them unique ids and include multiple back references. Both of these would involve some additional complexity.
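
The second option could look roughly like the sketch below. This is not djot.js code: the `FootnoteRefs` class, its method names, and the `fnref1_2` suffix scheme are all assumptions made for illustration. The first reference keeps the existing `fnrefN` id, later references get a numbered suffix, and the note receives one backlink per reference.

```typescript
// Hypothetical sketch, not the djot.js implementation: every reference to a
// note gets its own id, and the note gets one backlink per reference.
class FootnoteRefs {
  // note number -> number of references to that note rendered so far
  private counts: Record<number, number> = {};

  // The first reference keeps "fnrefN"; later ones get "_2", "_3", ...
  // (the suffix scheme is an assumption).
  private refId(note: number, occurrence: number): string {
    return occurrence === 1 ? `fnref${note}` : `fnref${note}_${occurrence}`;
  }

  renderRef(note: number): string {
    const occurrence = (this.counts[note] || 0) + 1;
    this.counts[note] = occurrence;
    const id = this.refId(note, occurrence);
    return `<a id="${id}" href="#fn${note}" role="doc-noteref"><sup>${note}</sup></a>`;
  }

  // Call after the whole document body has been rendered, so that every
  // reference to the note has been counted.
  renderBacklinks(note: number): string {
    const total = this.counts[note] || 0;
    const links: string[] = [];
    for (let k = 1; k <= total; k++) {
      links.push(`<a href="#${this.refId(note, k)}" role="doc-backlink">\u21A9\uFE0E</a>`);
    }
    return links.join(" ");
  }
}

const refs = new FootnoteRefs();
const body = `<p>foo${refs.renderRef(1)} bar${refs.renderRef(1)}</p>`;
const backlinks = refs.renderBacklinks(1);
console.log(body);
console.log(backlinks);
```

The additional complexity is visible here: the renderer has to keep a per-note counter, and backlinks can only be emitted once all references have been seen.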

The reason for the double variation selector is that it keeps iOS from substituting an ugly emoji for the unicode character. (At least, it used to do that.)

faelys commented 10 months ago

> We could simply avoid assigning an id to subsequent references to the same note. Or we could assign them unique ids and include multiple back references. Both of these would involve some additional complexity.

I can imagine doing any of those in my renderer with almost as much complexity as my current reference resolution and sorting by appearance, so I can't really weigh in on this. I'll just follow along to keep the same test suite.

> The reason for the double variation selector is that it keeps iOS from substituting an ugly emoji for the unicode character. (At least, it used to do that.)

Fair enough, I don't have easy access to that family of platforms to confirm. I've come across a standard-looking table saying that a single variation selector-15 is supposed to explicitly request text-like rendering, but it wouldn't surprise me if two of them were needed as a workaround for something, so I'll just follow along.

jgm commented 10 months ago

Oh, I'd missed that it's a doubled variation selector. Hm, don't know why, I think that's probably just a mistake!

jgm commented 10 months ago

Now I'm really confused. I changed the source to use a hex escape so we know exactly what we're including:

    const backlink : string = `<a href="#fnref${ident}" role="doc-backlink">\u21A9︎︎</a>`;

But the output still contains the doubled variation selector. What is adding this?

faelys commented 10 months ago

> But the output still contains the doubled variation selector. What is adding this?

Could it be a leftover which is not displayed by your editor? In the latest commit there is still the symbol, and both variation selectors are explicitly there:

% grep doc-backlink src/html.ts | xxd
00000000: 2020 2020 636f 6e73 7420 6261 636b 6c69      const backli
00000010: 6e6b 203a 2073 7472 696e 6720 3d20 603c  nk : string = `<
00000020: 6120 6872 6566 3d22 2366 6e72 6566 247b  a href="#fnref${
00000030: 6964 656e 747d 2220 726f 6c65 3d22 646f  ident}" role="do
00000040: 632d 6261 636b 6c69 6e6b 223e e286 a9ef  c-backlink">....
00000050: b88e efb8 8e3c 2f61 3e60 3b0a            .....</a>`;.

jgm commented 10 months ago

I never pushed the change, so you're seeing the old code.

jgm commented 10 months ago

Nonetheless I think you were right; there was a hidden character. I have removed it. Now I'll add back in one variation selector.
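
A sketch of what that could look like, with both characters written as escapes so that nothing invisible can hide in the source file. The names `BACKLINK_TEXT`, `backlink`, and `hasRepeatedVariationSelector` are hypothetical, and `ident` is assumed to be a string; only the template itself mirrors the line from `src/html.ts` quoted earlier in this thread.

```typescript
// U+21A9 followed by a single U+FE0E (variation selector-15, which requests
// text presentation). Both are escaped, so the source is pure ASCII here.
const BACKLINK_TEXT = "\u21A9\uFE0E";

// Hypothetical helper mirroring the backlink template quoted in this thread.
// The type of `ident` is an assumption.
function backlink(ident: string): string {
  return `<a href="#fnref${ident}" role="doc-backlink">${BACKLINK_TEXT}</a>`;
}

// Hypothetical regression check: true if two or more variation selectors
// (U+FE00 to U+FE0F) appear back to back.
function hasRepeatedVariationSelector(s: string): boolean {
  return /[\uFE00-\uFE0F]{2,}/.test(s);
}

console.log(hasRepeatedVariationSelector(backlink("1"))); // false
console.log(hasRepeatedVariationSelector("\u21A9\uFE0E\uFE0E")); // true
```

A check like `hasRepeatedVariationSelector` could run over the rendered output in the test suite to keep a hidden duplicate from slipping back in.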