jgm / djot

A light markup language
https://djot.net
MIT License

Heading references should be explicit about punctuation #68

Open clarfonthey opened 1 year ago

clarfonthey commented 1 year ago

Identifiers are added automatically to any headings that do not have explicit identifiers attached to them. The identifier is formed by taking the textual content of the heading, removing punctuation (other than _ and -), replacing spaces with -, and if necessary for uniqueness, adding a numerical suffix.

It's not clear how Unicode-aware this is: it could mean that only ASCII punctuation is stripped (leaving other Unicode characters verbatim), or it could mean that every character other than _, -, letters, and numbers is stripped. This should probably be clarified. Personally, I feel that the best approach would be to simply declare that implicit references will work, but that unless an explicit {#ID} attribute is added, the generated identifier is unspecified and could be anything.

Relying on {#ID} seems best to me: people can still use implicit references, and if they need a reference that stays stable outside the document, they can define one explicitly.
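
To make the ambiguity concrete, here is a minimal Python sketch of the two readings of "removing punctuation". It is not djot's implementation: the function names `slug_ascii` and `slug_unicode` are hypothetical, collapsing runs of whitespace into a single `-` is an assumption, and the uniqueness suffix is left out.

```python
import string
import unicodedata


def slug_ascii(heading: str) -> str:
    """Reading 1: strip only ASCII punctuation (other than _ and -)."""
    drop = set(string.punctuation) - {"_", "-"}
    kept = "".join(ch for ch in heading if ch not in drop)
    # Assumption: runs of whitespace collapse into a single "-".
    return "-".join(kept.split())


def slug_unicode(heading: str) -> str:
    """Reading 2: keep only _, -, and Unicode letters and numbers."""
    def keep(ch: str) -> bool:
        # General categories starting with "L" are letters, "N" are numbers.
        return (
            ch in "_-"
            or ch.isspace()
            or unicodedata.category(ch)[0] in ("L", "N")
        )

    kept = "".join(ch for ch in heading if keep(ch))
    return "-".join(kept.split())


# Heading with non-ASCII punctuation: inverted question mark and curly quotes.
heading = "\u00bfQu\u00e9 es \u201cdjot\u201d?"
print(slug_ascii(heading))    # non-ASCII punctuation survives
print(slug_unicode(heading))  # only letters and the separators remain
```

For a plain ASCII heading such as `Hello, world!` both readings give `Hello-world`. They diverge only once the heading contains non-ASCII punctuation, which is exactly the case the documentation leaves open.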

jgm commented 1 year ago

In the current implementation, only ASCII punctuation is stripped, mainly because Lua doesn't have good built-in Unicode support. But your idea of telling people that they shouldn't have any definite expectations about automatically generated ids, and should always use an explicit id when they want to enable linking to a section from outside a document, is probably a sensible one!