jgm / djot

A light markup language
https://djot.net
MIT License

[RFC] Underscore as link text delimiter #99

Closed by fredericmoulins 1 year ago

fredericmoulins commented 1 year ago

Hi,

What if we weren’t chained to the past?

If I had a wish for a light markup syntax, it would be that links can look like _link text_(url) and _link text_[ref].

I see this as very close in spacing to a reference in academic papers, and very close in interpretation to a rendered output in any hypertext format as it reads:

The link text is meant to be read: I find it easier to scan text delimited by underscores than text in square brackets, which seems enclosed in a secondary context.

Link text is supposed to allow for inline markup: the idea is that the underscore delimiters lose their emphasis meaning here, and the precedence rules already apply in the same way as for brackets in case of overlap or well-balanced markup inside the link text.
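To make the two proposed forms concrete, here is a minimal sketch in Python. It is only a toy regular expression, not djot's inline parser: it ignores nested markup and the precedence rules discussed above, and the names `PROPOSED_LINK` and `find_proposed_links` are made up for this illustration.

```python
import re

# Toy pattern for the two proposed forms. This is NOT how djot parses
# inlines; it only illustrates the surface syntax under discussion.
PROPOSED_LINK = re.compile(
    r"_(?P<text>[^_\n]+)_"        # link text between underscores
    r"(?:\((?P<url>[^)\s]*)\)"    # inline destination: (url)
    r"|\[(?P<ref>[^\]\n]*)\])"    # or reference label: [ref]
)


def find_proposed_links(source):
    """Return (text, kind, target) triples for each proposed-syntax link."""
    links = []
    for match in PROPOSED_LINK.finditer(source):
        if match.group("url") is not None:
            links.append((match.group("text"), "inline", match.group("url")))
        else:
            links.append((match.group("text"), "reference", match.group("ref")))
    return links


if __name__ == "__main__":
    sample = "See _the site_(https://djot.net) and _the spec_[spec]."
    for link in find_proposed_links(sample):
        print(link)
```

In this sketch, underscored text that is not immediately followed by `(…)` or `[…]` is left alone, so it would still be available to be read as emphasis.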

This set of changes is an attempt to see if it is possible to implement it on top of djot and its precedence rules for inline syntax, and with what limitations.

The first commit adds underscore "_" as a link text delimiter. At least, it shows it is possible to have several delimiters at the same time, but:

The second commit restricts the link text delimiter to the underscore only, and the third modifies the tests.

Apart from the syntax change, three test cases have been adapted:

Later commits try to fix the failing tests, notably the last two commits.

There is a bit more detail in each commit message.

Some notes.

Is there anything I am missing, any usage or blind spot, that would render this unusable or impractical?

What do you think?

matklad commented 1 year ago

No strong opinion here, but, if we do this, we might go for _link text_<url> rather than _link text_(url) to piggyback on the auto-link syntax.

jgm commented 1 year ago

I don't think I'd be in favor of this change. Better to keep link syntax more distinct from emphasis, and the [..] delimiters have nice directionality.

fredericmoulins commented 1 year ago

Thanks for your comments.

No strong opinion here, but, if we do this, we might go for _link text_<url> rather than _link text_(url) to piggyback on the auto-link syntax.

I think the key difference is that the auto-link syntax allows only (some) absolute URI links. There seem to be two parsing modes, one for absolute URI only and another for any URI destinations.
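A rough Python sketch of that distinction follows. It assumes "absolute" simply means "starts with a URI scheme" in the RFC 3986 sense; the exact rule djot applies to auto-links is not reproduced here, and `is_absolute_uri` and `accepted_in` are hypothetical names used only for illustration.

```python
import re

# RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".",
# terminated by a colon.
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def is_absolute_uri(destination):
    """True if the destination starts with a URI scheme such as 'https:'."""
    return SCHEME.match(destination) is not None


def accepted_in(destination, mode):
    """Assumed model of the two modes: 'autolink' or 'destination'."""
    if mode == "autolink":
        return is_absolute_uri(destination)  # <...> form: absolute only
    if mode == "destination":
        return True                          # (...) form: any destination
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for dest in ["https://djot.net", "../guide.html", "#section"]:
        print(dest, accepted_in(dest, "autolink"), accepted_in(dest, "destination"))
```

Under this assumption, relative destinations such as `../guide.html` or `#section` would have no place in an auto-link style `<url>` form, which is why the two forms are not interchangeable.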

I don't think I'd be in favor of this change. Better to keep link syntax more distinct from emphasis, and the [..] delimiters have nice directionality.

I agree that in a complex case, like emphasis or an image in the link text, it is harder to read because the delimiters are less distinct. Two thoughts pushed me to try:

To be distinct from emphasis, another possible choice would be quotation marks. Double quotes work as is here, just by changing the link text delimiter restriction in Tokenizer.between_matched, for example.

As for keeping a single character that is both distinct and directional: yes, ASCII is pretty limited, so I guess that's one of the reasons why square brackets were chosen originally.

fredericmoulins commented 1 year ago

I won't leave this hanging indefinitely. Thank you for the feedback!