Open chrisjsewell opened 1 year ago
More generally, from an extension viewpoint, it might be nice if there was a standardised prefix for all extension types.
For example, let's say the prefix was `=` for now. Then this input:

````
[a]{=name} `a`{=name}

:::=name
a
:::

```=name
a
```
````

would go to an AST like:

```
doc
  para
    ext_inline role="name"
      str text="a"
    str text=" "
    ext_inline_verbatim role="name" text="a"
  ext_block role="name"
    para
      str text="a"
  ext_block_verbatim role="name" text="a\n"
```
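To make the inline half of that mapping concrete, here is a minimal sketch in Python. It is not how any djot parser works: the class names, the regex, and `parse_inline_extensions` are all hypothetical, the `=` prefix is the placeholder from above, and the span content is kept as flat text instead of child `str` nodes.

```python
import re
from dataclasses import dataclass


@dataclass
class ExtInline:
    """Hypothetical node for [content]{=role}; content kept as flat text."""
    role: str
    text: str


@dataclass
class ExtInlineVerbatim:
    """Hypothetical node for `content`{=role}."""
    role: str
    text: str


# Assumed surface syntax: a bracketed span or a backtick span,
# immediately followed by {=role}.
_INLINE = re.compile(
    r"\[(?P<span>[^\]]*)\]\{=(?P<r1>[\w-]+)\}"
    r"|`(?P<verb>[^`]*)`\{=(?P<r2>[\w-]+)\}"
)


def parse_inline_extensions(source):
    """Return the extension nodes found in one line of input, in order."""
    nodes = []
    for m in _INLINE.finditer(source):
        if m.group("span") is not None:
            nodes.append(ExtInline(role=m.group("r1"), text=m.group("span")))
        else:
            nodes.append(ExtInlineVerbatim(role=m.group("r2"), text=m.group("verb")))
    return nodes
```

Calling `parse_inline_extensions` on the first input line yields one `ExtInline` and one `ExtInlineVerbatim`, both with role `name`, which is the shape an AST preprocessor could dispatch on.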
In view of `{=format}` already being taken by raw content as mentioned in #77[^1], `{:lang}` already being suggested in #5[^2], and `:emoji:` already being taken, what about
:::>role
Content
:::
[content]{>role}
which however might be problematic due to `<...>` being associated with {HT,X}ML (but most good punctuation chars are already taken![^3])
[^1]: Ultimately due to a Pandoc filter yours truly wrote which converted `` `code`{raw=format} `` into raw elements 😥
[^2]: IIRC also originally my own fault due to misunderstanding a short description of the CSS `*:lang(subtag)` pseudo-class! 😥😥
[^3]: IOW I'm going to regret this too! 😥😥😥
Are classes not sufficient?
@kaleb I think the idea is that classes and roles are semantically different and/or that you may not want role markup included as classes in rendered HTML. I can sympathize, since I sometimes use Pandoc filters to remove certain attributes in HTML output, but I'm not sure that it needs to be handled any other way.
Reading https://borretti.me/article/brief-defense-of-xml made me think that we do need some explicit support for this.
I think being “extensible” is one of the greatest features of djot, and, if we can write in the manual “this is how you define custom elements: …”, that’ll help the users to grok the capability.
You can do this today with `class` or your own attribute, if you already know that it’s possible. Adding an explicit facility would help to unlock people’s imagination.
There’s also the problem that with `class` the role name goes after the inline element, whereas it wants to come first.
So perhaps we should steal a page from adoc here?
Custom inline element:
role:[inline content here]
Custom block element:
role:::
Block content here
:::
We can restrict this to spans and divs (as that’s where assigning the role makes most sense), but we can also allow it on arbitrary fenced elements: ``kbd:`ctrl+F` ``
The thing is that it would be annoying to have to type
`not-role\: [not role text]`. In most Markdown flavors you need to type
`[not link text] \(not-url)`, which is annoying because (a) it is common
enough to happen with some frequency if square brackets have a meaning
in your field and (b) it is uncommon enough that you forget to escape
the parenthesis when it happens, and you might not always have syntax
highlighting to alert you to it.
I do not really see why the role indicator needs to come before the
text it applies to. Some other LML does it? Well, djot isn't that other
LML. I am far from convinced that it is a good idea for attributes to
come after the text they apply to[^1], but it's already traditional, and
djot relies on block attributes coming before and inline attributes
coming after what they apply to in order to distinguish them. It's good
enough, so I really don't see why something like `[text]{>role}`
wouldn't be good enough. It is at least consistent with how djot does
things already (and I think the `>` "arrow" is kind of iconic! 😁).
As for blocks it is already the case that in
```lang
code
```
`lang` is not quite a regular class, so I do not see why
:::role
text
:::
wouldn't be good enough, and it's consistent, although
:::>role
text
:::
substituting `>` with whatever is used for inline roles would be
consistent with inline roles.
[^1]: I find postposed attributes annoying enough that I have a snippet
such that I can type something like `,a,ATTRS,TEXT,<CTRL><TAB>` and
have it automatically transformed into `[TEXT]{ATTRS}`. The main
disadvantage of typing `[]{}` is that those are shift 2 and shift 3
on my keyboard, but with the snippet I usually don't need to use any
shift at all. I can substitute `,` with anything matching the
character class `[^\w\s]` not occurring in `ATTRS`, and if `ATTRS`
matches the regex `^[-_A-Za-z0-9]+$` a `.` is automatically
inserted before it. I have another snippet turning
`,wl,Some Text[,url][,pfx-][,-sfx],<CTRL><TAB>` (where `[...]`
indicates optional parts) into `[Some Text](url#pfx-some-text-sfx)`
and a similar `,tgs,Some Text[,pfx-][,-sfx],` →
`[Some Text]{#pfx-some-text-sfx}`. Makes my day! 😁
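For anyone who wants to try the first snippet from the footnote, the expansion rule can be sketched roughly like this in Python. This is only a guess at the behaviour described, not the actual editor snippet: `expand_attr_snippet` is a made-up name, the trigger key chord is left out, and the choice to let `TEXT` contain the separator is an assumption.

```python
import re

# From the footnote: ATTRS made only of these characters gets a leading '.'
BARE_CLASS = re.compile(r"^[-_A-Za-z0-9]+$")


def expand_attr_snippet(snippet):
    """Expand '<sep>a<sep>ATTRS<sep>TEXT<sep>' into '[TEXT]{ATTRS}'.

    <sep> is the first character and must match [^\\w\\s]; it must not
    occur in ATTRS. Assumption: TEXT may itself contain <sep>.
    """
    sep = snippet[:1]
    if not re.fullmatch(r"[^\w\s]", sep):
        raise ValueError("snippet must start with a punctuation separator")
    parts = snippet.split(sep)
    # Expected shape after splitting: ['', 'a', ATTRS, TEXT..., '']
    if len(parts) < 5 or parts[0] != "" or parts[1] != "a" or parts[-1] != "":
        raise ValueError("not an attribute snippet")
    attrs = parts[2]
    text = sep.join(parts[3:-1])
    if BARE_CLASS.match(attrs):
        attrs = "." + attrs  # bare word is treated as a class
    return "[" + text + "]{" + attrs + "}"
```

With `,` as the separator a bare word becomes a class, while switching to `;` lets you pass attributes such as `>kbd` through unchanged.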
> The thing is that it would be annoying to have to type `not-role\: [not role text]`
I envision requiring that the `:` has no spaces around it, i.e. that it directly precedes the `[`, and `alpha:[` seems like it would be a pretty rare combination naturally.
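To illustrate how narrow that rule is, here is a rough Python sketch of a matcher for the inline form. The regex, the set of characters allowed in a role name, and `find_role_spans` are all assumptions for illustration, not anything djot implements.

```python
import re

# Assumed rule: a role name (a letter, then letters, digits, '_' or '-')
# immediately followed by ':[' with no spaces, then content up to ']'.
# The lookbehind stops a match from starting in the middle of a word.
ROLE_SPAN = re.compile(r"(?<![\w-])([A-Za-z][\w-]*):\[([^\]]*)\]")


def find_role_spans(text):
    """Return (role, content) pairs for every role:[content] in text."""
    return [(m.group(1), m.group(2)) for m in ROLE_SPAN.finditer(text)]
```

Under this rule `kbd:[Ctrl+F]` is picked up as a role span, while `not-role: [not role text]` is left alone without any escaping because of the space after the colon.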
As to why it makes sense for the role to be a prefix: the difference between a role and other attributes is that it affects the semantic interpretation of what follows, so it helps to see it first. E.g., ``kbd:`Ctrl+F` `` reads better to me than `` `Ctrl+F`{>kbd} ``, because `kbd` introduces a DSL (`+` is a meta char separating keys), so, if `kbd` comes first, by the time you get to `+` you already know that it'll be interpreted as a separator. That's the same deal as with ``$`e=mc^2` ``: for the parser it is more or less the same as `` `e=mc^2`$ ``, as it doesn't give meaning to the stuff inside backticks. For the human, "here comes the math" introducing syntax makes more sense.
This would make me very happy and secure my wavering Djot fandom. I agree that prefix is better for this and like @matklad's proposed syntax, but if it's considered too clash-prone then perhaps there could be a prefix punctuation character on top? (I'll throw `@` into the bikeshed discussion if so.)
Similar to roles in docutils (https://docutils.sourceforge.io/docs/ref/rst/roles.html): ``:name:`content` ``, it would be nice to have an attribute shorthand, to provide "semantic meaning to content" (see also https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/Roles).

This would provide a clear hook for AST preprocessors to use, as discussed in https://github.com/jgm/djot/discussions/77#discussioncomment-4269320

Currently, one could obviously just use `` `content`{role=name} `` and `[content]{role=name}`, but this is quite verbose if you are going to be using it often.

Similar to `id=name` being shortened to `#name`, it would be nice to use a prefix.

The one that comes to mind is `=name`, i.e. `` `content`{=name} `` and `[content]{=name}`, although this is currently used for raw-inline 😬
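As a rough illustration of what the shorthand would mean for a preprocessor, the following Python sketch expands an attribute body into key/value pairs. It is not djot's attribute parser: `expand_shorthand` is a hypothetical helper, `=name` is treated as `role=name` purely for the sake of the example, and quoted values containing spaces are not handled.

```python
def expand_shorthand(attr_body):
    """Expand the text between '{' and '}' into a dict of attributes.

    '#name' and '.name' are the existing shorthands for id and class;
    '=name' as shorthand for role is the hypothetical prefix discussed here.
    """
    attrs = {}
    for token in attr_body.split():
        if token.startswith("#"):
            attrs["id"] = token[1:]
        elif token.startswith("."):
            # Multiple classes accumulate, separated by spaces.
            previous = attrs.get("class")
            attrs["class"] = token[1:] if previous is None else previous + " " + token[1:]
        elif token.startswith("="):
            attrs["role"] = token[1:]
        elif "=" in token:
            key, _, value = token.partition("=")
            attrs[key] = value.strip('"')
    return attrs
```

The point is that `{=name}` and `{role=name}` end up as the same attribute, so a filter only needs to look at `role`.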