Open matklad opened 10 months ago
Djot went beyond Markdown while keeping its legacy:
```
{.myblock}
:::
This is _Djot_{.underline}
- Apples.
- Oranges.
> Quote
:::
```
This proposal opens Pandora's box:
```
{.myblock}
:::: div
::: p
Are we saying this would be the new em:[u:[Djot{.dubious}]]?
:::
::: ul
li:[Apples.]
li:[Oranges.]
:::
::: blockquote
Quote
:::
::::
```
And I wonder what a LaTeX renderer (say, or SILE) would then do. Would it have to support `div`, `p`, `ul` etc. environments and commands, or magically recognize a subset of the HTML (and why not DocBook, or any random schema) tag set to map them to appropriate commands?
I am afraid the problems supposedly solved might be worse. Or did I miss something?
> And I wonder what a LaTeX renderer (say, or SILE) would then do.

There isn't any special handling of tags. For example, the SILE renderer would do exactly what the SILE XML flavor would do, namely, interpret the document as
```
\begin[class=myblock]{div}
\begin{p}
Are we saying this would be the new \em{\u{\span[class=dubious]{Djot}}}?
\end{p}
\end{div}
```
This might or might not produce a valid SILE document, depending on which custom SILE commands the user has defined.
Stated positively, the user gains access to all their pre-existing custom SILE commands without having to define custom Djot renderers or filters. So, if the user has
```
\define[command=red]{\color[color=red]{\process}}
```
defined, they can use
```
Making things red is a red:[silly] way to emphasise text.
```
in their djot.
Well I am afraid I have to disagree on everything then...
(I don't think this is the place to discuss the SILE examples, but the SIL language should be completely avoidable, and the user shouldn't need custom commands to do this kind of thing. Styles are a better paradigm with a nicer separation of concerns.)
Still, an additional comment though:
> The user gains access to all their pre-existing custom SILE commands without having to define custom Djot renderers or filters.

Should the user really want to do this, with markdown.sile they indeed don't need to define custom Djot renderers or filters. The following works:
``` =sile
% Or defined in Lua with =sile-lua, or implemented elsewhere in a class, package, wrapper document, your call.
\define[command=red]{\color[color=red]{\process}}
```
Making things red is a [silly]{custom-style="red"} way to emphasize text.
And all other things equal, it does work identically whether the input is a Markdown file, a Djot file, or a Pandoc JSON AST[^1]. (Not that I really recommend this, but it's already available[^2], though again I would recommend using styles rather than direct commands)
[^1]: For the rare cases (now) where some syntax extension not covered by the native implementation would be needed. Tables in some other format than the "pipe tables" in (extended-)Markdown, for instance.
[^2]: EDIT: And, before someone asks, I favored the custom-style key trick over a class attribute there indeed: influenced, for what it's worth, by what Pandoc does with Word docx conversion -- so that defining a "red" character style in a Word reference document and converting to docx with Pandoc should indeed then also work as intended.
This is an interesting and well thought-out proposal. It does go in a somewhat different direction than I'd originally had in mind, but I see its good points.
The original conception was that if you wanted to do something like `<details>`, you'd simply write
```
::: details
## This is the summary.
And here's the rest.
:::
```
and then make use of a filter that replaces this with AST nodes including the raw HTML `<details>`, `<summary>`, etc. It's true that the filter needs to be format-specific -- though in pandoc at least, filters can conditionalize on the output format (I forget whether we built that into djot.js).
This proposal would allow you to do
```
:::: details
::: summary
This is the summary
:::
And here's the rest
::::
```
which is a bit more verbose and relies more on English keywords, but it would work out of the box without filters.
The proposed change would be breaking for existing djot documents that used
```
::: classname
...
:::
```
but maybe that is okay as the language is still in an experimental phase.
The proposed change would make the djot AST less compatible with the pandoc AST (which doesn't have a notion of "tag name"), and this would make pandoc interoperability less smooth.
In general I don't like to rely on English language keywords. Perhaps one could work around that, though, by introducing the concept of a "tag dictionary" that allows you to define your own aliases for tag names?
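Such a tag dictionary could be as small as a lookup table consulted before rendering. The sketch below is purely illustrative: `TAG_ALIASES`, `resolve_tag`, and the alias names are hypothetical and are not part of djot or djot.js.

```python
# Hypothetical "tag dictionary": user-defined aliases mapped to canonical tag names.
# None of these names exist in djot; they only illustrate the idea.
TAG_ALIASES = {
    "détails": "details",
    "résumé": "summary",
    "цитата": "blockquote",
}


def resolve_tag(name: str, aliases: dict[str, str] = TAG_ALIASES) -> str:
    """Return the canonical tag name for an alias, or the name itself if no alias is defined."""
    return aliases.get(name, name)
```

With this, a document could write `::: détails` and a renderer that calls `resolve_tag` first would treat it as `details`, while unknown names pass through unchanged.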
If we did implement the prefix `:defn[]` style notation, it would be good to impose some restrictions on the characters allowed in tag names and also a length restriction, to keep parsing fast.
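To make the kind of restriction meant here concrete, this is a minimal sketch of a bounded matcher for a `:tag[` opener. The specific limits (ASCII letters, digits, `-` and `_`, at most 32 characters) and the names `MAX_TAG_LEN`, `TAG_SPAN_OPEN` and `match_tag_span_open` are assumptions made for illustration, not anything djot specifies; in particular, the ASCII-only character set is just one possible choice.

```python
import re

# Assumed restrictions (not from any djot spec): a tag name starts with an ASCII
# letter, continues with letters, digits, "-" or "_", and is at most 32 characters.
MAX_TAG_LEN = 32
TAG_SPAN_OPEN = re.compile(r":([A-Za-z][A-Za-z0-9_-]{0,%d})\[" % (MAX_TAG_LEN - 1))


def match_tag_span_open(text: str, pos: int = 0):
    """Return (tag_name, index just after '[') if a `:tag[` opener starts at pos, else None."""
    m = TAG_SPAN_OPEN.match(text, pos)
    if m is None:
        return None
    return m.group(1), m.end()
```

Because the name length is bounded, a failed match at any position inspects only a fixed number of characters before giving up, which is the property that keeps inline parsing cheap.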
You are right that allowing a special name for spans restores symmetry with what we now have for divs. However, there's also a question of symmetry with verbatim containers (code spans and code blocks). For example, in LaTeX you might want
``` tikz
arrow(whatever) -> node(thing)
```
to produce a `tikz` environment instead of `verbatim`. But doing this automatically conflicts with the role we've given to this position for specifying the "language." There's also a question whether code spans should have something similar? `` :kbd`*3b*` ``
As for syntax, I fear that the tag name in `:tag[...]` looks a bit too much like symbol syntax (just missing the final `:`). Of course, we could remove that problem by just *using a symbol* for this purpose: `:tag:[...]`, but this might not be ideal. Another option could be `!tag[...]`, which is reminiscent of the image syntax.
@Omikhleia
> This is fragmenting the portability of input files

Yeah, that's the big thing here! One can view Djot as either:
This proposal pushes us more towards the second interpretation (but note that they are not mutually exclusive --- some people may use djot as 1, and some might use it as 2).
As you've rightly noticed, everything expressible with this proposal is already possible with custom attributes and classes; the "custom tags" thing basically just formalizes this pattern.
And that nicely segues into @jgm's first point! Even under this proposal I would expect people to write
```
::: details
## This is the summary.
And here's the rest.
:::
```
and handle this as a filter by default. The "raw html" mode I think is needed solely as an escape hatch.
However, under the new proposal it's syntactically apparent that `::: details` is some custom element. In the status quo of using "magical" classes, it's less clear whether that's indeed a custom element or just a pure-style `.class`.
That's probably what I like aesthetically most here --- that we clearly separate the "semantics" attribute from the style ones (including adding the invariant that there's at most one custom tag, but many classes).
> relies more on English keywords

I was under the impression that we already don't restrict class names and such to be English, but apparently that's not the case. It feels a bit strange that the following is parsed differently:
x{.foo} x{.бар}
I would say if we are fine with class names being English, we should be fine with tag names being English also (but it might be a good idea to include some quoted syntax then just in case, e.g. `::: "бар"` to be analogous to `{class="бар"}`).
> but maybe that is okay as the language is still in an experimental phase.

FWIW, this is something that worries me quite a bit. The page https://djot.net doesn't say that Djot is in an experimental phase, and makes it look like it's quite finished. Ideally, we'd be clearer in communicating our stability promise.
> As for syntax, I fear that the tag name in `:tag[...]` looks a bit too much symbol syntax (just missing the final `:`)

Yeah, I think syntactically the salient bits are that:
As for particular syntax, `!tag[` definitely works!
I don't think there was any intention to exclude non-English class names! If we do, it seems like a bug. The attribute grammar in `attributes.ts` does say that keywords need to be ASCII, but not classes or identifiers.
See also #197 and #192 where I proposed another use for `::: tag`, namely to provide "hints" for the parser.
I'm thus all for storing these "tags" specially in the AST. What worries me is that this proposal seems very HTML-centric for such a "central" syntax feature. I think it is important that djot is output-format agnostic, not favoring any one output format. While I do not yet use djot for real (the lack of a syntax for metadata, and for other data in the spirit of #192, that is interoperable with Pandoc is the main show stopper for me), I really like most of the syntax features where djot differs from or adds to Markdown, but my typical target format is PDF via LaTeX. If this means "tags" are stored separately in the AST and can be used for anything by parsers, filters and renderers, I'm all for it. If this means that "tags" become unusable unless you target HTML/XML, or even that djot gets tied to those formats, I'm actually worried!
As a data point, someone laments the inability to create HTML/djot sandwiches without writing custom filters:

> I find this very useful as well. Most notably, the `<details><summary>` combo.
This proposal is a synthesis of #239 and #146 and is organized into TL;DR, What? and Why? sections, where the Why? is the most important.
TL;DR
Change djot such that the following input:
produces the following HTML:
What?
Specifically:
- Change the parsing rule for `::: spam` to use `"spam"` for `tag_name`, rather than a class.
- Change the parsing rules for bare `:::` and `[]` to set `tag_name` to `"div"` and `"span"`, respectively.
- Add new concrete syntax `:tag-name[]`, that is, `:(\S+)\[` where `$1`, an arbitrary sequence of non-whitespace symbols, is a `tag_name`, and the rest is the usual span syntax. This concrete syntax produces a `Span` AST with the corresponding `tag_name` set.
- Change the default HTML renderer to use `tag_name` when rendering `span` and `div` elements.

The most invasive change here is `4` (the new `:tag-name[]` syntax), as it adds a bit of new syntax to djot and directly enlarges the surface area.
Why?
This single solution fixes several "problems" in the current version of djot, some big and some small. I list them roughly in order of priority:
Problem: users need a lightweight approach for producing custom HTML interspersed with normal djot.
Today, djot provides a `` ``` =HTML `` syntax to embed raw HTML (or any other format). The problem here is that it's all-or-nothing: everything inside `=HTML` needs to be HTML. You can't use that to wrap a part of a djot document into a custom tag:
This is Djot!
Ok, this is Djot :weary:
::: details
This is still Djot :smile:
:::
\begin{environment}
\end{environment}
::: spam
:::

{.spam}
:::
:::

{.spam}
::: eggs
:::