jupyter-book / myst-spec

MyST is designed to create publication-quality, computational documents written entirely in Markdown.
https://mystmd.org/spec
MIT License

Remove YAML dependency from directive option parsing #51

Open chrisjsewell opened 1 year ago

chrisjsewell commented 1 year ago

Currently there are two ways of specifying options within a directive:

(option 1) enclosing them in `---`:

```{name}
---
option1: value
option2: value
---
```

(option 2) prefixing each option line with `:`:

```{name}
:option1: value
:option2: value
```

Firstly, *there should be one clear way of doing things*, and so it would be ideal to remove one of these.

Secondly, the options are converted to the "final" input options for the directive as follows:

1. identify the full block of text
2. parse it with YAML (and abort if the result is not a dictionary)
3. convert all the values back to strings
4. convert the strings to specific value types (and validate them) using "converters" specified by the directive implementation

Clearly the YAML value parsing here is unnecessary and, worse, can lead to discrepancies: for example, `a:` becomes `{"a": null}` rather than `{"a": ""}`.
YAML is also quite complex (see e.g. [here](https://utcc.utoronto.ca/~cks/space/blog/tech/YamlComplexityProblem)) and not really necessary for the simpler requirements of option parsing.

If we accept that it is the directive implementation's responsibility to do any conversions from strings,
then we simply need a syntax/format that maps string keys to string values.
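
To illustrate what "the directive does the conversion" could look like, here is a minimal Python sketch. Everything in it is hypothetical: `convert_options`, the `flag` converter, and the option names `width`, `name`, and `numbered` are made up for illustration and are not taken from any existing implementation.

```python
from typing import Any, Callable, Dict

# A converter takes the raw option string and returns a validated value.
Converter = Callable[[str], Any]


def flag(value: str) -> bool:
    """Hypothetical converter: an option given without a value acts as a flag."""
    if value.strip():
        raise ValueError(f"flag option takes no value, got {value!r}")
    return True


def convert_options(raw: Dict[str, str], spec: Dict[str, Converter]) -> Dict[str, Any]:
    """Convert string options using the converters declared by the directive."""
    converted: Dict[str, Any] = {}
    for key, value in raw.items():
        if key not in spec:
            raise KeyError(f"unknown option: {key!r}")
        # Each converter is responsible for both conversion and validation.
        converted[key] = spec[key](value)
    return converted


# Illustrative option spec for an imaginary directive (an assumption).
spec = {"width": int, "name": str, "numbered": flag}
options = convert_options({"width": "80", "name": "fig-1", "numbered": ""}, spec)
```

With this split, the option syntax only ever has to produce `Dict[str, str]`, and an empty value stays an empty string until a converter decides what it means.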

There are two ways to do this that come to mind:

1. Something like field lists (see the [rST spec](https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists), and [mdit-py implementation](https://mdit-py-plugins.readthedocs.io/en/latest/#field-lists)), i.e. very similar to the current (option 2):
   :name: x
   :class: y
   :other: z

   (Note that in the field list spec, keys are parsed as Markdown; that is not what we want here, though.)

2. Block attributes before the directive (see [here](https://htmlpreview.github.io/?https://github.com/jgm/djot/blob/master/doc/syntax.html#block-attributes))

{#x .y other=z}
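
As a rough sketch of how either candidate could yield the same string-to-string mapping without YAML, the Python below parses both forms. The function names are hypothetical, mapping `#x` to `name` and `.y` to `class` is an assumption based on the two examples above, and quoted attribute values containing spaces are deliberately not handled.

```python
import re
from typing import Dict, List

# ":key: value" or ":key:" on one line; the key is kept as plain text,
# i.e. it is not parsed as Markdown.
FIELD_RE = re.compile(r"^:(?P<key>[^:\s][^:]*):(?:\s+(?P<value>.*))?$")


def parse_field_options(text: str) -> Dict[str, str]:
    """Parse field-list-like option lines into a str -> str mapping."""
    options: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match is None:
            raise ValueError(f"not an option line: {line!r}")
        # A missing value becomes "", never None.
        options[match["key"]] = match["value"] or ""
    return options


def parse_block_attributes(text: str) -> Dict[str, str]:
    """Parse a simplified "{#id .class key=value}" block into str -> str."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError(f"not an attribute block: {text!r}")
    options: Dict[str, str] = {}
    classes: List[str] = []
    for token in text[1:-1].split():
        if token.startswith("#"):
            options["name"] = token[1:]  # assumed mapping: #x -> name
        elif token.startswith("."):
            classes.append(token[1:])  # assumed mapping: .y -> class
        elif "=" in token:
            key, _, value = token.partition("=")
            options[key] = value
        else:
            raise ValueError(f"unrecognised attribute: {token!r}")
    if classes:
        options["class"] = " ".join(classes)
    return options


fields = parse_field_options(":name: x\n:class: y\n:other: z")
attrs = parse_block_attributes("{#x .y other=z}")
```

Under these assumptions both calls produce `{"name": "x", "class": "y", "other": "z"}`, and `parse_field_options(":a:")` gives `{"a": ""}` rather than a null value.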



---

Note that the only place where we definitely need a direct mapping of `options <-> JSON` is in `code-cell`, where the options actually map to the metadata of a Jupyter Notebook code cell.
In this case though, `code-cell` can be viewed as a "pseudo-directive" and perhaps should have a different syntax, so as not to be confusing.

mgielda commented 1 year ago

Hope my chiming in here is OK. I definitely think option 1 is the most clunky and we tend to stay away from it. Very often you only want to specify one option, and then you need to write 3 extra lines to do it. Also, at some point it's just ticks, hyphens and colons everywhere, which is confusing. I like the attribute syntax for "native markdown code blocks" which you just implemented, but I'm not sure about using it for all the other attributes... but maybe?