Support for pandoc-crossref

UFOMelkor commented 1 year ago

Description

I guess full support for pandoc-crossref is far beyond what could be done in a timely manner. For some features, CodeMirror's Markdown parser might be missing things.

Proposed Changes

I am uncertain whether all of them can be done, but these are the changes I would have in mind:

[ ] Show caption-attribute of Code Blocks (would require #4383)
[ ] Show Code Blocks labels (would require #4383)
[ ] Show image labels
[ ] Show equation labels
[ ] Autocompletion for references
[ ] Rendering of references

More advanced stuff would be something like

[ ] Table labels
[ ] Table-style captions for code blocks

Caveats

There might be some confusion if people export something from Zettlr without having pandoc-crossref enabled. Could pandoc-crossref be shipped with Zettlr?

Another caution is that this would be support for a custom filter. There are many custom filters, and supporting all of them would make Zettlr too complex for users and developers. The question would be whether pandoc-crossref is worth supporting.

Do you Wish to Attempt Implementing this Yourself?

Yes

Zettlr Version

Beta (if applicable)

Your Platform

[ ] Windows
[ ] macOS
[X] Linux

Operating System Version

Ubuntu 22.04

Additional Information

No response

nathanlesage commented 1 year ago

Regarding the general syntax highlighting stuff, have a look here where I already included the basics: https://github.com/Zettlr/Zettlr/blob/develop/source/common/modules/markdown-editor/parser/pandoc-attributes-parser.ts

Right now, it searches for curly braces at the appropriate positions and marks them as "inline code". It should be fairly straightforward to implement some better colors by improving the detection and adding more fine-grained elements. In fact, the "code marks" (a.k.a. the curly brackets) already receive proper colors when used in headings, which is because of the appropriate CSS rule that applies there (it's actually sort of a visual bug that I didn't yet have time to fix).

Improving this thing should be ~~straightforward~~ (this is anything but straightforward, apologies already for the wall of text):

1. Make the Pandoc attributes parser more sophisticated, enabling it to detect all elements that Pandoc supports (e.g., #ids, .class-names, attribute=value, etc.; the Pandoc manual should have us covered here)
2. Ensure that those attributes are properly wrapped in syntax tree elements for CodeMirror's LR-tree (see how the current parser does it: wrap the curly brackets each in a CodeMark element, and the contents in an InlineCode one, and then finally add all of them to their wrapper element)
3. This would likely involve adding more elements, which is a tad tedious:
   1. Add the new element in the parser to have it use it (simply think of a proper name; this name needs to be declared in all steps below as well)
   2. Declare those elements as new "customTags" here: https://github.com/Zettlr/Zettlr/blob/develop/source/common/modules/markdown-editor/util/custom-tags.ts
   3. Those new custom tags can inherit from already existing ones, which makes them apply the same styling. For example, the YAML frontmatter start and end inherit like this: YAMLFrontmatterStart: Tag.define(tags.contentSeparator) (because those are basically also content separators)
   4. Then, declare those custom tags in the Markdown parser as well: https://github.com/Zettlr/Zettlr/blob/develop/source/common/modules/markdown-editor/parser/markdown-parser.ts
   5. Finally, assign a CSS class to them here: https://github.com/Zettlr/Zettlr/blob/develop/source/common/modules/markdown-editor/theme/syntax.ts
   6. Style the elements accordingly in the themes (search for the other class names to get an idea as to what I've been doing): https://github.com/Zettlr/Zettlr/tree/develop/source/common/less
   7. Cry, because, what the hell did I develop here (/s)
4. Potentially improve the AST parser to take those attributes into account, but I guess once you're here you're probably exhausted already …

Nota bene: Once these elements are added to the syntax tree, other plugins can access them, for example the image renderer, or others. This would involve rewriting that logic a little bit because Pandoc attributes are not child elements, but actually sibling elements, so it would need a little bit more logic, but tbh I think it's fine if those elements are not (yet) considered by the preview renderers …

Then:

> There might be some confusion if people export something from Zettlr without having pandoc-crossref enabled. Could pandoc-crossref be shipped with Zettlr?
>
> Another caution is that this would be support for a custom filter. There are many custom filters, and supporting all of them would make Zettlr too complex for users and developers. The question would be whether pandoc-crossref is worth supporting.

There is currently already a sort of "hack" that people can use: Zettlr will automagically load and run all Lua filters that it finds in its filters directory. However, that is quite obviously less transparent, which is why I'm not including it in the docs yet. But I guess it should be possible to imagine a workflow of loading filters more transparently. This is an issue that @sensologica should weigh in on as the UI/UX lead.

So, to conclude: I see two PRs in this issue: One that doesn't have to be discussed as it only involves improving the Pandoc-attributes detector to properly show colors and detect only those attributes that Pandoc will detect as well. The second one would have to be discussed as this would involve a wholly new workflow for users to load and use Pandoc filters.

There are basically two ways I can imagine this going down:

1. Any filter that is in the filters directory could be enabled in the settings (globally? Per export profile?). Newly added filters could be disabled by default; removed filters need to be removed from the config as well.
2. Any filter in that directory can also be edited directly from within the assets manager, allowing users to add and remove filters from within the app. This way, adding new filters, not just pandoc-crossref, becomes a matter of copying and pasting from the web (with the appropriate safety precautions, obviously)
One issue we would need to address is that users may need to customize the order in which these filters run, and that they can in fact also declare them in the defaults file. Changing that file (as Zettlr currently does) has the huge problem that it would defeat profiles that are installed, e.g., by a company for its users, and that are not expected to be changed.
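
A minimal sketch of the bookkeeping behind the first option, assuming the settings store an ordered list of filters with an on/off flag. The `FilterSetting` shape and the `reconcileFilters` name are hypothetical and not part of Zettlr's config. The saved order is kept as the run order, filters that vanished from the directory are dropped, and newly found filters are appended disabled:

```typescript
// Hypothetical settings entry; Zettlr's real config may look different.
interface FilterSetting {
  name: string
  enabled: boolean
}

// Reconciles the saved settings with the files currently in the filters directory.
// - Filters that no longer exist on disk are removed from the result.
// - The saved order is preserved, so users can control the run order.
// - Filters that are new on disk are appended and disabled by default.
function reconcileFilters (saved: FilterSetting[], onDisk: string[]): FilterSetting[] {
  const kept = saved.filter(filter => onDisk.indexOf(filter.name) !== -1)
  const added = onDisk
    .filter(name => !kept.some(filter => filter.name === name))
    .map(name => ({ name, enabled: false }))
  return kept.concat(added)
}

// Names of the filters that should actually run, in order.
function enabledFilterNames (settings: FilterSetting[]): string[] {
  return settings.filter(filter => filter.enabled).map(filter => filter.name)
}
```

This leaves open where the list lives (globally or per export profile) and how it interacts with filters declared in a defaults file.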

EDIT: I also saw the autocomplete. This would probably require the syntax extensions (see my first response block), and then one could have a custom autocomplete that regularly scans the syntax tree for Pandoc attribute divs, checks whether there's a reference ID somewhere in there, and offers those alongside the heading IDs in the autolinking. So probably, simply adding another source (at least I believe CodeMirror has support for "sources" for autocompletes) to the heading autocomplete should do the trick. This would then possibly be a third PR.
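
A rough sketch of what such an additional source could compute. It is deliberately dependency-free and scans the raw text with regular expressions instead of walking the syntax tree, which is an assumption made to keep the example self-contained. The function names are hypothetical, the prefix list (`fig`, `eq`, `tbl`, `lst`, `sec`) follows pandoc-crossref's label conventions, and the `{ from, options }` return shape only loosely mirrors what a CodeMirror completion source returns:

```typescript
interface ReferenceOption {
  label: string
}

interface ReferenceCompletionResult {
  from: number // offset where the replaced text starts (right after the "@")
  options: ReferenceOption[]
}

// Collects all crossref-style IDs (e.g. "fig:cat") that appear inside {...} attribute blocks.
function collectReferenceIds (doc: string): string[] {
  const ids: string[] = []
  const blockRe = /\{[^{}\n]*\}/g
  let block: RegExpExecArray | null
  while ((block = blockRe.exec(doc)) !== null) {
    const idRe = /#((?:fig|eq|tbl|lst|sec):[^\s}]+)/g
    let id: RegExpExecArray | null
    while ((id = idRe.exec(block[0])) !== null) {
      if (ids.indexOf(id[1]) === -1) {
        ids.push(id[1])
      }
    }
  }
  return ids
}

// Offers all known IDs that start with whatever follows the "@" directly before the cursor.
// Returns null if the cursor is not inside an "@reference".
function completeReference (doc: string, cursor: number): ReferenceCompletionResult | null {
  const match = /@([A-Za-z0-9_:.-]*)$/.exec(doc.slice(0, cursor))
  if (match === null) {
    return null
  }
  const typed = match[1]
  const options = collectReferenceIds(doc)
    .filter(id => id.indexOf(typed) === 0)
    .map(id => ({ label: id }))
  return { from: cursor - typed.length, options }
}
```

A real implementation would read the IDs from the attribute nodes in the syntax tree once those exist, which also avoids false positives from curly braces in code or math.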

UFOMelkor commented 1 year ago

So, let me recap the next steps in my words:

We need a better attribute parser that would parse

```{#lst:code .haskell .numberLines execution_count=1}
qsort [] = []
```

to a tree like (without looking into the currently available elements, just a bit guessed)

FencedCode
- CodeMark (```)
- CodeInfo
  - CodeMark ({)
  - LabelAttribute (#lst:code)
  - ClassAttribute (.haskell)
  - ClassAttribute (.numberLines)
  - ValueAttribute (execution_count=1)
  - CodeMark (})
- CodeText (qsort [] = [])
- CodeMark (```)

(And same for other occurrences of attributes.)
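
To make the idea concrete, here is a small standalone sketch of the tokenizing step, independent of CodeMirror. The node names are the guessed ones from the tree above, `parsePandocAttributes` is a hypothetical helper, and the accepted syntax (`#id`, `.class`, `key=value` with a bare or double-quoted value) is a simplification of what Pandoc accepts. Each node carries `from`/`to` offsets, which is roughly the information a syntax tree element would need:

```typescript
// Guessed node names from the tree above, not Zettlr's actual ones.
type AttributeNodeName = 'CodeMark' | 'LabelAttribute' | 'ClassAttribute' | 'ValueAttribute'

interface AttributeNode {
  name: AttributeNodeName
  from: number // inclusive offset into the input string
  to: number // exclusive offset into the input string
  text: string
}

// Parses one attribute string such as "{#id .class key=value}".
// Returns null if the string is not wrapped in curly braces or contains
// anything other than whitespace-separated #id, .class, or key=value tokens.
function parsePandocAttributes (input: string): AttributeNode[] | null {
  const end = input.length - 1
  if (input.length < 2 || input[0] !== '{' || input[end] !== '}') {
    return null
  }

  const nodes: AttributeNode[] = [{ name: 'CodeMark', from: 0, to: 1, text: '{' }]
  const tokenRe = /(#[^\s}]+)|(\.[^\s}]+)|([A-Za-z_][\w:.-]*=(?:"[^"]*"|[^\s}]+))/g
  tokenRe.lastIndex = 1 // start searching right after the opening brace
  let pos = 1
  let match: RegExpExecArray | null

  while ((match = tokenRe.exec(input)) !== null) {
    // Anything between two tokens must be whitespace, otherwise the input is invalid
    if (input.slice(pos, match.index).trim() !== '') {
      return null
    }
    let name: AttributeNodeName
    if (match[1] !== undefined) {
      name = 'LabelAttribute'
    } else if (match[2] !== undefined) {
      name = 'ClassAttribute'
    } else {
      name = 'ValueAttribute'
    }
    pos = match.index + match[0].length
    nodes.push({ name, from: match.index, to: pos, text: match[0] })
  }

  // Only whitespace may remain before the closing brace
  if (input.slice(pos, end).trim() !== '') {
    return null
  }
  nodes.push({ name: 'CodeMark', from: end, to: end + 1, text: '}' })
  return nodes
}
```

Called with the attribute string from the example, `{#lst:code .haskell .numberLines execution_count=1}`, this yields the six nodes listed under CodeInfo in the same order.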

nathanlesage commented 1 year ago

This would be one way, yes. You could even go more fine-grained and define separate tags for the property name, the equals sign, and the property value, but this is precisely the idea. It might help to drop some code whose coloring you'd like to mimic into a fenced code block; that way you can see precisely how fine-grained some parsers are. There are already good test files in the test setup.

Then all of these elements could be individually styled and with a little bit of care for optimisation, this should work very well.
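
As an illustration of the more fine-grained variant, a `key=value` attribute could be split into three ranges that each get their own tag. This is only a sketch; the part names and the `splitValueAttribute` helper are made up for the example:

```typescript
// Hypothetical names for the three parts of a key=value attribute.
type AttributePartName = 'AttributeName' | 'AttributeOperator' | 'AttributeValue'

interface AttributePart {
  name: AttributePartName
  from: number // inclusive document offset
  to: number // exclusive document offset
}

// Splits a "key=value" token that starts at document offset `offset` into
// name, equals sign, and value. Returns null if either side of the "=" is empty
// or if there is no "=" at all.
function splitValueAttribute (text: string, offset: number): AttributePart[] | null {
  const eq = text.indexOf('=')
  if (eq <= 0 || eq === text.length - 1) {
    return null
  }
  return [
    { name: 'AttributeName', from: offset, to: offset + eq },
    { name: 'AttributeOperator', from: offset + eq, to: offset + eq + 1 },
    { name: 'AttributeValue', from: offset + eq + 1, to: offset + text.length }
  ]
}
```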
