krzyzanowskim / STTextView-Plugin-Neon

Source Code Syntax Highlighting
MIT License

How to achieve both 'block' and 'inline' Markdown highlighting? #10

Open dvclmn opened 2 months ago

dvclmn commented 2 months ago

Hi Marcin, thanks for all the incredible TextKit-related work you've done — and shared. Your projects have been an invaluable resource.

Issue

When language: .markdown is set as the NeonPlugin language, only block-level grammar (fenced code blocks, quotes, lists, headings, etc.) is styled.

As I understand it, TreeSitter can/does use a two-pass parse (ha) for markdown, for block and inline grammar? Is this correct? If so, will I need to implement my own code to achieve this second parsing operation — or indeed, is this even possible? Based on my very tentative TextKit intuition, this sounds expensive. But perhaps (hopefully!) I'm just missing a trick.
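To make the two-pass idea concrete, here is a toy sketch in plain Swift. It does not use tree-sitter, Neon, or STTextView, and every name in it (`Block`, `InlineSpan`, `blockPass`, `inlinePass`, `highlight`) is hypothetical. It handles only a tiny subset of Markdown and is meant to show the shape of the approach, not how any of those libraries actually implement it: the first pass only classifies lines into blocks, and the second pass runs only over blocks that can contain inline markup.

```swift
import Foundation

// Toy model only: none of these types come from tree-sitter, Neon, or STTextView.
enum BlockKind { case heading, fencedCode, paragraph }

struct Block {
    let kind: BlockKind
    let text: String
}

enum InlineKind { case codeSpan, emphasis }

struct InlineSpan: Equatable {
    let kind: InlineKind
    let text: String
}

// Three backticks, built programmatically to keep this listing easy to embed.
let fence = String(repeating: "`", count: 3)

// Pass 1: classify each line into a block. Inline markup is not inspected here.
func blockPass(_ source: String) -> [Block] {
    var blocks: [Block] = []
    var insideFence = false
    for line in source.components(separatedBy: "\n") {
        if line.hasPrefix(fence) {
            insideFence.toggle()   // a fence line opens or closes a code block
            continue
        }
        if insideFence {
            blocks.append(Block(kind: .fencedCode, text: line))
        } else if line.hasPrefix("#") {
            blocks.append(Block(kind: .heading, text: line))
        } else if !line.isEmpty {
            blocks.append(Block(kind: .paragraph, text: line))
        }
    }
    return blocks
}

// Pass 2: scan one block's text for `code` and *emphasis* spans.
func inlinePass(_ text: String) -> [InlineSpan] {
    var spans: [InlineSpan] = []
    var openDelimiter: Character? = nil
    var current = ""
    for character in text {
        if let delimiter = openDelimiter {
            if character == delimiter {
                // Matching delimiter found: close the span and record it.
                let kind: InlineKind = delimiter == "`" ? .codeSpan : .emphasis
                spans.append(InlineSpan(kind: kind, text: current))
                openDelimiter = nil
                current = ""
            } else {
                current.append(character)
            }
        } else if character == "`" || character == "*" {
            openDelimiter = character
        }
    }
    return spans
}

// Run pass 2 only on blocks that may contain inline markup; fenced code is skipped.
func highlight(_ source: String) -> [InlineSpan] {
    blockPass(source)
        .filter { $0.kind != .fencedCode }
        .flatMap { inlinePass($0.text) }
}
```

In this sketch, calling `highlight` on a heading containing a code span, a paragraph containing emphasis, and a fenced block containing backticks yields spans for the first two only, because the fenced block never reaches the inline pass. The second pass only touches text the first pass has already narrowed down, which is one reason a two-pass design need not be as expensive as it sounds.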

Reproduction

I have created an STTextView using the basic defaults, as outlined in the READMEs for this and the main project.

STTextViewUI.TextView(
    text: $text,
    selection: $selection,
    options: [.wrapLines, .highlightSelectedLine],
    plugins: [NeonPlugin(theme: .default, language: .markdown)]
)

My hope was to see strings such as "inline code", "italic text", etc. styled in the typical fashion, but only the block elements mentioned above are receiving styling.

Thanks in advance for any help you can provide.

jeanetienne commented 1 week ago

Keen on this too, but I have no idea how to do it. Would you have any pointers Marcin?

krzyzanowskim commented 1 week ago

I think that is more of a question for tree-sitter/Neon itself. I haven't researched it much myself yet, but I'd also like to learn whether there is something I can do with just the plugin.

jeanetienne commented 1 week ago

Good point. I'll investigate with Neon and/or tree-sitter and report back in this thread if I find something to do in STTextView-Plugin-Neon too 👍

jeanetienne commented 1 week ago

Based on a few hours of investigation and reading code, my understanding is as follows:

Could the injection queries be the mechanism to achieve the second pass? Looking at the git history of the tree-sitter-markdown repo, I have the impression that the comment at the bottom of the README may be outdated, and the code comments at the top of their sample implementation (in bindings/rust/lib.rs) also seem to be out of date with the code.

I also had a look at Simon B. Støvring's Runestone iOS app. It successfully does Markdown block and inline highlighting. I looked through his Runestone and TreeSitterLanguages repos but couldn't find evidence of a "manual" two-pass parse. I then followed his documentation to integrate Runestone (the text view) into a sample iOS app, and added syntax highlighting for Markdown. The result is that it doesn't highlight inline Markdown automatically...

Conclusion

I have the feeling that injection alone is not sufficient, and achieving a good block & inline highlighting is probably down to some sort of "special case", or at least some advanced understanding of how TreeSitter works. Knowledge I don't have.

@krzyzanowskim, does this word salad make sense to you 😅? I'd be happy to help write the implementation but the domain knowledge is quite far above my head, I'd need some (initial) guidance. Alternatively, Simon B. Støvring is probably doing this in the closed source part of his Runestone iOS app, we could ask him for pointers?

I hope this helps 🙂

mattmassicotte commented 1 week ago

Hello! Neon author here.

Tree-sitter's injection system can do this reasonably well. It is supported by the tree-sitter markdown parser, and Neon supports it all. You can see an example of how this works here:

https://github.com/ChimeHQ/Neon/blob/main/Projects/NeonExample/TextViewController.swift
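As a rough mental model of what injection resolution involves, here is a self-contained Swift sketch. It deliberately avoids the real Neon and SwiftTreeSitter APIs: `Injection`, `Highlighter`, and `highlightInjections` are hypothetical names, and the `"markdown_inline"` language name is an assumption about what the block grammar's injection query reports. The idea it illustrates is that the parent layer hands off ranges tagged with a language name, and the host has to resolve that name to something that can highlight the content.

```swift
import Foundation

// Hypothetical model of injection resolution; these are not Neon or SwiftTreeSitter types.
struct Injection {
    let languageName: String   // name reported by the parent layer's injection query
    let content: String        // text the parent layer hands to the injected language
}

// A "highlighter" here is just a function from text to token names.
typealias Highlighter = (String) -> [String]

// Resolve each injection through a provider closure; unresolved names yield no tokens.
func highlightInjections(
    _ injections: [Injection],
    languageProvider: (String) -> Highlighter?
) -> [String] {
    injections.flatMap { injection -> [String] in
        guard let highlighter = languageProvider(injection.languageName) else {
            return []   // unknown language: this range stays unstyled
        }
        return highlighter(injection.content)
    }
}

// Stand-in for an inline grammar: reports a token when it sees a backtick or an asterisk.
let inlineHighlighter: Highlighter = { text in
    var tokens: [String] = []
    if text.contains("`") { tokens.append("code_span") }
    if text.contains("*") { tokens.append("emphasis") }
    return tokens
}

// Assumed output of a block-level parse: one inline range tagged with its language name.
let injections = [Injection(languageName: "markdown_inline", content: "some `code` here")]

// A provider that knows the injected language produces inline tokens.
let resolved = highlightInjections(injections) { name in
    name == "markdown_inline" ? inlineHighlighter : nil
}

// A provider that knows no languages leaves the inline content untouched.
let unresolved = highlightInjections(injections) { _ in nil }
```

In this toy model, a host that cannot resolve the injected language ends up with block-level styling only. Whether that is what happens inside the plugin has not been established in this thread, so treat it as one possibility to check rather than a diagnosis.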

jeanetienne commented 6 days ago

Sweet, thank you @mattmassicotte for chiming in!

I played a bit with the sample project you linked and indeed managed to get a sample MD string working 👍 (I couldn't properly highlight some "emphasis" and "strong emphasis", but I don't know whether that's because of my quick & hacky hardcoded token styling or because of the grammar itself. Anyway, that's secondary to the main effort in this thread.)

I barely understand the code in the plugin, let alone in Neon or TreeSitter... So I don't feel qualified to propose a PR just yet 😅

@krzyzanowskim let me know what approach you'd recommend in terms of architecture in the plugin.

mattmassicotte commented 6 days ago

It is very common for the defined highlights that are included with tree-sitter parsers to be problematic, and this is definitely the case for Markdown. This could be why highlights don't work well. I'd be happy to look more closely with you though!

jeanetienne commented 3 days ago

> I'd be happy to look more closely with you though!

Sure! How do you want to go about it?

mattmassicotte commented 3 days ago

@jeanetienne Here I've opened an issue: https://github.com/ChimeHQ/Neon/issues/51