vim-pandoc / vim-pandoc-syntax

pandoc markdown syntax, to be installed alongside vim-pandoc
MIT License

Add syntax highlight variant based on CommonMark using external parser #328

Open alerque opened 4 years ago

alerque commented 4 years ago

See primary discussion at #327.

See CommonMark Spec.

This will be a long running effort. Please feel free to pitch in with ideas or pull requests against this branch.

alerque commented 4 years ago

Nesting needs to work like this:

Container blocks → Leaf blocks → Inlines

In that order.
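A minimal sketch of that ordering as a containment rule, assuming the three levels map onto the CommonMark spec's categories. The names `Level` and `may_contain` are hypothetical and are not part of the branch:

```python
from enum import IntEnum


class Level(IntEnum):
    """Nesting levels, outermost first (hypothetical names)."""
    CONTAINER = 0  # e.g. block quote, list item
    LEAF = 1       # e.g. paragraph, heading, fenced code block
    INLINE = 2     # e.g. emphasis, link, code span


def may_contain(parent: Level, child: Level) -> bool:
    """Return True if a node at `parent` level may directly hold `child`."""
    if parent is Level.CONTAINER:
        # Containers hold other containers or leaf blocks, not bare inlines.
        return child in (Level.CONTAINER, Level.LEAF)
    # Leaf blocks hold inlines, and inlines may nest inside other inlines.
    return child is Level.INLINE
```

A highlighter that walks parser events could use a check like this to reject or flag events that arrive out of order.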

alerque commented 4 years ago

I'm not sure where CommonMark leaves Pandoc's div syntax. An extension with an extra kind of Container block?

fmoralesc commented 4 years ago

Fenced divs? Those should be a container block.

alerque commented 4 years ago

> Fenced divs? Those should be a container block.

My point was the CM spec at this point has three types of container blocks, and fenced divs are not one of them. I believe there is an allowance for them somewhere (I remember discussing this when Pandoc was considering adding the syntax) but I can't remember how it handles this. Extensions I think.

alerque commented 4 years ago

I mistakenly branched this work off of something 52 commits behind master. I just rebased and tweaked how it is launched to make it easier to experiment with different approaches.

Set `let g:pandoc#syntax#flavor#commonmark = 1` to enable my (empty) alternate. The default is to keep using the `legacy` flavor, which should be pretty much a passthrough to what master does now.

alerque commented 4 years ago

@fmoralesc I know for the sake of interfacing with vim-pandoc you want to get this working in Pandoc ... and that's understandable. I have my own reasons for wanting to get this working in Lua. First, I need the general highlighting method to work from Lua for another project that this is just a proof of concept for. I also want to be able to wire SILE internals directly into my vim buffer. Hence I'm pushing forward with being able to do both.

I rewired the Rust code so that it can generate both Python and Lua native modules from the same basic code. I actually got it working without the extra wrappers, but the trait handling was messy because pyo3 mucks around with the internals of functions and doesn't like aliased types being passed in.

I got back to basically where I was with Lua, but this time with Rust code using native types and only converting to Lua structures at the last minute. I have not gotten the Python as far along as your version yet, but I'll keep working on it some. I wasn't sure where to put the Python half of the equation relative to the rest of the plugin.

fmoralesc commented 4 years ago

First of all, great job!

Ultimately, for vim-pandoc I don't think the language we write this thing in matters, and I don't think python is a requirement. I just wrote my python version because I wanted to have something to toy around with at the stage where we interface with vim (how to handle multiline blocks and so on, get an idea of the performance...). As you can tell, I'm way more confident in python than in anything else ;)

fmoralesc commented 4 years ago

I left some comments in gitter earlier today, I'm copying them here for reference:

Last night I implemented an algorithm to get around the multiline limitations of nvim's add_highlight mechanism. I think we need to ditch the legacy groups: because of the differences in the "parsing" strategies, there won't be a 1:1 mapping anyway. I also found that capturing the Start events is not enough; we also need to capture the offsets of Code, Html, FootnoteReference, Rule, and maybe HardBreak.
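The algorithm itself is not shown in this thread. As a rough illustration only, one common way around a per-line highlight API is to split each multiline range into one span per line. The sketch below assumes 0-based (line, column) positions and uses `-1` as "end of line", which is how `nvim_buf_add_highlight` treats its `col_end` argument. The function name `split_multiline` is hypothetical:

```python
from typing import List, Tuple

# (line, col_start, col_end); col_end == -1 means "to the end of the line"
Span = Tuple[int, int, int]


def split_multiline(start: Tuple[int, int], end: Tuple[int, int]) -> List[Span]:
    """Split a (line, col) range into one highlight span per line."""
    (first_line, first_col), (last_line, last_col) = start, end
    if first_line == last_line:
        # Range fits on a single line: nothing to split.
        return [(first_line, first_col, last_col)]
    # First line: from the start column to the end of the line.
    spans: List[Span] = [(first_line, first_col, -1)]
    # Middle lines: highlight each line in full.
    spans.extend((line, 0, -1) for line in range(first_line + 1, last_line))
    # Last line: from column 0 up to the end column.
    spans.append((last_line, 0, last_col))
    return spans
```

Each resulting tuple could then be passed as the `line`, `col_start`, and `col_end` arguments of one highlight call.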

A weakness of the mechanisms we are using now is that they don't seem to be introspectable: once we push highlighting to the buffer, we cannot retrieve much information about it back (this is something we need for context-aware autoformatting in vim-pandoc, for example). I just discovered that we can retrieve the position of the boundaries of the highlight elements with `nvim_buf_get_extmarks(0, ns_id, 0, -1, {})`. This emits output like `[[85, 0, 0], [86, 1, 0], [87, 2, 0], [88, 3, 5], [89, 3, 15], [90, 5, 6], [91, 6, 0], [94, 6, 53], [92, 7, 0], [93, 8, 0], [95, 10, 0], [96, 11, 0]]`, where each entry is an extmark id, row, and column. But I can't find a way to retrieve the hlgroups.
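To make the shape of that output concrete, here is a small sketch that indexes the `[id, row, col]` triples by row. The second helper shows one possible workaround for the missing hlgroups: joining the marks against a side table kept by the plugin. That `registry` dict is purely an assumption for illustration, not something either implementation in this thread does:

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def marks_by_row(extmarks: List[List[int]]) -> Dict[int, List[Tuple[int, int]]]:
    """Group [id, row, col] triples into {row: [(col, id), ...]}, sorted by column."""
    rows: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for mark_id, row, col in extmarks:
        rows[row].append((col, mark_id))
    return {row: sorted(cols) for row, cols in rows.items()}


def attach_groups(
    extmarks: List[List[int]], registry: Dict[int, str]
) -> List[Tuple[int, int, Optional[str]]]:
    """Pair each mark with a group name from a hypothetical id -> hlgroup table.

    `registry` would have to be filled in by our own code when it places
    each highlight; marks missing from it are reported with None.
    """
    return [(row, col, registry.get(mark_id)) for mark_id, row, col in extmarks]
```

For example, `marks_by_row([[88, 3, 5], [89, 3, 15], [90, 5, 6]])` groups the two marks on row 3 together and keeps the mark on row 5 separate.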

alerque commented 4 years ago

I'm actually learning a lot from seeing your Python work, I don't think it's a waste to mess around with what vim/nvim allow different language interfaces to accomplish.

By the way, I stuck a Makefile in my branch to make it easy to build both modules while hacking on this. Eventually we'll have to refactor that into viml with a bunch of detection code to make sure we handle different vim versions correctly, but it's enough for hacking in Arch with Neovim for now. Just run `make all` to get set up.

fmoralesc commented 4 years ago

> I'm actually learning a lot from seeing your Python work, I don't think it's a waste to mess around with what vim/nvim allow different language interfaces to accomplish.

Same here for me with your rust and lua work. Earlier you said you didn't know what to do with my python code. Well, just steal the ideas and incorporate them in the lua code! :wink:

I think that as far as the highlighting system goes, it makes sense to have only one language in the codebase interacting with vim (at some stage vim-pandoc used viml, python AND RUBY... those were the days... and we had python in the syntax file... :facepalm: I guess we have come full circle EDIT: and let's not forget that if commonmark-hs finally pans out, we might want to have the syntax interact with a haskell library).