vim-pandoc / vim-pandoc-syntax

pandoc markdown syntax, to be installed alongside vim-pandoc
MIT License

codeblock syntax highlighting delegation #14

Closed blaenk closed 10 years ago

blaenk commented 10 years ago

It'd be very nice if we could rig something up so that codeblocks get highlighted with the appropriate syntax file. vim-markdown has something like this. I think it works similarly to how you currently delegate latex highlighting to latex, and yaml to yaml. It'd be nice to have this, at the very least as an opt-in.

We already have pandocDelimitedCodeBlockLanguage which contains the codeblock metadata. There are mainly two formats:

``` python
~~~ {.python}

I think it should be very easy to get the language out of the first one, since it's just one word. The second case can be handled by parsing the first class listed. In the worst case we can have something like what vim-markdown implemented, which is essentially a list of allowed languages and then we can see if any one of those is present in pandocDelimitedCodeBlockLanguage and then use that.

This would of course be as straightforward as possible; if it can't find the language then it'll continue doing regular general highlighting as is currently done.
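
A minimal sketch of the language extraction described above, assuming a hypothetical helper named `PandocFenceLanguage` (not part of vim-pandoc-syntax). It takes the first class listed in the `{...}` form and the single word in the bare form, and returns an empty string otherwise so the caller can fall back to the generic highlighting:

```vim
" Hypothetical helper: return the language named on the opening line of a
" fenced code block, or '' when none can be found.
function! PandocFenceLanguage(line) abort
  " Attribute form, e.g.  ~~~ {.python .numberLines}
  " Takes the first '.class' found inside the braces. This is naive: a dot
  " inside a quoted attribute value (caption="v1.2") would also match.
  let l:attr = matchlist(a:line,
        \ '^\s*\%(`\{3,}\|\~\{3,}\)\s*{[^}]\{-}\.\([A-Za-z0-9_+-]\+\)')
  if !empty(l:attr)
    return l:attr[1]
  endif

  " Bare form, e.g.  ``` python
  let l:bare = matchlist(a:line,
        \ '^\s*\%(`\{3,}\|\~\{3,}\)\s*\([A-Za-z0-9_+-]\+\)\s*$')
  return empty(l:bare) ? '' : l:bare[1]
endfunction

" Usage (comments sit on their own lines because :echo cannot be followed
" by a trailing comment):
echo PandocFenceLanguage('``` python')
" -> python
echo PandocFenceLanguage('~~~ {.python}')
" -> python
```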

blaenk commented 10 years ago

I worked a bit on this and managed to get it working, take a look at my codeblock-highlight branch, specifically this commit. Notice I purposefully hardcoded scala in for now, although eventually we'd probably do something like vim-markdown (iterate through whitelist of languages specified by user), I wanted to get one language working first expecting most others would fall in line if my patterns were good enough.
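
For illustration, the whitelist loop mentioned above might look roughly like this. It is only a sketch: the option name `g:pandoc_codeblock_langs` and the `pandocEmbed*` group names are made up for the example, and it covers only the bare backtick form, without concealment or the tilde/attribute variants.

```vim
" Hypothetical user option; falls back to two languages for the example.
let s:langs = get(g:, 'pandoc_codeblock_langs', ['scala', 'haskell'])

" Many syntax files bail out early when b:current_syntax is set, so it is
" cleared before each include and restored afterwards.
let s:saved_syntax = get(b:, 'current_syntax', '')

for s:lang in s:langs
  unlet! b:current_syntax
  " Load the language's syntax items into a cluster of their own.
  execute 'syntax include @pandocEmbed_' . s:lang . ' syntax/' . s:lang . '.vim'
  " Highlight everything between the fences with that cluster; matchgroup
  " keeps the delimiter lines themselves out of the embedded language.
  execute 'syntax region pandocEmbedCode_' . s:lang
        \ . ' matchgroup=Delimiter'
        \ . ' start=/^\s*`\{3,}\s*' . s:lang . '\s*$/'
        \ . ' end=/^\s*`\{3,}\s*$/'
        \ . ' keepend contains=@pandocEmbed_' . s:lang
endfor

if s:saved_syntax !=# ''
  let b:current_syntax = s:saved_syntax
endif
```

Because the language name is part of each `start=` pattern, blocks in different languages get different regions, and unlisted languages simply never match.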

You can test it with this document if you like, use vim-scala for highlight support.

Alternatively you could just change s:lang_name and s:lang_syntax to whatever you like. I've also tested it on Haskell with this document, as well as C++ with this one.

The tricky bit was that unlike vim-markdown that simply treats everything inside the delimiters as code, we instead have to match for the beginning delimiter to conceal it with the lambda, parse the language, then match everything after the language and before the last delimiter as the actual code (e.g. contains=@whatever), that's why it might look a bit messy.

There are three problems I've encountered so far.

  1. testing cmake with this document exhibits a bug where the cmakeString group engulfs the end delimiter.
  2. also in that same document, the cmake highlighting seems to be off. It seems there are no problems with single-worded keywords, but those that have underscores such as include_directories only get the include part highlighted, and _directories gets highlighted as pandocDelimitedCodeBlockCode_cmake, and then everything after that is fine again. Maybe it's a problem with how I implemented this? There's a syn match group called pandocDelimitedCodeBlockCode_cmake that matches everything after the:

    ``` cmake

    and before the final:

    ```

    or at least that was my intention. This match group then has a `contains=@blah` directive because I wanted its contents to be highlighted by `syntax/cmake.vim`, but perhaps I misunderstood the docs.
  3. It seems the solution in #18 is no longer sufficient. When I zR and then slowly CTRL+F through the file, everything is fine and then eventually when I reach a section of the document (such as "pattern matching" in the scala document) it'll trigger the bug. It can also be triggered the same as before, zRG. Interestingly though, out of the documents I linked in this post, I can only see it happening in the scala document.

Been wanting this feature for a while, and it seems to work very well in the majority of the cases, so this is pretty exciting :) Hope you can share some insight.

blaenk commented 10 years ago

Here are some screenshots:

scala

haskell

blaenk commented 10 years ago

Using syntax sync fromstart instead does fix the third issue, but I think you said it'd be too slow or something. I didn't notice any difference, but I didn't use it for an extended period of time I guess.

fmoralesc commented 10 years ago

syntax sync fromstart would be slow for long documents, since it has to pass through the whole file. I think we should go back to that once we settle things down, for now it seems like premature optimization.

Anyway... all this looks pretty nice!

We could enable this for the top 20 programming languages according to TIOBE, and a few others (scala, haskell, maybe go), excluding some others (SQL variants, Visual Basic, Pascal).

blaenk commented 10 years ago

Yeah it's definitely very nice. Some default enabled would be nice I suppose, but I think a whitelist will be needed as well for those who want to enable languages that aren't covered there.

I'm still trying to figure out the CMake bug, which is more important like you said (I agree the highlighting/sync bug is a premature optimization compared to this).

Problem 1 only happens when there's a string in the codeblock. I think it may be related to 2. I think it may have to do with the pattern I'm using for pandocDelimitedCodeBlockCode_cmake, specifically the \_.\{-} part, but I'm not sure.

blaenk commented 10 years ago

Also, a minor design suggestion: notice in the screenshots the language/metadata after the start delimiter is not italicized or "highlighted." I personally really prefer it this way. I know it depends on color scheme, but on mine (the popular 'solarized' light version) it looks very weird since it shows up as gray and italicized, pretty hard to see. I would really love it if we got rid of that style, so that it looks like in the previous screenshots, it looks very strong and declarative that way I think, in other words, give it the "unformatted" look (which I think is implicit when you don't explicitly style it). Specifically, get rid of:

hi link pandocDelimitedCodeBlockLanguage Comment

I know this is a design/subjective suggestion though so it's up to you. When it's italic/styled, however, it annoys me since everything else looks very nice. I know I can hack this into my color scheme, but I wanted to suggest it in case you didn't mind how my suggestion looks, because I think it'd make a better default.

fmoralesc commented 10 years ago

I don't have a strong opinion either way. I linked pandocDelimitedCodeBlockLanguage because it sort of made sense semantically (the language is metadata annexed to the fenced block, so Comment fitted well). With the colorscheme I'm currently using (a personal variant of molokai), normal text has way too much contrast (basically white on black), so I don't like it much. Maybe link to Identifier instead?

Now, to start a flame war (just kidding, though): solarized is rubbish :p

blaenk commented 10 years ago

Haha yeah I didn't really like solarized at first, I had my own custom theme forked off of tomorrow-night. A while back though, out of nowhere, I took to really liking the light version of the theme (I think the dark version is horrible). As to why you linked it with Comment, that's what I figured. However, I don't think being strong/high contrast is a bad thing for this particular thing. We're already concealing the two ends of the codeblock with only a lambda and a word (usually) to denote that it's a codeblock, add to that this new syntax highlighting feature, and I just figured it'd be nice to have something strong/assertive as to what follows from it.

I'll play around with some other styles and see what you think, but that's less of a priority.

I could use some help with the cmake bug though, when you have time. It's the only problem I've been able to find from the languages I've tested so far. It only happens when there's a string in the codeblock. Also the fact that some of the codeblocks don't get properly highlighted. I don't know if it's the cmake syntax files' fault (somehow doesn't lend itself to embedding, after all I've tested with scala/haskell/cpp codeblocks containing strings, no problems), but I think it may be more a problem with the patterns I chose. I've never done this kind of thing before so I was hoping you could provide some insight, when you get a chance.

fmoralesc commented 10 years ago

I'll see what I can do. BTW, what timezone are you in? It seems you can stay later than I can ;) I'm in GMT-4.

On the Cmake thing, only thing I could think of is cmake's syntax file finishes if b:current_syntax is already set, but you already reset it. Maybe the conditional in line 57 of syntax/cmake.vim makes it fail when embedded?

blaenk commented 10 years ago

I'm in Southern California, GMT-8. Alright I'll look into that.

blaenk commented 10 years ago

I made the top conditional (if current_syntax exists) be elseif 0 and I made the one at line 57 be if 1 and it didn't seem to have any effect.

Actually, just looked at syntax/haskell.vim and noticed that these exact conditionals exist there too (and haskell works fine). The only main difference I notice at the top is:

let s:keepcpo= &cpo
set cpo&vim

I looked up what cpo is but couldn't make sense of it. The cpp.vim file doesn't have this cpo line either.

fmoralesc commented 10 years ago

What happens when you comment out those lines?

blaenk commented 10 years ago

No effect actually, weird. I'm starting to think it may be the patterns I made, but it's weird since the others don't have this issue.

blaenk commented 10 years ago

If you haven't had a chance to try it, this is what the bug looks like:

quote

This is what it looks like when it doesn't highlight correctly:

one

compared to an actual cmake file:

two

blaenk commented 10 years ago

Alright well, it seems like the string thing is a bug in the cmake plugin. It's defined like this:

syn region cmakeString start=/"/ end=/"/ 
            \ contains=CONTAINED,cmakeTodo,cmakeOperators

I got rid of the CONTAINED argument and it solved that problem. Still have the problem with inconsistent highlighting though in the last image.

blaenk commented 10 years ago

Disregard, that was not the problem and it's not a bug. That's there to allow variable interpolation, "${VAR}". But it does seem related to the issue.

fmoralesc commented 10 years ago

I just tried it, but I attempted to simplify your rules to:

unlet b:current_syntax
exe "syn include @pandocDelimitedCodeBlock_" . s:lang_syntax . " syntax/" . s:lang_syntax . ".vim"
exe "syn region pandocDelimitedCodeBlock_" . s:lang_syntax . ' start=/^\z(\(\s\{4,}\)\=`\{3,}`*\)\s*' . s:lang_name . '/ end=/\z1`*/ skipnl contains=pandocDelimitedCodeBlockStart_' . s:lang_syntax . ',@pandocDelimitedCodeBlock_' . s:lang_syntax . ' keepend'
exe "syn region pandocDelimitedCodeBlock_" . s:lang_syntax . ' start=/^\z(\(\s\{4,}\)\=\~\{3,}\~*\)\s*{\(.\+\s\)\?\.' . s:lang_name . '\(.\+\)\?}/ end=/\z1\~*/ skipnl contains=pandocDelimitedCodeBlockStart_' . s:lang_syntax . ' keepend'
exe "syn match pandocDelimitedCodeBlockStart_" . s:lang_syntax . ' /\(\_^\n\_^\(\s\{4,}\)\=\)\@<=\(\~\{3,}\~*\|`\{3,}`*\)/ contained nextgroup=pandocDelimitedCodeBlockLanguage_' . s:lang_syntax . ' conceal cchar=λ'
exe "syn match pandocDelimitedCodeBlockLanguage_" . s:lang_syntax . ' /\(\s\?\)\@<=.\+\_$\n/ contained nextgroup=pandocDelimitedCodeBlockCode_' . s:lang_syntax
exe "syn match pandocDelimitedCodeBlockEnd_" . s:lang_syntax . ' /\(`\{3,}`*\|\~\{3,}\~*\)\(\_$\n\_$\)\@=/ contained containedin=pandocDelimitedCodeBlock_' . s:lang_syntax . ' conceal'

(basically, simply use the included syntax in the whole of pandocDelimitedCodeBlock__). The result is the same as in your screenshots.

I thought the generic rules could be interfering with the special ones, but they don't seem to.

I further simplified your code to

 unlet b:current_syntax
 exe "syn include @pandocDelimitedCodeBlock_" . s:lang_syntax . " syntax/" . s:lang_syntax . ".vim"
 exe "syn match pandocDelimitedCodeBlockCode_" . s:lang_syntax. ' /\(\_.\{-}\)\(\(`\{3,}`*\|\~\{3,}\~*\)\_$\n\_$\)\@=/ contained containedin=pandocDelimitedCodeBlock contains=@pandocDelimitedCodeBlock_' . s:lang_syntax

However, it seems that this rule actually overrides the generic ones (the cpp delimited code blocks in the cmake.markdown file get highlighted as pandocDelimitedCodeBlockCode_cmake.) Maybe the problem is in this rule.

blaenk commented 10 years ago

Regarding inconsistent highlighting, I think it may be the patterns I made. For example, given:

``` cmake
target_link_libraries (sometarget ${PACKAGENAME_LIBRARIES})
```

The `target_link_libraries` is not highlighted even though it's just a simple `syn keyword cmakeStatement` in the syntax file.

blaenk commented 10 years ago

The original pandocDelimitedCodeBlock_ rule included the language name as part of the pattern, this is what will allow us to have different language codeblocks once we implement that. I think that's why it's also highlighting the cpp blocks because it can't differentiate between the two.

Also notice in your first simplification, you reference pandocDelimitedCodeBlockCode_ from pandocDelimitedCodeBlockLanguage_ but it's not there, unless you forgot to paste it in as well.

By the way, you're doing these simplifications mainly to make it easier to find the problem? I had those different rules originally to be able to accommodate the different concealments.

fmoralesc commented 10 years ago

Not really. I thought it might be cleaner to reuse the generic delimited codeblock highlighting, and add special highlighting only for the region contained within the delimiters. What do you think of this approach? The simplifications I made were just proofs of concept. I am also a bit tired :p (at least, that's my excuse for leaving that reference to pandocDelimitedCodeBlockCode_, when in fact we no longer need it in that case)

fmoralesc commented 10 years ago

For reference (I'm off for the day, it's 2am already here): I was thinking of turning pandocDelimitedCodeBlockCode into a syn region rule that matched the content between the delimiters if a language was specified. Something like:

syn region pandocDelimitedCodeBlockCode_cmake start=/```\s*.*cmake.*\n/ end=/.\(```\)\@=/ contains=@CMAKE

(of course those regex won't work as is, this is just the gist of it).

blaenk commented 10 years ago

I'm not sure, I'll explain the rationale behind the different rules so that maybe you can better make that decision.

exe "syn region pandocDelimitedCodeBlock_" . s:lang_syntax . ' start=/^\z(\(\s\{4,}\)\=`\{3,}`*\)\s*' . s:lang_name . '/ end=/\z1`*/ skipnl contains=pandocDelimitedCodeBlockStart_' . s:lang_syntax . ' keepend'

This one is here for the start of the codeblock, notice it has the s:lang_name interpolated, that way we can have multiple codeblocks in the file, each different languages, all highlighted based on their language's syntax file. It uses \z cause that's what we do in the delimited codeblocks too (cause of strikeouts like you said in the comment).

exe "syn region pandocDelimitedCodeBlock_" . s:lang_syntax . ' start=/^\z(\(\s\{4,}\)\=\~\{3,}\~*\)\s*{\(.\+\s\)\?\.' . s:lang_name . '\(.\+\)\?}/ end=/\z1\~*/ skipnl contains=pandocDelimitedCodeBlockStart_' . s:lang_syntax . ' keepend'

This is the same exact thing but for tilde codeblocks (same thing happens here by the way).

exe "syn match pandocDelimitedCodeBlockStart_" . s:lang_syntax . ' /\(\_^\n\_^\(\s\{4,}\)\=\)\@<=\(\~\{3,}\~*\|`\{3,}`*\)/ contained nextgroup=pandocDelimitedCodeBlockLanguage_' . s:lang_syntax . ' conceal cchar=λ'

Needed a separate group only for the three backticks so that we can conceal those with the lambda.

exe "syn match pandocDelimitedCodeBlockLanguage_" . s:lang_syntax . ' /\(\s\?\)\@<=.\+\_$\n/ contained nextgroup=pandocDelimitedCodeBlockCode_' . s:lang_syntax

Needed a separate group for the part that follows the three backticks/tildes (the language/metadata) so that you can style it (like how you currently style it with Comment). This one took me a while to get right because otherwise this part of the codeblock would be highlighted by that language as well. This way we 'consume' this (and can style it) and so the code for the language highlighting itself starts on the first codeblock line, at least that's the intention.

exe "syn match pandocDelimitedCodeBlockCode_" . s:lang_syntax. ' /\(\_.\{-}\)\(\(`\{3,}`*\|\~\{3,}\~*\)\_$\n\_$\)\@=/ contained contains=@pandocDelimitedCodeBlock_' . s:lang_syntax

This is the tricky one that took me the longest. It tries to match every line within the codeblock (between the three backticks/tildes on top and bottom). Then only this content is given the contains=@language.

exe "syn match pandocDelimitedCodeBlockEnd_" . s:lang_syntax . ' /\(`\{3,}`*\|\~\{3,}\~*\)\(\_$\n\_$\)\@=/ contained containedin=pandocDelimitedCodeBlock_' . s:lang_syntax . ' conceal'

This is again needed so that we can specifically conceal the last backticks/tildes at the end of the codeblock, as we do in the general codeblock concealment.

Hope that sheds some light. I think the problem may have to do with pandocDelimitedCodeBlockCode_ because in the inconsistently highlighted examples, the unhighlighted parts do show up as pandocDelimitedCodeBlockCode_ group, when I would've figured the entire thing would be as if part of cmake (cause of the contains=@language).

Yeah I thought of making it a region actually, because I thought the problem lay in the mixture of syn match and contains=@lang. I'll try that.

blaenk commented 10 years ago

Alright I made a region out of pandocDelimitedCodeBlockCode_, check the branch. Now the delimiters are consistently concealed at least, but I'm still experiencing the same issues.

The region simply uses as 'start' the ^ directly after the opening delimiter, and the 'end' is the $ right before the closing delimiter.

By the way, I noticed that the syntax sync bug went away. I haven't tested it a lot, but I tested it a couple times on what previously consistently triggered the bug, and everything is perfectly smooth. I figure the region is more robust than the previous makeshift match.

I'm out of ideas for the cmake bug.

blaenk commented 10 years ago

By the way, I hope it's clear that it's not that I'm dying for cmake support, it's just that I want(ed) to be sure that this is the cmake syntax file's fault and not vim-pandoc-syntax, because if it was the latter, it might be indicative of other issues with other languages. The confusing thing of course is that every other language (that I've tested so far) works perfectly fine. Just tested with Ruby and it's perfectly fine.

fmoralesc commented 10 years ago

I just tried the approach I told you about last night (it works), and I found that somehow cmake's rules are not applied in the region. I think this is somehow the cmake syntax file's fault, but I'm not sure why. None of the cmakeStatement keywords containing underscores get highlighted.

I checked it using this rule:

 unlet b:current_syntax
 syn include @CMAKE syntax/cmake.vim
 syn region pandocDelimitedCodeBlock_cmake start=/\(^\z(\(\s\{4,}\)\=`\{3,}`*\).*cmake.*\n\)\@<=./ end=/\n\(\(`\{3,}`*\|\~\{3,}\~*\)\(\_$\n\_$\)\@=\)\@=/ skipnl contained containedin=pandocDelimitedCodeBlock contains=@CMAKE
 hi link pandocDelimitedCodeBlock_cmake Error

Basically, it highlights the whole block as Error, and then when cmake's highlighting gets applied one can see what parts of it are not highlighted:

Screenshot from 2013-11-14 11 36 15

Also, this rule exemplifies the approach I told you about last night. Instead of recreating the pandocDelimitedCodeBlocks rules for every language, we simply highlight the region contained inside the delimiters where the language id is specified. We can check that in the start= clause. I haven't tested this regex with many languages, but it is more or less the same thing you came up with.

fmoralesc commented 10 years ago

Was just trying out stuff, and I found what the problem is. It turns out that, since cmakeStatements are handled as keywords, the value of the local setting iskeyword has to include _, otherwise keywords with underscores don't get highlighted. Normally, this wouldn't be an issue because including _ is the default (neither vim-pandoc-syntax nor vim-pantondoc changes the value, since we don't use keywords at all). However, the LaTeX syntax (which we include) does change it, which I was unaware of. This was the root of this issue.
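
The effect can be reproduced outside of any syntax file. This is a small sketch, assuming nothing beyond stock Vim: `\k` in a pattern follows the buffer's `iskeyword`, just as `syn keyword` items do. The second value is the one that appears later in this thread for `syntax/tex.vim`.

```vim
" Save the buffer-local value so the experiment leaves no trace.
let s:saved_isk = &l:iskeyword

" A value that includes '_' (Vim's usual default on most systems).
let &l:iskeyword = '@,48-57,_,192-255'
echo matchstr('include_directories', '\k\+')
" -> include_directories

" A value without '_': the keyword now stops at the underscore, which is
" why only the 'include' part was highlighted.
let &l:iskeyword = '48-57,a-z,A-Z,192-255'
echo matchstr('include_directories', '\k\+')
" -> include

let &l:iskeyword = s:saved_isk
```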

I'm not sure about what we can do about this. However, adding _ to iskeyword and linking the highlighting of the regions within the codeblock to pandocDelimitedCodeBlock has good results in my view:

Screenshot from 2013-11-14 12 03 50

fmoralesc commented 10 years ago

To be clear, I'm not sure if adding _ to iskeyword would have side-effects with LaTeX regions, so I'm kind of hesitant (not very, but still). Again, it would be helpful if one could apply this setting to only a part of the file.

Also: this might happen with other syntaxes. By going through the syntax files provided by vim with

grep -lr "set\(local\)\{,1\} isk" /usr/share/vim/vim74/syntax/ | wc -l

there's 78 syntaxes that somehow modify iskeyword.

fmoralesc commented 10 years ago

I just checked which syntaxes might have this issue. I created a file with the vanilla vim syntaxes that are supported by pandoc's highlighter (it's here), and got this:

> ~ grep -r "setlocal isk" /usr/share/vim/vim74/syntax/ | grep -f pandoc-langs
/usr/share/vim/vim74/syntax/nasm.vim:  setlocal iskeyword=@,48-57,#,$,.,?,@-@,_,~
/usr/share/vim/vim74/syntax/erlang.vim:  setlocal iskeyword+=$,@-@
/usr/share/vim/vim74/syntax/lisp.vim: setlocal iskeyword=38,42,43,45,47-58,60-62,64-90,97-122,_
/usr/share/vim/vim74/syntax/postscr.vim:  setlocal iskeyword=33-127,^(,^),^<,^>,^[,^],^{,^},^/,^%
/usr/share/vim/vim74/syntax/r.vim:setlocal iskeyword=@,48-57,_,.
/usr/share/vim/vim74/syntax/verilog.vim:   setlocal iskeyword=@,48-57,63,_,192-255
/usr/share/vim/vim74/syntax/scheme.vim:  setlocal iskeyword=33,35-39,42-58,60-90,94,95,97-122,126,_
/usr/share/vim/vim74/syntax/tex.vim: exe "setlocal isk=".g:tex_isk
/usr/share/vim/vim74/syntax/tex.vim: setlocal isk=48-57,a-z,A-Z,192-255
/usr/share/vim/vim74/syntax/tex.vim:  setlocal isk+=@-@
/usr/share/vim/vim74/syntax/lhaskell.vim:   setlocal isk+=_

We could have an iskeyword value that was the union of all these, or maybe of the most common. I worry most about tex, lisp and scheme's values.

blaenk commented 10 years ago

Did you push this somewhere? I'd like to try it. And I think I understand what you meant now about the simpler pattern.

I have a pretty math heavy post that I can test it on (so can you if you don't have a similar test document), to see how the modification might affect LaTeX.

fmoralesc commented 10 years ago

No, I haven't, I just wrote this proof of concept this morning and waited for your feedback on it. If you give me some time, I can flesh it out more and push it to some branch.

blaenk commented 10 years ago

No rush, was just making sure it wasn't just me who couldn't find the branch I thought you pushed it to.

My understanding is that iskeyword defines which characters count as part of a keyword, so removing the underscore prevents cmake's syntax from matching keywords that contain underscores, correct? And since the option applies to the whole buffer, it causes conflicts with embedded syntax highlighting.

Yeah I guess it'd be a matter of testing out the LaTeX and seeing if it messes up.

fmoralesc commented 10 years ago

Yes, you got it right.

Another side effect of iskeyword (I noticed it when checking the docs while building the quote conceals): it is used to determine word boundaries.

blaenk commented 10 years ago

So only certain languages modify iskeyword. Since we know this list of languages, would it be bad to hardcode the list into the syntax file and only "fix" iskeyword if one of these languages is enabled? That way we don't mess with it most of the time, in case unforeseen issues arise.

That said, it really is a matter of testing it ourselves though, maybe there isn't much of an issue. When you get a chance to push the changes somewhere I'll test them out in my documents, which while not an exhaustive test, would at least give us an idea of any glaring issues.

Conversely, if there are major issues and we can't produce a single fix that works for many of the languages in conflict, then we could use such a hardcoded list as a blacklist to skip syntax-highlighting for them.
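
A rough sketch of the blacklist variant, assuming the hardcoded list is simply the set of syntaxes from the grep output earlier in this thread and that the user's languages come from a hypothetical `g:pandoc_codeblock_langs` option:

```vim
" Syntaxes from the grep output above that set 'iskeyword' themselves.
let s:isk_syntaxes = ['nasm', 'erlang', 'lisp', 'postscr', 'r',
      \ 'verilog', 'scheme', 'tex', 'lhaskell']

" Hypothetical user option listing the languages to embed.
let s:requested = get(g:, 'pandoc_codeblock_langs', ['haskell', 'scheme', 'cmake'])

" Split the request into languages that are safe to embed and languages
" that are skipped (their blocks keep the generic one-colour highlighting).
let s:embed = []
let s:skipped = []
for s:lang in s:requested
  if index(s:isk_syntaxes, s:lang) >= 0
    call add(s:skipped, s:lang)
  else
    call add(s:embed, s:lang)
  endif
endfor
```

With the fallback list, `s:embed` ends up as `['haskell', 'cmake']` and `s:skipped` as `['scheme']`. The same list could instead gate the iskeyword "fix", applying it only when one of these languages is actually enabled.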

fmoralesc commented 10 years ago

I just pushed a more or less fleshed out implementation of this. Please check the codeblock-embeds branch.

blaenk commented 10 years ago

It works on cmake, except for the original first bug which causes the ending marker to be considered part of the string. Check with the cmake test file for the first code embed:

``` cmake
option (USE_FFTW "use the fftw library" OFF)
```

fmoralesc commented 10 years ago

Yes, I also noticed cmake strings mess things up. I think that's an issue with cmake's syntax.

fmoralesc commented 10 years ago

Actually, cmakeString's rule is VERY naive (worse than our worst):

syn region cmakeString start=/"/ end=/"/

blaenk commented 10 years ago

This was one thing I did manage to fix in my version (though I agree I prefer yours since it's shorter), it's in my codeblock-highlight branch. I'll take a look at my patterns and compare when I get a chance.

By the way, you forgot to add support for the tilde codeblocks. One thing to keep in mind with tilde codeblocks is that they are of the form {.haskell}, but can also have arbitrary metadata, such as:

{.haskell caption="this instance of the word scala might throw it off"}

I think the current pattern would work in this case, since it'd match haskell and then the rest as .*, however, there's no rule on the order of the metadata, so it could also be like this:

{caption="this instance of the word scala might throw it off" .haskell}

That's why in my version I strictly checked for .language inside the {}'s. I think it might suffice to check loosely as is done right now when it concerns backticks, and otherwise just check if it's .language inside {}'s if it's tildes. What do you think?
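
To make the difference concrete, here is a sketch of the stricter check. `PandocFenceStartPattern` is a hypothetical helper, not code from either branch. It accepts the bare word form, or `.language` as a separate class anywhere inside the braces, so a stray mention of another language in a caption does not count.

```vim
" Hypothetical helper: build a pattern for the opening line of a fenced
" block in language a:lang. Assumes a:lang contains no regex metacharacters.
function! PandocFenceStartPattern(lang) abort
  let l:delim = '^\s*\%(`\{3,}\|\~\{3,}\)\s*'
  " Bare form: the language is the only word after the delimiter.
  let l:bare = a:lang . '\s*$'
  " Attribute form: '.lang' must follow '{' or whitespace and be followed
  " by whitespace or '}', in any position among the other attributes.
  let l:attr = '{\%([^}]*\s\)\=\.' . a:lang . '\%(\s[^}]*\)\=}'
  return l:delim . '\%(' . l:bare . '\|' . l:attr . '\)'
endfunction

let s:line = '``` {caption="this instance of the word scala might throw it off" .haskell}'
echo s:line =~# PandocFenceStartPattern('haskell')
" -> 1
echo s:line =~# PandocFenceStartPattern('scala')
" -> 0
```

A quoted value that itself contains ` .scala` would still fool this, so it narrows the problem rather than parsing the attributes properly.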

blaenk commented 10 years ago

I pushed quick tilde codeblock support to my codeblock-embeds branch. This just makes it possible to match the tilde code blocks, but it doesn't yet enforce the stricter pattern discussed in the previous post.

blaenk commented 10 years ago

Actually, I apparently didn't fix it, it does highlight the remainder with the string group. What I did manage though was that the ending delimiter doesn't get highlighted/grouped as string, and so is concealed as it should be.

blaenk commented 10 years ago

Just a heads up, I don't know if you accidentally skipped my previous comment about black/white-listing the problem languages.

fmoralesc commented 10 years ago

No, I was going to say, it's a very good idea.

fmoralesc commented 10 years ago

On https://github.com/vim-pandoc/vim-pandoc-syntax/issues/14#issuecomment-28537803, I think it shouldn't be an issue. Both the backtick and tilde versions accept the { ... } syntax, so they should be handled the same.

The example you give would only cause an error if the user doesn't have haskell support enabled.

blaenk commented 10 years ago

Yeah, it did end up causing some minor issues with latex. Given the following:

expanded

it gets concealed like this:

concealed

instead of the correct way (without the isk fix):

working

I feel like these problem languages that meddle with isk are not worth the trouble. I wouldn't mind a blacklisting approach so that these don't get embedded-highlighted and just appear as all one color. On the other hand, maybe it'd be best to just leave it as is (don't treat it any differently; don't apply the isk fix). That is, if someone decides to enable/use cmake embedded highlighting they will see it warts and all, and it might encourage them to pressure others to get it fixed or something. We can avoid having these languages in the default language list, that way by default they appear all one color and it won't forcefully affect anyone. Then we can move on to other things.

About the error: I'm not sure what you mean, to give a better example, imagine I have:

``` {text="haskell is a better language" .scala}
def max(x: Int, y: Int): Int = if (x > y) x else y
```

This ends up getting highlighted as the language that gets highlighted last (appears last in the language list in the syntax file). So for example if I add scala into the language list (in the syntax file) after haskell, scala gets highlighted no matter which comes first in the metadata, and vice versa if I put it before haskell in the language list.

Pandoc doesn't care what order the metadata arguments are in. I think this is a stupid way to order them, I certainly don't do it this way, but it was to illustrate the point. Personally I couldn't care if we end up supporting this or not, it's just something to keep in mind though, that the order of the languages in the language list has an effect on how it gets highlighted.

If we highlighted more strictly (only `.language`) it would solve that problem for the most part. Again, this issue doesn't affect me, I was just bringing it up :)

blaenk commented 10 years ago

For what it's worth, my pattern(s) completely solved the issue of the syntax syncing. With this new pattern, I can consistently trigger it again :( I'll see if I can somehow get that fix into the new pattern, but considering I don't know what actually fixed it, it'll probably be tricky.

Also, can you find it in you to get rid of the highlight you added? ;) Here's what it looked like before:

before

and now:

after

Or at the very least add an option for us to change/disable it, since it's not something that will be easily patchable/modifiable in our own colorschemes since it's dynamically generated for every language.

fmoralesc commented 10 years ago

On that case, yes, you are right.

The problem with languages meddling with isk is that the languages affected are the other ones. In this case, LaTeX messes things up for cmake. However, we can't really drop LaTeX support... (actually, we should improve it).

Btw, I managed to fix the end delimiter issue in my rules.

fmoralesc commented 10 years ago

I'm not yet sure it's worth it to remove that highlight. What about changing that highlight for Special? Embedded code should, I think, be consistently distinguished from normal text.

fmoralesc commented 10 years ago

Also, to change it you shouldn't need to change every one of those, simply set the link for pandocDelimitedCodeBlock.

Anyway, I just pushed the change to Special to the codeblock-embeds branch.

blaenk commented 10 years ago

Oh okay it's latex causing the problem, I misunderstood. Yeah that's a shame. Can we add an option to disable latex embedding? I like it, but it's pretty inconsistent for the kind of math I use, with many missing characters. This way we can use the other highlighting without issue, if I understand correctly. I'd be content with at least latex inline embedding, but as I understand it any embedding at all triggers the issue.

> Also, to change it you shouldn't need to change every one of those, simply set the link for pandocDelimitedCodeBlock.

This wouldn't work very well because I do agree that generic codeblocks should look different, it's just that when it concerns an embedded code block, the color + the syntax highlighting for that language don't mesh very well at all, in my opinion. When you have embedded codeblocks though, the syntax highlighting is already a giveaway that it's different/separate from the document, add to that the lambda and language above it and the gap at the bottom from end-delimiter-concealment and it stands out well enough without an extra color there.

If you're opposed to an option for disabling the highlight (or changing it), I suppose an alternative would be to define a new group only for codeblock embeds so that we can modify it in our colorschemes as we would pandocDelimitedCodeBlock.
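
Something along these lines is what I have in mind for the separate group. The group name `pandocDelimitedCodeBlockEmbed` is just a placeholder, and this is a sketch rather than a patch against the branch:

```vim
" In the syntax file: a dedicated group for embedded blocks that falls back
" to the generic code block colour unless the user has already linked it.
highlight default link pandocDelimitedCodeBlockEmbed pandocDelimitedCodeBlock

" In a user's vimrc or colorscheme: opt out of the extra colour for
" embedded blocks only, leaving generic code blocks untouched.
highlight link pandocDelimitedCodeBlockEmbed Normal
```

The dynamically generated per-language groups would then link to this one group instead of to pandocDelimitedCodeBlock, so a single override covers every language.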

blaenk commented 10 years ago

By the way, are you set on the recent change in concealment highlighting? I liked the way it was originally, it was blue in my colorscheme and complemented the things it was next to very well. For example, the dagger for footnotes would show up blue and the footnotes themselves would be green. Now they're both green in my colorscheme, heh.

My colorscheme actually defines the Conceal to be blue, but that recent change in pandoc.vim overrides it. Is this temporary? Or what option do we have for customizing it?