marshallward / vim-restructuredtext

Syntax file for reStructuredText on Vim.

Support language aliases when including syntaxes. #23

Closed other-mickk closed 7 years ago

other-mickk commented 7 years ago

I ran into a snag when I wanted to highlight C# blocks:

.. code-block:: C#

    var i = 42;

.. code-block:: cs

    var i = 42;

The first block uses the directive argument that Pygments expects, but Vim does not highlight it as C#. The contents of the second block are correctly highlighted as C# in Vim (i.e. it uses syntax/cs.vim), but Pygments does not recognise 'cs' as a language. Even setting aside the fact that # is being used as the pattern delimiter, the fundamental problem is that Vim and Pygments use different internal names for the same language.

To bridge this gap, this PR changes g:rst_syntax_code_list from a list to a dictionary mapping filetypes to alias patterns, allowing the user to specify e.g.:

let g:rst_syntax_code_list = { 'cs': ['[Cc]#'] }

which will highlight any c# or C# block (which Pygments will understand) using syntax/cs.vim. Backward compatibility with lists using the previous format is maintained. I did not change the defaults too much, not adding or removing any language (e.g. no C# by default). However I did add '[Cc]++' as an alias to syntax/cpp.vim, as a nod to #20.
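To illustrate the backward compatibility mentioned above, here is a minimal sketch of how a legacy list could be normalised into the new dictionary shape. The helper name `s:NormalizeCodeList` and the sample values are hypothetical and are not taken from the PR's actual code; it only assumes that each filetype in an old-style list should act as its own single alias.

```vim
" Hypothetical helper, not the PR's actual implementation.
function! s:NormalizeCodeList(code_list) abort
  " A Dictionary is assumed to be in the new format already.
  if type(a:code_list) == type({})
    return a:code_list
  endif
  " Legacy List: each filetype name doubles as its only alias pattern.
  let l:result = {}
  for l:ft in a:code_list
    let l:result[l:ft] = [l:ft]
  endfor
  return l:result
endfunction

" Illustrative inputs (assumed values):
"   old format: ['vim', 'java', 'cpp']
"   new format: {'cs': ['[Cc]#'], 'cpp': ['cpp', '[Cc]++']}
```

With a normalisation step like this, the rest of the syntax file would only ever need to deal with the dictionary form.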

marshallward commented 7 years ago

This looks great, thanks very much for the extra effort you put into this! I probably won't be able to try this out for a few days (I'm on holiday) but I expect it will be fine.

The only comment I can think of (without trying it out, so I could be wrong) is whether there's some way to just make the whole thing case insensitive, rather than the case checks on the first letter (e.g. php and PHP are each probably more likely than Php). But even that can be sorted out later.

other-mickk commented 7 years ago

To my knowledge there is no way to make only part of a Vim pattern case-insensitive. There are \c and \C, which make the whole pattern case-insensitive and case-sensitive respectively. I didn't want to change the overall behaviour, which is why I settled on the manual [Cc]++ etc. patterns. The [Pp]hp one is there to be consistent with the rest, and I'm just as unsatisfied with it as you are.

It should be possible to accept a pattern such as ab\w\\wc and turn it into [Aa][Bb]\w\\[Ww][Cc] but that seemed time-consuming and easy to get wrong, for not much pay-off.
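For reference, a naive version of that transformation might look like the sketch below. The function name `s:CaseFold` is hypothetical and this is not part of the PR. It copies backslash-escaped characters through unchanged and expands every other ASCII letter into a two-letter collection.

```vim
" Hypothetical sketch: expand each unescaped ASCII letter into [Xx].
function! s:CaseFold(pat) abort
  let l:out = ''
  let l:i = 0
  while l:i < len(a:pat)
    let l:ch = a:pat[l:i]
    if l:ch ==# '\' && l:i + 1 < len(a:pat)
      " Copy an escape such as \w or \\ through untouched.
      let l:out .= l:ch . a:pat[l:i + 1]
      let l:i += 2
    elseif l:ch =~# '\a'
      " Unescaped letter: match either case.
      let l:out .= '[' . toupper(l:ch) . tolower(l:ch) . ']'
      let l:i += 1
    else
      let l:out .= l:ch
      let l:i += 1
    endif
  endwhile
  return l:out
endfunction

" Prints: [Aa][Bb]\w\\[Ww][Cc]
echo s:CaseFold('ab\w\\wc')
```

Even this small sketch shows why the approach is easy to get wrong: it would also rewrite the letters inside an existing collection such as [Cc], and it knows nothing about multi-character items like \%( or \zs.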

I noticed something else however. What didn't occur to me at the time was to check the rST docs. It has this to say about directives:

[…] Directive types are case-insensitive single words […]

(N.b. directive types are in fact what appears before the two colons, e.g. the 'note' in .. note::.) In other words, we could make the whole pattern case-insensitive and keep the possibly user-supplied alias patterns as simple as they come, e.g. php. This would be a further change in behaviour, but one that should comply with rST tools. (On my end the Sphinx toolchain certainly happily accepts all manner of .. code-BLOCK:: directives and so on.)
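As a rough sketch of that idea, a single \c can be prepended to a pattern built from plain lowercase aliases. The variable names and the exact pattern here are assumptions for illustration and do not reflect the syntax file's real regular expressions.

```vim
" Hypothetical names and pattern; not taken from the plugin's source.
let s:aliases = ['php', 'c#', 'c++']

" \c makes the whole pattern ignore case; \%( \) groups without capturing.
" In Vim's default 'magic' mode a bare + is a literal plus sign.
let s:pattern = '\c\.\. \%(code-block\|code\)::\s\+\%(' . join(s:aliases, '\|') . '\)\s*$'

" The first two directives are expected to match, the third is not.
echo '.. code-block:: PHP' =~# s:pattern
echo '.. code-BLOCK:: C#' =~# s:pattern
echo '.. code-block:: python' =~# s:pattern
```

Because \c applies to the entire pattern, the directive name and the language alias become case-insensitive together, which is the behaviour change described above.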

If you feel this is adequate I can amend the PR.

marshallward commented 7 years ago

Fully case-insensitive directives would be nice, but also happy to deal with that another day. I think this solves an important problem and is a good addition as it is.

I'll wait a few days before checking and merging this, so if you have time feel free to have a crack, but no pressure.

other-mickk commented 7 years ago

I have amended the PR. Just to be clear: this does not change how directives in general are handled. Only the language-specific blocks become case-insensitive, as a side effect, and only those whose languages are listed in g:rst_syntax_code_list:

.. code:: java

    this is highlighted as java

.. coDE:: JAva

    this as well

marshallward commented 7 years ago

Sorry that a few days turned into two weeks. But this all looks great, it works very well for me. Thanks very much for the work, it's a fantastic contribution for an irritating problem, and sets up the syntax file to easily solve future language name problems.