Syntax improvements and support for math environments

GoogleCodeExporter commented 8 years ago

What steps will reproduce the problem?

I am too lazy to go over issues other people reported, or requests for 
enhancements that are related to the fragility of current syntax and its 
implementation. Main problems: 
 - often not possible to nest environments
 - often not possible to escape certain characters that are used as tags
 - hard to predict which environment will "win" if many tags are present on a line
 - quite easy to get environments "crossed" (and generate broken HTML) 
 - difficult to extend the syntax without introducing many more problems of the above types
 - not really possible to use the known JavaScript solutions for including mathematics in wiki files without minimal syntactic support from vimwiki

To make the syntax and behaviour predictable, easy to maintain and extend, the 
syntax should rely on `syntax region` instead of `syntax match` for most 
environments: this is the only practical way to avoid most of the unpleasant 
problems mentioned above.
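
To illustrate the point about `syntax region`, here is a minimal sketch (not the contents of the attached files). The group names `wikiBold`, `wikiItalic` and `wikiCode` are made up for this example, and the delimiters assume the default `*bold*`, `_italic_` and `` `code` `` markup. Nesting is declared through `contains=`, so which environment "wins" follows from the definitions rather than from the order of competing matches:

```vim
" Hypothetical group names, for illustration only.
" Bold may contain italic and inline code; italic may contain bold and code.
syntax region wikiBold   start=/\*\ze\S/ end=/\*/ oneline contains=wikiItalic,wikiCode
syntax region wikiItalic start=/_\ze\S/  end=/_/  oneline contains=wikiBold,wikiCode
" Inline code contains nothing, so tag characters inside it are left alone.
syntax region wikiCode   start=/`/ end=/`/ oneline

highlight default wikiBold   term=bold   cterm=bold   gui=bold
highlight default wikiItalic term=italic cterm=italic gui=italic
highlight default link wikiCode PreProc
```

This is only a sketch: the real rules would still need to handle escaping and word boundaries.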

What version of the product are you using? On what operating system?
version 1.1.1
let g:vimwiki_camel_case = 0

Please provide any additional information below.
I have tried to address some of the problems by changing the files

syntax/vimwiki.vim, syntax/vimwiki_default.vim

Headers, text styles and code/pre regions are defined, as well as $math$ and 
\[display math\], so that it is possible to nest the regions where nesting 
seems sensible. No assumption on what is inside the math environments is made 
(except that \ and $ are escaped by prepending \).
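
As a rough sketch of what such math regions could look like (again with hypothetical group names, not copied from the attached files), the `skip=` pattern is what implements the escaping rule: `\\` and `\$` inside math are stepped over, so they cannot end the region.

```vim
" Hypothetical group names, for illustration only.
" Inline $math$ stays on one line; \\ and \$ inside it are skipped.
syntax region wikiMathInline  start=/\$/   skip=/\\[\\$]/ end=/\$/ oneline
" Display math \[ ... \] may span several lines.
syntax region wikiMathDisplay start=/\\\[/ skip=/\\[\\$]/ end=/\\\]/

highlight default link wikiMathInline  Special
highlight default link wikiMathDisplay Special
```

Nothing inside the regions is interpreted, which matches the idea that the formula itself is left to whatever renders it.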

I have more or less not touched links, lists and tables, since I assume a lot 
of vimwiki code deals with these and no one would like to see that broken. But 
they clearly need the same treatment if vimwiki is to become more robust and 
flexible.

In testing it on my wiki files, this new highlighting does not seem to break 
anything (but I do not have many tables, and do not use CamelCase auto-links). 
With a complicated markup, these new syntax files are a huge improvement. With 
a rather chaotic markup, vim does not seem to always work perfectly (or perhaps 
I do not know how to fine-tune the region's definition), but it is a world of 
difference already compared to the current (ver. 1.1.1) situation.

Attaching four files:
# vimwiki01.vim             (this can replace  syntax/vimwiki.vim)
# vimwiki_default01.vim     (this can replace  syntax/vimwiki_default.vim)
# QuickTest.wiki            (a small test file)
# QuickTest.png             (two views of the test file in vim)
One view shows the full entered text (conceallevel=0); the other view shows 
what conceallevel=2 looks like when a few more `conceal cchar=..` syntax 
matches are loaded (these are not included in vimwiki01.vim, as this is far 
from the most important thing vimwiki needs right now).

Just to be clear: these files only change what it looks like in Vim. Vimwiki 
will use its own "syntax" when doing :Vimwiki2HTML, so you will *not* get what 
you see unless you use a very simple markup only.
(that is also why the attached syntax files are so large: I had left most of 
the original regular expression stuff there so that Vimwiki can still function).

Is there any willingness to make some pretty significant changes to the Vimwiki 
code? In the end, making Vimwiki's "parsing" behave like Vim's region parsing 
would make it incredibly simple to support different syntaxes and to extend the 
current one(s) to support other people's needs.
It would not surprise me if it actually led to a noticeable reduction in the 
code size (I mean for a similar feature set).

By the way, supporting math is a pretty important issue: it is nowadays pretty 
much a standard feature for wikis. There is no comparison between a math 
environment and "features" such as underlined, deleted or whatever text. 
Dealing with the actual formulas is outsourced to some local JavaScript or a 
server script (that is, other developers); we just need to be able to mark up 
something as math. Our aim should be MathJax (http://www.mathjax.org): it is 
apparently quite flexible, so people using math lightly may get away with a 
very simple setup.

Original issue reported on code.google.com by tpospi...@gmail.com on 26 Nov 2010 at 1:08

GoogleCodeExporter commented 8 years ago

wow!

I really appreciate what you have done so far. I will check the code (not sure 
when - I do have quite a heavy load of stuff to do at my current job) and try 
to give feedback.

And yes, I am not against any significant changes to vimwiki's code. 

BTW, I'd like :Vimwiki2HTML to be consistent with vimwiki's default syntax as 
much as possible.

Original comment by habamax on 26 Nov 2010 at 2:11

Changed state: Accepted

GoogleCodeExporter commented 8 years ago

Original comment by habamax on 26 Nov 2010 at 2:12

Added labels: Type-Enhancement
Removed labels: Type-Defect

GoogleCodeExporter commented 8 years ago

Would you like to have committer role?

It would be nice if there was a branch with all of your changes we could merge 
when it is time.

Original comment by habamax on 26 Nov 2010 at 2:21

GoogleCodeExporter commented 8 years ago

The 'code' I added is really simple, but very hard to read because all 
the regexes are in a different file (I'd estimate that if written as a normal 
syntax file, it would be perhaps 50 lines, plus about the same or a little less 
for the things I have not done yet: lists etc.). I just tried to follow the 
idea that perhaps in the future the same syntax file could work with slightly 
different regexes as well.

But I would like to take that much further, and make a translator to HTML that 
would work using little more than the syntax file and a little extra 
information. This would make changes and some feature additions very simple 
(for instance, adding math support would be 10 lines of extra code, and then 
just writing up documentation for vimwiki users on how to configure suitable 
JavaScript). Here is the essence:
 - for the translation to HTML (or other formats in the future), vimwiki code should focus on correctly recognizing the main block environments; all the inline ones (which are presently limited to one line of the wiki source) could be handled using vim's `synstack()` and `synID()` functions. This would replace a whole lot of matching that seems to rely on many hardcoded regexes (not just the ones from the syntax file) and that attempts to replicate what vim's syntax matching can already do based on the syntax file. Perhaps one should even try to make it match all environments, block or inline. So what I am saying is: any explicit matching should be done just as an optimization, to skip over text that cannot introduce a new tag (or terminate the currently active environment), in order to reduce the number of calls to `synstack()` and `synID()`.
 - actually, it would not be the first time something of this sort has been done: see the vimscript `syntax/2html.vim`, from about line 820 (before this line, it mostly deals with folds and colors); the main loop is about 100 lines of code, a lot of which deals with diff, virtual columns and other options and settings that Vimwiki2HTML does not care about
 - according to the official documentation, `:TOhtml` is "very slow", because it (among other things) walks over all characters one by one and calls `synID` on each of them to see if it has changed; in a vimwiki, inside most environments, one could safely skip over all "nonspecial" characters, which would cut down on the number of (presumably expensive) `synID` calls... well, if it wasn't for such silly things as `VimwikiTodo`, URLs or automatic CamelCase links (the CamelCase links are probably the only real problem, since the other two can be caught because of the `:`), not to mention the current `<blockquote>` mechanism, which does not seem to have any vim syntax marking at all.
 - running on a test file, it turns out Vimwiki2HTML can be 50 or 80% slower 
than TOhtml. When I simply run `synID` for each character of the file, it takes 
only a tiny fraction of the translators' times (perhaps 10%), so it seems that 
this really should be the way to go. If this can be made to work, and I think it 
can, adding support for another (not too different) wiki syntax would be a 
matter of minutes (plus hours of testing). Of course, tables or lists, which 
have a lot of special handling while writing, are another story.
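
A minimal sketch of the idea in the list above, assuming syntax highlighting is enabled in the current buffer. The function name `SynRuns` is invented for this example; it is not part of vimwiki or of `2html.vim`. It walks one line byte by byte, asks `synstack()` for the active groups, and cuts the line into runs wherever that stack changes. A translator could then map each group name to an HTML tag.

```vim
" Hypothetical helper: split buffer line a:lnum into [text, group-names] runs,
" starting a new run wherever the stack of active syntax groups changes.
function! SynRuns(lnum) abort
  let l:text = getline(a:lnum)
  let l:runs = []
  let l:prev = []
  let l:start = 0
  let l:col = 0
  while l:col < strlen(l:text)
    " synstack() takes a 1-based byte column and returns syntax IDs,
    " outermost first; convert them to group names.
    let l:stack = map(synstack(a:lnum, l:col + 1), 'synIDattr(v:val, "name")')
    if l:col > 0 && l:stack != l:prev
      call add(l:runs, [strpart(l:text, l:start, l:col - l:start), l:prev])
      let l:start = l:col
    endif
    let l:prev = l:stack
    let l:col += 1
  endwhile
  if strlen(l:text) > 0
    call add(l:runs, [strpart(l:text, l:start), l:prev])
  endif
  return l:runs
endfunction
```

For example, `:echo SynRuns(line('.'))` shows the runs for the cursor line. Note that this naive version calls `synstack()` for every byte and cannot tell apart two directly adjacent regions of the same group; skipping over "nonspecial" text, as suggested above, would be an optimization on top of it.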

Original comment by tpospi...@gmail.com on 26 Nov 2010 at 4:32

GoogleCodeExporter commented 8 years ago

About the committer role: right now I cannot see how much time I will have to 
play with this. If I manage to knock up a short, more-or-less working HTML 
translator, I will let you know.

But making a branch that could later be "merged": I do not see that as too 
promising. I am just trying to make a more solid foundation for the project, 
and I will almost certainly never have time to learn the workings of the 
current code (4000 lines?) and to work with that. It would be much easier for 
someone who already knows the code to start moving it in the more promising 
direction, which is what I am proposing. A foundation would be hard to "merge" 
into an existing project; the project has to be moved onto the foundation.

I will let you know if I have a working germ of a translator. I am pretty sure 
that it is possible, but I want it much shorter and easier to read than the over 
1000 lines of code that syntax/2html or Vimwiki2HTML have. I am just a little worried 
about some of the current syntax rules that are not very well thought out: 
these design issues could result in a very large penalty in terms of code 
complexity, so if there is going to be no willingness at all for adjusting the 
worst issues with the current syntax, I may run out of time before it is in a 
useable form for general use. With a "rational" wiki syntax, I'd estimate that 
150 lines of vim code could easily generate the HTML tag tree, but I do not 
know how much it takes to correctly escape all the characters that need it... I 
am afraid there may be a lot of cases that require special handling (since HTML 
is not quite something that would pass as a rational design either).

I am attaching the QuickTest.png that is much better cropped, I had no idea the 
original one I posted was double the size. Should I delete the old one? It 
seems I could, but cannot change the attachment in the original posting.

Original comment by tpospi...@gmail.com on 26 Nov 2010 at 5:07

Attachments:

QuickTest.png

GoogleCodeExporter commented 8 years ago

Issue 170 has been merged into this issue.

Original comment by habamax on 16 Jan 2011 at 8:38

GoogleCodeExporter commented 8 years ago

We'll reopen it if needed with issues specified.

Original comment by habamax on 4 May 2012 at 3:38

Changed state: Done

agelessdummy / vimwiki

Syntax improvements and support for math environments #147