Add wsf lexer performance improvements used by vis

orbitalquark / scintillua

Scintillua enables Scintilla lexers to be written in Lua, particularly using LPeg. It can also be used as a standalone Lua library for syntax highlighting support.

https://orbitalquark.github.io/scintillua

MIT License

52 stars 22 forks source link

Add wsf lexer performance improvements used by vis #16

Closed moesasji closed 1 week ago

moesasji commented 3 years ago

This fix for performance improvements in the wsf lexer used by vis is committed in baa51e934ce057af5b5be829d6a73a3e8b4c03d0 and is the same as used in the xml lexer.

Would be good to add this to the upstream version

orbitalquark commented 3 years ago

This change does not correctly handle the following input:

<tag
  attr1=100
  attr2=100>

The "attr1=100" line will not be highlighted correctly, but the line below it will. This is why I had to use a function for look-behind.

Note that currently there is no performance penalty because that look-behind function is not called. ('=' is not highlighted at all, and "=[number]" sequences are still highlighted, but not necessarily in the context of attributes.)

moesasji commented 3 years ago

Thanks for taking a look and I indeed didn't spot this as vis is not yet fully compatible with the styles used in the new format. The strange thing is that the patch did work for the legacy format.

vis-legacy-wsf

Back to the drawing table then....

btw) some of these patches don't really show test-cases...so it being "performance" improvements comes from the original patches.

orbitalquark commented 3 years ago

It depends on how an application passes in text to lex. If they pass in the entire buffer each time, then there will not be an issue. (However, this will result in bad performance, as re-lexing the entire buffer on each change is unreasonable.) If they pass in a chunk of text (e.g. what's on the screen currently), then performance is better but there might be a few small issues lexing at the start. Scintillua's Scintilla lexer typically receives text from the beginning of the current line. So for the example I gave, when you are typing on the last line, Scintillua's lexer considers the two previous lines as correctly lexed.

I suspect vis is passing in text to lex in such a way that hides the issue Scintillua's Scintilla lexer would normally have. I hope this makes sense.

moesasji commented 3 years ago

Your explanation makes perfect sense. My understanding is that vis passes chunks of a file starting from a previous white line to maintain performance on large files, but does small files as the whole buffer based on this

For now best to just ignore this issue as your response gives a test-case to explore. Thanks!

mcepl commented 1 year ago

@moesasji vis has merged then upstream scintillua lexers in https://github.com/martanne/vis/commit/8a420ecc4c1ed50111464ec66901bd983eaf2dbd and we are now working on merging in all changes up to scintillua_6.2 tag in https://git.sr.ht/~mcepl/vis/log/devel_scintillua.

Do you think you would be able and willing to rebase all three of your PRs (#15, #16, and #17) on the top of scintillua_6.2, so that we all could benefit from it?

Thank you.

mcepl commented 4 weeks ago

Rebased in my devel branch on the top of the current master here as https://git.sr.ht/~mcepl/vis/commit/, specifically in 75ebca4e.

orbitalquark commented 1 week ago

Closing per comments in #118.