BdR76 / CSVLint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.
GNU General Public License v3.0
151 stars, 8 forks

Preserve the state of quoted multi-line fields #59

Closed. rdipardo closed this 1 year ago

rdipardo commented 1 year ago

Fixes #53

[Animated screenshot: patch-demo]

How it works

IDocumentVtable.GetLineState/SetLineState

As I briefly explained on the forum, Scintilla has an API for preserving state across lines. The relevant methods were already defined on the IDocumentVtable struct. This patch employs them in the "conventional" way (copied from this other Scintilla-based CSV lexer, in C++):

  1. Check for a pre-existing line state at the top of the function call. In our case, the state we're looking for is a color index. We set the color to that index if it's there and valid.

  2. At the end of every line, check if we're inside an open-ended quoted field. If yes, preserve the active color so it carries over to the next line; if no, zero-out the line state.
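The two steps above can be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual code: `FakeDocument`, `lex_line`, the separator, and the number of colors are all hypothetical stand-ins, and the sketch assumes color index 0 means "no saved state".

```python
class FakeDocument:
    """Hypothetical stand-in for the document; it only mimics per-line state storage."""

    def __init__(self, lines):
        self.lines = lines
        self._state = {}

    def get_line_state(self, line):
        # Unset lines report 0, which this sketch treats as "no saved state".
        return self._state.get(line, 0)

    def set_line_state(self, line, state):
        self._state[line] = state


def lex_line(doc, line, sep=",", num_colors=8):
    """Return one color index per character of `line` and record its line state."""
    # Step 1: look for a state left behind by the previous line.
    saved = doc.get_line_state(line - 1) if line > 0 else 0
    in_quote = 1 <= saved <= num_colors      # valid color index: still inside a quoted field
    color = saved if in_quote else 1

    styles = []
    for ch in doc.lines[line]:
        if ch == '"':
            in_quote = not in_quote
        elif ch == sep and not in_quote:
            styles.append(0)                 # separator drawn in the default style
            color = color % num_colors + 1   # advance to the next column color
            continue
        styles.append(color)

    # Step 2: at the end of the line, carry the color over or zero out the state.
    doc.set_line_state(line, color if in_quote else 0)
    return styles


doc = FakeDocument(['a,"multi', 'line",c'])
first = lex_line(doc, 0)    # ends inside an open quoted field, so state 2 is saved
second = lex_line(doc, 1)   # resumes with color 2 instead of restarting at color 1
```

In this sketch the second line picks up the color of the field that was left open on the first line, which is the behavior the line state is meant to preserve.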

What's Changed

Up to now, the logic for detecting line ends was mutually exclusive with multi-line quoted fields — i.e., the check for an EOL was guarded by the !quote condition, so the only way to find the EOL was to not be inside a quoted field.

This had the accidental effect of coloring the EOLs inside quoted fields:

[Screenshot: csvlint-rel-ver-colored-eols]

To make line state work, we need to know where the EOL is at all times, even inside a quoted field. This patch makes EOL detection independent of the !quote condition. Because we also don't want to blindly switch styles when we're inside a trailing quoted field, the color index change is now guarded by !quote.
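A rough way to picture the difference, again as a Python sketch rather than the plugin's real lexer: the `style_buffer` function and its `eol_independent` flag are hypothetical, and the sketch assumes that a detected EOL is drawn in the default style (index 0) and that every new record restarts at color 1.

```python
def style_buffer(text, sep=",", num_colors=8, eol_independent=True):
    """Return (styles, line_states) for a buffer that uses '\\n' line ends.

    eol_independent=False mimics the old behavior (EOL check guarded by !quote);
    eol_independent=True mimics the patched behavior described above.
    """
    styles, line_states = [], []
    color, in_quote = 1, False
    for ch in text:
        if ch == '"':
            in_quote = not in_quote
        if ch == "\n" and (eol_independent or not in_quote):
            styles.append(0)                               # EOL keeps the default style
            line_states.append(color if in_quote else 0)   # carry the color over if a quote is open
            if not in_quote:
                color = 1                                  # color change guarded by !quote
        elif ch == sep and not in_quote:
            styles.append(0)
            color = color % num_colors + 1
        else:
            # Old behavior: an EOL inside quotes falls through to here and gets colored.
            styles.append(color)
    return styles, line_states


text = 'a,"x\ny",b\n'
old_styles, old_states = style_buffer(text, eol_independent=False)
new_styles, new_states = style_buffer(text, eol_independent=True)
# old_styles[4] == 2: the embedded EOL is colored like the quoted field
# new_styles[4] == 0: the embedded EOL is detected, and new_states == [2, 0]
```

With the guarded check, the embedded line end is never seen as an EOL, so there is no point at which a line state could be saved for it.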

If there's any reason to keep coloring EOLs inside multi-line quoted fields, that would be a blocker for this PR in its current form. Otherwise, it should be good to go, after a few people have had the chance to test it.

Note that the earlier attempted fix in e9d1074 has not been removed. Leaving it in does not seem to hurt; there may be edge cases where it still comes in handy.


P.S.

Just noticed that the color index is now incremented inside a conditional. It should work the same as before because postfix operators have the highest precedence in C-like languages.

BdR76 commented 1 year ago

Thanks for the PR, much appreciated. 😀 Works like a charm. Btw I see you're using LineFromPosition, I'm also looking for a way to use the PositionFromLine method, to implement the correct syntax highlighting for the SkipLine feature.

(also, with which tool did you create the animated screenshot?)