kdl-org / kdl

the kdl document language specifications
https://kdl.dev

Add tests for syntax highlighting #325

Open eugenesvk opened 1 year ago

eugenesvk commented 1 year ago

I've noticed that you have quite a few tests for parsers to test how they conform to the KDL specification, but haven't noticed any such tests for syntax highlighting packages.

Since, unlike with parsers (where you can feed in a file and compare the output to a reference file), there is no uniform way to test this automatically, it would be nice to have a single KDL document with various tricky elements in it that would let you quickly test a given syntax highlighting scheme against it.

I've added one using Sublime Text's format at https://github.com/eugenesvk/sublime-KDL/blob/main/test/syntax_test_kdl.kdl that shows what each token means, and also a brief KDL document at https://github.com/eugenesvk/sublime-KDL/blob/main/test/syntax_example_screen.kdl where you could do a quick visual check (especially paired with a "reference" image).

But that's not a very comprehensive document, and I'm not sure how many tricky parts of the spec it covers; that would require more familiarity with the spec.

(By the way, maybe one of the parsers that already implements KDL could be fed a KDL source file and emit the correct element names? This wouldn't help much with visual feedback, but it would still be valuable for checking a syntax highlighter.)

tabatkins commented 1 year ago

I'm not sure that's really something that we should lock down to the extent that it's testable. Just off the top of my head, a few things that can very reasonably vary between highlighters and due to personal preference:

If we abstract above these questions, we end up essentially just testing basic parsing, which we're already doing.

eugenesvk commented 1 year ago

  • Whether the interior of comments is colored the same as the comment glyphs themselves.

But I'm not talking about colors, that's fully up to the user's color theme!

In my example, colors are just something that helps visually identify how a given token is parsed. I don't even mean that you should lock down the specific syntax names in the tests.

So the test would just show that something is parsed as a node, a comment, or a comment separator rather than as part of a string, etc. (see the linked Sublime Text test file), for example:

/*!Doc block comment /* nested block comment */*/
//                   ^^                              punctuation.definition.comment.begin.kdl
//                   ^^^^^^^^^^^^^^^^^^^^^^^^^^      comment.block.kdl
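
To illustrate how caret lines like the two above carry per-token information that a script could consume, here is a minimal Python sketch. It handles only a simplified subset of the Sublime Text test format (a `//` comment, a run of `^` characters, then a scope name) and is not Sublime's own test runner; the `Assertion` type and `parse_assertions` function are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple


class Assertion(NamedTuple):
    line: int   # 0-based index of the source line being tested
    start: int  # first column covered by the carets (0-based)
    end: int    # one past the last covered column
    scope: str  # scope name expected over that column range


# Simplified assumption: "//", optional spaces, a run of carets, then a scope.
CARET_LINE = re.compile(r"^//(\s*)(\^+)\s+(\S.*?)\s*$")


def parse_assertions(text: str) -> list[Assertion]:
    """Collect caret assertions, attaching each to the nearest preceding source line."""
    assertions = []
    target = -1  # index of the most recent line that is not an assertion
    for index, line in enumerate(text.splitlines()):
        match = CARET_LINE.match(line)
        if match is None:
            target = index
            continue
        assertions.append(
            Assertion(target, match.start(2), match.end(2), match.group(3))
        )
    return assertions
```

Each resulting `Assertion` says "columns `start` to `end` of line `line` should carry `scope`", which is exactly the information the split-up parser tests do not provide.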

For example, I was checking the VSCode plugin and noticed it had a rule that disallowed spaces before a node name (/-nodeNestedSlashdashed in the example above). I think the grammar allows them, so I was confused about which is right, and whether a slashdash comment is allowed only right at the beginning of a line or not.

However, I did check a few of the parsing tests you already have, but they're not suitable for the job: they're split across a gazillion files, so you can't use them to test a theme; they lack per-token info you could use to automate checks; and for manual checking they would need to be consolidated.

But if you had a condensed yet complicated test file containing many different elements with tricky interplay between them, you could immediately see some mistakes, e.g. if your nested nodes aren't parsed as nested.

eugenesvk commented 1 year ago

essentially just testing basic parsing, which we're already doing

but not in a way that is usable for a syntax highlighter!

tabatkins commented 1 year ago

But I'm not talking about colors, that's fully up to the user's color theme!

I wasn't talking about colors either; all of my questions are about how things are segmented and separated (or not). All of those questions can reasonably be decided differently by syntax highlighters, regardless of what colors they use or what names they use for their categories.

but not in a way that is usable for a syntax highlighter!

Granted, but as I said, syntax highlighters can do things way different and that sort of divergence is acceptable. I'm also not really sure how you would test a syntax highlighter in an automated fashion?

eugenesvk commented 1 year ago

All of those questions can reasonably be decided differently by syntax highlighters, regardless of what colors they use or what names they use for their categories

But this doesn't prevent you from testing. For example, if you decide to go very broad and tokenize string quotes the same as the string itself, and my highlighter correctly marks quotes as punctuation, I could still use your tests: the quotes would be "additive" (they simply wouldn't be tested), but the tests would still help catch other mistakes. The opposite wouldn't work with automated tests, as those would fail, but in a visual reference check you could simply ignore the quotes if your highlighter doesn't support that.
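
One way to picture this "additive" comparison is the Python sketch below. It assumes dot-separated scope names in the Sublime/TextMate style; the functions `scope_matches` and `check_token` and the scope names in the usage example are hypothetical and not taken from any existing KDL highlighter or test suite.

```python
def scope_matches(expected: str, actual: str) -> bool:
    """True if `actual` equals `expected` or is a more specific dotted refinement of it."""
    return actual == expected or actual.startswith(expected + ".")


def check_token(expected: list[str], actual: list[str]) -> bool:
    """Every expected scope must be matched by some actual scope.

    Extra scopes reported by the highlighter are ignored, so a more
    detailed highlighter still passes a broader reference test.
    """
    return all(any(scope_matches(e, a) for a in actual) for e in expected)


# A broad reference test only requires the quote to be part of a string.
broad_reference = ["string.quoted"]
# A detailed highlighter additionally marks the quote as punctuation.
detailed_highlighter = [
    "string.quoted.double.kdl",
    "punctuation.definition.string.begin.kdl",
]
print(check_token(broad_reference, detailed_highlighter))
```

Swapping the roles (a reference that demands the punctuation scope, checked against a highlighter that only reports the string scope) makes `check_token` return `False`, which is the "opposite" case that would fail in automated tests.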

that sort of divergence is acceptable

Not if that divergence contradicts the grammar by not marking a number as a string

not really sure how you would test a syntax highlighter in an automated fashion?

That's possible with Sublime syntax, but that's not necessary. My suggestion was to have a dense document with various cases and self-describing names, just like some of the examples in the readme:

// This entire node and its children are all commented out.
/-mynode "foo" key=1 {
  a
  b
  c
}

tabatkins commented 1 year ago

Ohhhh, if you just want a document that has tricky cases and can be verified by inspection, then that's totally reasonable.

eugenesvk commented 1 year ago

Maybe just concatenating all the parser "from" test files and visually checking that the result does NOT have any red invalid marks could be one good test of whether a highlighter rejects any valid constructs.

That would leave the second half: checking whether the valid elements are correctly highlighted.
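
The concatenation idea could be sketched in Python roughly as follows. The directory you pass in is an assumption: point it at wherever the valid parser input files live in your checkout, and exclude any inputs that are intentionally invalid. The function name `concatenate_kdl` and the separator comment format are made up for this example.

```python
from pathlib import Path


def concatenate_kdl(input_dir: Path, output_file: Path) -> int:
    """Join every .kdl file in input_dir into one document; return the file count."""
    files = sorted(input_dir.glob("*.kdl"))
    parts = []
    for path in files:
        body = path.read_text(encoding="utf-8")
        # A line comment naming the source file makes a mis-highlighted
        # region easy to trace back to the test it came from.
        parts.append(f"// ---- {path.name} ----\n{body.rstrip()}\n")
    output_file.write_text("\n".join(parts), encoding="utf-8")
    return len(files)
```

Note that this sketch trims trailing whitespace from each file and adds a final newline, so test cases that depend on exactly how a file ends would need to be checked separately.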