Closed steffenhaug closed 1 day ago
I'm currently figuring out how to merge upstream/tree-sitter-julia. Their draft reference file added (macro_definition)
to our well-known list:
((string_literal) @comment.doc
.
[
(abstract_definition)
(assignment)
(const_statement)
(function_definition)
(macro_definition) ; NEW
(module_definition)
(struct_definition)
])
We'll need to add that in highlights.scm and injections.scm. You can test it locally if you like, otherwise I'll come back to this issue in the next few days.
Oh, your examples are macro calls so, the addition I mentioned won't probably work…
Yea the addition of (macro_definition)
doesn't fix this.
I'm guessing the Vim, Emacs, and VS Code plugin devs already have queries for most of these edge cases, so I'm all for not reinventing the wheel if that's possible. Although I guess a centralized set of queries hinges on the text editors naming the different highlights the same thing, and I don't know if that's the case.
Most of our queries come straight from nvim-tree-sitter, I made some additions. Your examples here and your merged PR haven't be covered yet.
There's also helix which is mostly aligned with nvim. Not sure about emacs. VSC uses regex for highlighting with its own issues (I had used it for a few years because of the julia extension but I'm a happy Zed convert now).
Although I guess a centralized set of queries hinges on the text editors naming the different highlights the same thing
There are only two capture names that we need to change. Nvim has a few more predicates for injections.
I have hit a little bit of a roadblock here, because (of all things) the Julia tree-sitter grammar appears to parse macro calls incorrectly.
It is my understanding that only the @doc
macro is handled as a special case in the parser to allow newlines between macro arguments, and that newlines are only allowed between the first and second argument. The advanced usage section of the documentation manual is pretty clear about this. Yet, it looks like the tree-sitter grammar allows newlines between arguments to any macro, and not even just between the first and second argument.
I can work around this, by adding a rule to highlight strings before macrocall
expressions inside macrocall
expressions, but that's quite gnarly. I guess I can implement it that way, document it as a workaround, and bug the tree-sitter-julia
people about it.
This behavior also breaks the queries in my last PR if there is a top-level macro before it.
bug the
tree-sitter-julia
people about it
That's the sane way to solve the issue. I bet the more complex our queries become, the more new issues slip our testing. And queries can't match whitespace which makes the workaround more challenging. Then one day with a new grammar, we have to clean up the workaround mess and practically start from scratch. I know because I went through this when contributing to the fish extension 😅.
One more thing (documenting here so I don't forget about it): we haven't covered @doc raw"""..."""
yet. That example is from the Advanced Usage you mentioned above and it's handy for docstrings containing LATEX or many $
signs. We'd need to match (prefixed_command_literal)
s beginning with raw"""
, similar to the queries in injections.scm.
bug the
tree-sitter-julia
people about itThat's the sane way to solve the issue.
Agreed. I'll file an issue if there isn't one already and reference ours.
I bet the more complex our queries become, the more new issues slip our testing. And queries can't match whitespace which makes the workaround more challenging. Then one day with a new grammar, we have to clean up the workaround mess and practically start from scratch. I know because I went through this when contributing to the fish extension 😅.
Yeah if we were to merge a workaround like this, I would hope we could delete it pretty quickly and not have crap like this hanging around for ages :-p In terms of cleaning up the mess, I don't really see the problem, because we just delete these queries when the grammar is fixed. To me, the question boils down to whether we want ugly code temporarily, or whether we want broken highlighting temporarily, and the answer might very well be that we would rather have broken highlighting. I'm not sure where we want to strike that balance.
As a general point I agree, that more complex queries are likely more sensitive to changes in the grammar, and more difficult to fix. I'm not sure how volatile the Julia grammar is? Maybe avoiding complex queries is warranted, but I think given how easy they are to write, I would personally lean more towards trying to mostly do correct highlighting, even if we need to do complex queries. I don't have as much experience with this as you probably do though, so I'm open to being wrong.
One more thing (documenting here so I don't forget about it): we haven't covered
@doc raw"""..."""
yet. That example is from the Advanced Usage you mentioned above and it's handy for docstrings containing LATEX or many$
signs. We'd need to match(prefixed_command_literal)
s beginning withraw"""
, similar to the queries in injections.scm.
That sounds pretty easy to do, I can make an issue for it now and whip up some queries for it when I have time. Apropos LaTeX, I would also like to see language injection in LaTeXStrings
, but I couldn't make this work as well as I wanted becasue the TeX parser isn't robust enough to handle the surrounding L"""
and """
. This would also require support in the grammar. I can make an issue for that as well.
I'm not sure where we want to strike that balance.
I'm not opinionated about it, a temporary fix is fine. A few things though:
I appreciate the time you spend on it and we can add more queries. Let's just test them thoroughly. We have that test-cases
directory now and we can add more files to it.
I found another missed opportunity for applying docstring syntax highlighting and Markdown language injection. Currently, definitions with macro annotations, such as
and
are not formatted as docstrings.
I'll make some queries for it and create a PR, but I'm making an issue to get some opinions on the following:
@kwdef
, or should we apply the highlighting to all strings before macro call expressions that contain struct- and function definitions?@foobar @boofar function f end
, but I'm not sure if it is possible to make a recursive query that identifies this for arbitrary levels of nesting. Should we ignore this (probably rare) edge case, or should we hand-write up to a reasonable level of nested macro calls?