latex-lsp / texlab

An implementation of the Language Server Protocol for LaTeX

GNU General Public License v3.0


Do not try to parse embedded (`asy`) code as TeX #490

Closed (clason closed this issue 2 years ago)

clason commented 3 years ago

The texlab parser seems to try to parse embedded asy code as TeX, with the expected results:

```latex
\begin{asy}
    for(int i=0; i<m; ++i){
        real x=getx(i);
        if(i%3==0){
            draw((x, 0)--(x, 0.332));
        }else if(i%3==1){
            draw((x, 0.333)--(x, 0.665));
        }else{
            draw((x, 0.665)--(x, 1));
        }
    }
\end{asy}
```

The problem here is the `i%3`: the `%` is treated as a TeX comment marker, even though asy is basically C-like code, where `%` is the modulo operator and comments are marked with `//`.
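To make the failure mode concrete, here is a minimal Python sketch of TeX-style comment stripping applied to one line of the example. The helper `strip_tex_comment` is hypothetical and is not texlab's lexer; it only mimics the rule that an unescaped `%` starts a comment.

```python
def strip_tex_comment(line: str) -> str:
    """Return the part of `line` before the first unescaped '%'.

    Hypothetical helper for illustration only; not texlab's actual lexer.
    """
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if line[i] == "%":
            return line[:i]  # everything from '%' on counts as a comment
        i += 1
    return line


asy_line = "if(i%3==0){"
print(strip_tex_comment(asy_line))  # prints "if(i"
```

Everything from the `%` onward, including the opening brace, is dropped, which would plausibly leave the braces unbalanced from a TeX parser's point of view.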

I believe the best option is to just stop parsing known code environments (as is already done for verbatim?)

Off the top of my head, these would be

asy
asydef
luacode (for lualatex)
minted
...?

pfoerster commented 3 years ago

> The texlab parser seems to try to parse embedded asy code as TeX

That's exactly what's happening at the moment. Simply stopping parsing is a bit difficult with our approach because there is no simple token that signals the end of the asy environment to the parser. However, I think we should disable completion inside these environments and not report diagnostics from them.
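The idea of suppressing diagnostics inside code environments could be sketched roughly as below. This is only an illustration in Python under stated assumptions: a line-based pre-pass over the raw text locates the environments, diagnostics are plain dicts with a 0-based `line` key, and the names `code_env_line_ranges` and `filter_diagnostics` are hypothetical. It does not reflect how texlab represents diagnostics.

```python
import re

# Environment names taken from the list earlier in this thread.
CODE_ENVS = ("asy", "asydef", "luacode", "minted")


def code_env_line_ranges(text, envs=CODE_ENVS):
    """Return (start, end) 0-based line ranges covered by code environments.

    Simplification: \\begin and \\end are assumed to sit on different lines.
    """
    ranges = []
    current = None  # name of the environment we are inside, if any
    start = None
    for lineno, line in enumerate(text.splitlines()):
        if current is None:
            m = re.search(r"\\begin\{(\w+)\}", line)
            if m and m.group(1) in envs:
                current, start = m.group(1), lineno
        elif f"\\end{{{current}}}" in line:
            ranges.append((start, lineno))
            current = None
    return ranges


def filter_diagnostics(diagnostics, ranges):
    """Drop diagnostics whose 'line' falls inside any of the given ranges."""
    return [
        d for d in diagnostics
        if not any(start <= d["line"] <= end for start, end in ranges)
    ]
```

A caller would compute the ranges once per document version and pass every diagnostic list through `filter_diagnostics` before publishing it; the same ranges could be used to decide whether to offer completion at a cursor position.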

clason commented 3 years ago

That sounds reasonable. But wouldn't it be enough (for this example) to just temporarily treat % as a regular character and not a comment char if you are in one of the verbatim/code environments?

foodornt commented 3 years ago

Nothing new about this issue? It's pretty annoying when writing a tutorial about LaTeX.

pfoerster commented 3 years ago

> But wouldn't it be enough (for this example) to just temporarily treat % as a regular character and not a comment char if you are in one of the verbatim/code environments?

The problem is that the current lexer is very simple (auto-generated using the logos crate) and stateless so there is no easy way to change the comment character without replacing the lexer. However, I created #500, which should fix the issue. It basically prevents completion and hover support in verbatim environments for now. In the future, a better approach will be needed (likely, the tree-sitter grammar needs to be worked on and integrated).
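For contrast, the kind of state a stateless lexer cannot carry might look like the following Python sketch. It is not based on texlab or the logos crate; `strip_comments_stateful` is a hypothetical name, the environment list is the one proposed earlier in this thread, and the regexes are deliberately naive (for example, `\\%` is handled incorrectly).

```python
import re

CODE_ENVS = {"asy", "asydef", "luacode", "minted"}
BEGIN = re.compile(r"\\begin\{(\w+)\}")
COMMENT = re.compile(r"(?<!\\)%.*")  # an unescaped '%' up to the end of the line


def strip_comments_stateful(text: str) -> str:
    """Strip TeX comments, but leave lines inside code environments untouched."""
    out = []
    current = None  # the state: which code environment we are inside, if any
    for line in text.splitlines():
        if current is not None:
            out.append(line)  # '%' is an ordinary character here
            if f"\\end{{{current}}}" in line:
                current = None
            continue
        line = COMMENT.sub("", line)
        m = BEGIN.search(line)
        if m and m.group(1) in CODE_ENVS:
            current = m.group(1)
        out.append(line)
    return "\n".join(out)
```

The `current` variable is exactly the piece of context that has to persist between tokens, which is why this behaviour cannot simply be bolted onto a stateless token matcher.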

clason commented 3 years ago

That approach makes sense; I don't think anything is lost by skipping over these environments.

#500 does not seem to fix the issue for me, though?

pfoerster commented 3 years ago

> #500 does not seem to fix the issue for me, though?

Yeah, my bad. The diagnostics are still wrong for this case. Actually, the problem is much more difficult than I thought. Just not treating % as a comment character can also be wrong (consider string literals). I think language injections and tree-sitter seem to be the way to go. The current lexer/parser is a bit too limiting in this case, and I think that migrating to tree-sitter is less effort. I also have new ideas regarding the performance issues I had in the past.

clason commented 3 years ago

Yeah, language injection is hard... (But excited to hear about performance ideas for the tree-sitter parser; from what I've seen, people are using it heavily with Neovim!)

clason commented 2 years ago

With the new tree-sitter-based parser, is this now possible (excluding asy environments from being parsed as LaTeX)?

At least for highlighting, injecting a C parser for asy is working fine :)

pfoerster commented 2 years ago

> Now with the new tree-sitter based parser, is this now possible (excluding asy environments from being parsed as LaTeX)?

Yeah, this is definitely possible now (the contents can be parsed as a comment). However, the parser can be more volatile in some cases (like here https://github.com/latex-lsp/tree-sitter-latex/pull/27#issuecomment-1040604767) but I hope that I can sort this one out.

Another problem at the moment is the performance regression with the parser. While incremental reparsing is very fast, traversing the tree is very slow (slower than parsing the entire document from scratch with the current parser). My previous strategy was to limit the depth by not traversing the text nodes, which cut the time in half. However, due to the latest changes in https://github.com/latex-lsp/tree-sitter-latex/commit/1ea9f87d30df20e13cde292ff4d6c4d8dd979b16, this is not possible anymore, so I have to try out something new.
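The pruning strategy described above (not descending into text nodes) can be sketched in Python as follows. This uses a toy `Node` class rather than the tree-sitter API, and the node kinds (`document`, `text`, `environment`, `command`, `word`) are made up for illustration; they are not taken from the tree-sitter-latex grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy syntax-tree node; stands in for a real parser's node type."""
    kind: str
    children: list = field(default_factory=list)


def visit(node, skip_kinds=frozenset({"text"})):
    """Yield nodes depth-first, pruning whole subtrees whose kind is skipped."""
    if node.kind in skip_kinds:
        return  # do not yield the node and do not descend into it
    yield node
    for child in node.children:
        yield from visit(child, skip_kinds)


tree = Node("document", [
    Node("text", [Node("word"), Node("word")]),
    Node("environment", [Node("text", [Node("word")]), Node("command")]),
])
print([n.kind for n in visit(tree)])  # prints ['document', 'environment', 'command']
```

The saving comes from never touching the children of a skipped node, so it only helps as long as the interesting nodes cannot occur beneath a text node.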