Closed gdkrmr closed 2 weeks ago
Yep. I was also able to reproduce the issue in neovim.
I wasn't able to reproduce this in any other file, so I am not sure what exactly causes the crash.
The file is quite big, so I'd assume that treesitter takes too much memory when updating and the whole process gets killed.
when the file is truncated to only 758 lines ( removed all the "empty" functions in the file) i see that the parsing fails
Yes. parsing fails because "euler$"
is not a valid string literal in Julia. The problem here is that tree-sitter's error recovery isn't activated.
The crash is happening because tree-sitter-julia's external scanner is not checking the size of the data that it's writing during serialization:
It needs to check that the size doesn't exceed TREE_SITTER_SERIALIZATION_BUFFER_SIZE
.
I opened a Tree-sitter PR to make this kind of error more obvious, and to protect against it when possible (when the grammar is loaded via wasm).
I see that the check is missing, but that doesn't seem to solve the underlying issue.
When the scanner sees the $
, it'll return a STRING_CONTENT
token and return to the parser. Then the parser will fail to match string_interpolation
. So far so good.
What happens after that is less clear. If I understand correctly, the parser doesn't return control to the scanner to match _string_end
, and now all the subsequent strings will fail because the stack was not cleared properly.
Is that what's causing the overflow?
If so, I didn't consider this type of delimiter mismatch when writing the scanner, so I'm open to suggestions on how to improve this.
can something be done as a temporary fix so that neovim doesnt crash ?
its not just the $ that triggers this problem ,
a "
in the buffer before a triple quoted string also triggers this.
for example the following ,
a = "
function f()
1
end
"""
hello world
"""
and on large files it crashes
is simply memcopying only upto TREE_SITTER_SERIALIZATION_BUFFER_SIZE if size exceeds it a workable fix for it not crashing ? the parsing failing still happens , but atleast a temporary fix to prevent editor crashes .
this is what i tried ...
diff --git a/src/scanner.c b/src/scanner.c
index f81d6f4..84164a9 100644
--- a/src/scanner.c
+++ b/src/scanner.c
@@ -59,8 +59,14 @@ void tree_sitter_julia_external_scanner_destroy(void *payload) {
unsigned tree_sitter_julia_external_scanner_serialize(void *payload, char *buffer) {
Stack *stack = payload;
unsigned size = stack->size;
- memcpy(buffer, stack->contents, size);
- return size;
+ if (size > TREE_SITTER_SERIALIZATION_BUFFER_SIZE){
+ memcpy(buffer, stack->contents,TREE_SITTER_SERIALIZATION_BUFFER_SIZE);
+ return TREE_SITTER_SERIALIZATION_BUFFER_SIZE;
+ }
+ else{
+ memcpy(buffer, stack->contents, size);
+ return size;
+ }
}
void tree_sitter_julia_external_scanner_deserialize(void *payload, const char *buffer, unsigned size) {
I thought I'd already pushed a fix for this bug, but apparently I didn't 😬
Anyways, the fix is what @ArbitRandomUser suggested. Highlighting will fail, but the parser won't crash.
I have a
.jl
File that reliably crashes myemacs
withjulia-ts-mode
andhelix
. I strongly suspect that tree sitter is to blame here, if I usejulia-mode
(the one without tree sitter), it works fine. The file is [1], to reproduce the crash, go to line 27 and insert$
at the end of"euler"
, trying to write"euler$i"
, the crash happens when$
gets inserted.change
to
I wasn't able to reproduce this in any other file, so I am not sure what exactly causes the crash.
[1] https://github.com/gdkrmr/Euler.jl/blob/crashes_treesitter/src/Euler.jl