Closed kopecs closed 4 months ago
Hi @kopecs! Thanks for the report.
When you say "attempting to parse the file", how exactly are you doing that? When I run `tree-sitter parse` on a file with that content, I can reproduce the error, but I'm unable to reproduce the segfault. I'm also using tree-sitter 0.20.6 on an aarch64 Mac.
I'm also running `tree-sitter parse`. Running the following reproduces this for both myself and a coworker:
git clone git@github.com:camdencheek/tree-sitter-dockerfile.git \
&& cd tree-sitter-dockerfile \
&& printf 'LABEL A=$B' > Dockerfile \
&& tree-sitter generate \
&& tree-sitter parse Dockerfile
In case it's helpful, my C compiler is
Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: arm64-apple-darwin21.5.0
Thread model: posix
Oh, sure enough, a fresh clone reproduced this. Turns out my `main` branch wasn't up to date 🤦
The "doesn't handle newlines" part of this is almost definitely a bug in the Dockerfile grammar, but tree-sitter probably shouldn't be segfaulting. I opened an issue upstream with your reproduction steps, but I'm going to leave this issue open until I get around to fixing the newline handling causing the parse error.
Closing since the upstream issue has been closed and I can no longer reproduce this with the latest version of tree-sitter.
Attempting to parse the file, where the file contents do not contain a newline, results in a segfault. If I run with `-d`, I get the following trace:
```
new_parse
process version:0, version_count:1, state:1, row:0, col:0
lex_internal state:157, row:0, column:0
consume character:'L'
consume character:'A'
consume character:'B'
consume character:'E'
consume character:'L'
lexed_lookahead sym:LABEL, size:5
shift state:156
process version:0, version_count:1, state:156, row:0, col:5
lex_internal state:48, row:0, column:5
skip character:' '
consume character:'A'
lexed_lookahead sym:unquoted_string, size:2
shift state:234
process version:0, version_count:1, state:234, row:0, col:7
lex_internal state:157, row:0, column:7
consume character:'='
lexed_lookahead sym:=, size:1
shift state:15
process version:0, version_count:1, state:15, row:0, col:8
lex_internal state:25, row:0, column:8
consume character:'$'
lexed_lookahead sym:$, size:1
shift state:140
process version:0, version_count:1, state:140, row:0, col:9
lex_internal state:44, row:0, column:9
consume character:'B'
lexed_lookahead sym:variable, size:1
shift state:82
process version:0, version_count:1, state:82, row:0, col:10
lex_internal state:13, row:0, column:10
lex_internal state:0, row:0, column:10
lexed_lookahead sym:end, size:0
detect_error
resume version:0
recover_with_missing symbol: , state:5
recover_eof
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
process version:1, version_count:11, state:303, row:0, col:10
no_lookahead_after_non_terminal_extra
reduce sym:line_continuation, child_count:1
```
which seems to suggest an issue in error recovery.
On x86_64, I get a parse error instead of a segfault.
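Since the trigger appears to be the missing trailing newline, a quick way to check whether a given file is in that state might look like the sketch below. The `ends_with_newline` helper and the scratch file names are hypothetical and not part of tree-sitter; the check only looks at the last byte of the file and says nothing about whether the parser will actually crash.

```shell
#!/bin/sh
# Hypothetical helper (not part of tree-sitter): succeeds only if the
# file is non-empty and its last byte is a newline.
ends_with_newline() {
  # Command substitution strips a trailing newline, so an empty result
  # for a non-empty file means the last byte was '\n'.
  [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Recreate both variants of the input in a scratch directory
# (file names are illustrative only).
dir=$(mktemp -d)
printf 'LABEL A=$B'   > "$dir/no-newline.Dockerfile"
printf 'LABEL A=$B\n' > "$dir/newline.Dockerfile"

for f in "$dir"/*.Dockerfile; do
  if ends_with_newline "$f"; then
    echo "$(basename "$f"): ends with a newline"
  else
    echo "$(basename "$f"): no trailing newline"
  fi
done
```

Running `tree-sitter parse` on each of the two scratch files should then show whether the trailing newline is what separates the working case from the failing one.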
Environment:
This may be better suited as an issue in https://github.com/tree-sitter/tree-sitter/, so let me know if this should just be migrated there.