camdencheek / tree-sitter-dockerfile

A tree-sitter grammar for Dockerfile
MIT License

Segfault when file doesn't end in newline #22

Closed kopecs closed 4 months ago

kopecs commented 2 years ago

Attempting to parse the file

LABEL A=$B

where the file does not end with a newline results in a segfault. If I run tree-sitter parse with -d, I get the following:

trace

```
new_parse
process version:0, version_count:1, state:1, row:0, col:0
lex_internal state:157, row:0, column:0
consume character:'L'
consume character:'A'
consume character:'B'
consume character:'E'
consume character:'L'
lexed_lookahead sym:LABEL, size:5
shift state:156
process version:0, version_count:1, state:156, row:0, col:5
lex_internal state:48, row:0, column:5
skip character:' '
consume character:'A'
lexed_lookahead sym:unquoted_string, size:2
shift state:234
process version:0, version_count:1, state:234, row:0, col:7
lex_internal state:157, row:0, column:7
consume character:'='
lexed_lookahead sym:=, size:1
shift state:15
process version:0, version_count:1, state:15, row:0, col:8
lex_internal state:25, row:0, column:8
consume character:'$'
lexed_lookahead sym:$, size:1
shift state:140
process version:0, version_count:1, state:140, row:0, col:9
lex_internal state:44, row:0, column:9
consume character:'B'
lexed_lookahead sym:variable, size:1
shift state:82
process version:0, version_count:1, state:82, row:0, col:10
lex_internal state:13, row:0, column:10
lex_internal state:0, row:0, column:10
lexed_lookahead sym:end, size:0
detect_error
resume version:0
recover_with_missing symbol: , state:5
recover_eof
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
select_smaller_error symbol:ERROR, over_symbol:ERROR
process version:1, version_count:11, state:303, row:0, col:10
no_lookahead_after_non_terminal_extra
reduce sym:line_continuation, child_count:1
```

which seems to suggest an issue in error recovery.

On x86_64, I get a parse error instead of a segfault.

Environment:


This is maybe better suited as an issue in https://github.com/tree-sitter/tree-sitter/, so let me know if this should just be migrated there.

camdencheek commented 2 years ago

Hi @kopecs! Thanks for the report.

When you say "attempting to parse the file", how exactly are you doing that? When I run tree-sitter parse on a file with that content, I can reproduce the parse error, but I'm unable to reproduce the segfault. I'm also using tree-sitter 0.20.6 on an aarch64 Mac.

kopecs commented 2 years ago

I'm also running tree-sitter parse. Running the following reproduces this for both myself and a coworker:

git clone git@github.com:camdencheek/tree-sitter-dockerfile.git \
    && cd tree-sitter-dockerfile \
    && printf 'LABEL A=$B' > Dockerfile \
    && tree-sitter generate \
    && tree-sitter parse Dockerfile
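
Since the bug depends on the missing trailing newline, it may be worth confirming that the file really ends without one before parsing. A minimal sketch using only standard POSIX tools (it does not invoke tree-sitter, and the echoed messages are illustrative):

```shell
# Write the one-line Dockerfile from the report; printf adds no trailing newline.
printf 'LABEL A=$B' > Dockerfile

# Command substitution strips a trailing newline, so a non-empty result
# means the last byte of the file is something other than '\n'.
if [ -n "$(tail -c 1 Dockerfile)" ]; then
    echo "no trailing newline"
else
    echo "ends with newline"
fi
```

The file should be exactly 10 bytes, which lines up with the col:10 position where the trace above hits end of input.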

In case it's helpful, my C compiler is

Apple clang version 13.1.6 (clang-1316.0.21.2.5)
Target: arm64-apple-darwin21.5.0
Thread model: posix

camdencheek commented 2 years ago

Oh, sure enough, a fresh clone reproduced this. Turns out my main branch wasn't up to date 🤦

The "doesn't handle newlines" part of this is almost certainly a bug in the Dockerfile grammar, but tree-sitter probably shouldn't be segfaulting. I opened an issue upstream with your reproduction steps, but I'm going to leave this issue open until I get around to fixing the newline handling that causes the parse error.
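
Until the grammar handles this case, one possible workaround is to make sure the input ends with a newline before handing it to the parser. A rough sketch, assuming a POSIX shell; ensure_trailing_newline is a hypothetical helper, not something provided by tree-sitter or this grammar:

```shell
# Hypothetical helper (not part of tree-sitter or this grammar):
# append a newline to a file only when its last byte is not already one.
ensure_trailing_newline() {
    file=$1
    # An empty file has nothing to terminate; leave it alone.
    [ -s "$file" ] || return 0
    # Command substitution strips a trailing newline, so non-empty output
    # means the file ends in some other byte.
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

printf 'LABEL A=$B' > Dockerfile
ensure_trailing_newline Dockerfile
ensure_trailing_newline Dockerfile   # second call is a no-op
```

This modifies the file in place, so it only sidesteps the problem on the caller's side; it does not address the grammar bug or the upstream crash.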

camdencheek commented 4 months ago

Closing since the upstream issue has been closed and I can no longer reproduce this with the latest version of tree-sitter.