tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License

bug: Parsing causes stack overflow #221

Closed by kmod-midori 3 weeks ago

kmod-midori commented 3 weeks ago

Did you check existing issues?

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.22.6

Describe the bug

Parsing this file: https://github.com/AcademySoftwareFoundation/Imath/blob/main/src/Imath/toFloat.h causes a stack overflow with the Rust binding and produces an extremely large log with the CLI.

Steps To Reproduce/Bad Parse Tree

git clone https://github.com/tree-sitter/tree-sitter-c
cd tree-sitter-c
tree-sitter parse ../Imath/src/Imath/toFloat.h > 1.log

killall -9 tree-sitter

ls -lah 1.log
-rw-rw-r-- 1 user user 1.2G Aug 16 09:45 1.log
# head -n 100 1.log
(translation_unit [0, 0] - [16398, 0]
  (comment [0, 0] - [0, 2])
  (comment [1, 0] - [1, 40])
  (comment [2, 0] - [2, 49])
  (comment [3, 0] - [3, 2])
  (comment [5, 0] - [5, 2])
  (comment [6, 0] - [6, 43])
  (comment [7, 0] - [7, 15])
  (comment [8, 0] - [8, 2])
  (comment [10, 0] - [10, 19])
  (ERROR [11, 0] - [16396, 2]
    (expression_statement [12, 5] - [16396, 2]
      (comma_expression [12, 5] - [16395, 57]
        left: (number_literal [12, 5] - [12, 15])
        (ERROR [12, 15] - [12, 16])
        (ERROR [12, 18] - [12, 19])
        right: (comma_expression [12, 19] - [16395, 57]
          left: (number_literal [12, 19] - [12, 29])
          (ERROR [12, 29] - [12, 30])
          (ERROR [12, 32] - [12, 33])
          right: (comma_expression [12, 33] - [16395, 57]
            left: (number_literal [12, 33] - [12, 43])
            (ERROR [12, 43] - [12, 44])
            (ERROR [12, 46] - [12, 47])
            right: (comma_expression [12, 47] - [16395, 57]
              left: (number_literal [12, 47] - [12, 57])
              (ERROR [12, 57] - [12, 58])
              (ERROR [13, 4] - [13, 5])
              right: (comma_expression [13, 5] - [16395, 57]
                left: (number_literal [13, 5] - [13, 15])
                (ERROR [13, 15] - [13, 16])
                (ERROR [13, 18] - [13, 19])
                right: (comma_expression [13, 19] - [16395, 57]
                  left: (number_literal [13, 19] - [13, 29])
                  (ERROR [13, 29] - [13, 30])
                  (ERROR [13, 32] - [13, 33])
                  right: (comma_expression [13, 33] - [16395, 57]
                    left: (number_literal [13, 33] - [13, 43])
                    (ERROR [13, 43] - [13, 44])
                    (ERROR [13, 46] - [13, 47])
                    right: (comma_expression [13, 47] - [16395, 57]
                      left: (number_literal [13, 47] - [13, 57])
                      (ERROR [13, 57] - [13, 58])
                      (ERROR [14, 4] - [14, 5])
                      right: (comma_expression [14, 5] - [16395, 57]
                        left: (number_literal [14, 5] - [14, 15])
                        (ERROR [14, 15] - [14, 16])
                        (ERROR [14, 18] - [14, 19])
                        right: (comma_expression [14, 19] - [16395, 57]
                          left: (number_literal [14, 19] - [14, 29])
                          (ERROR [14, 29] - [14, 30])
                          (ERROR [14, 32] - [14, 33])
                          right: (comma_expression [14, 33] - [16395, 57]
                            left: (number_literal [14, 33] - [14, 43])
                            (ERROR [14, 43] - [14, 44])
                            (ERROR [14, 46] - [14, 47])
                            right: (comma_expression [14, 47] - [16395, 57]
                              left: (number_literal [14, 47] - [14, 57])
                              (ERROR [14, 57] - [14, 58])
                              (ERROR [15, 4] - [15, 5])
                              right: (comma_expression [15, 5] - [16395, 57]
                                left: (number_literal [15, 5] - [15, 15])
                                (ERROR [15, 15] - [15, 16])
                                (ERROR [15, 18] - [15, 19])
                                right: (comma_expression [15, 19] - [16395, 57]
                                  left: (number_literal [15, 19] - [15, 29])
                                  (ERROR [15, 29] - [15, 30])
                                  (ERROR [15, 32] - [15, 33])
                                  right: (comma_expression [15, 33] - [16395, 57]
                                    left: (number_literal [15, 33] - [15, 43])
                                    (ERROR [15, 43] - [15, 44])
                                    (ERROR [15, 46] - [15, 47])
                                    right: (comma_expression [15, 47] - [16395, 57]
                                      left: (number_literal [15, 47] - [15, 57])
                                      (ERROR [15, 57] - [15, 58])
                                      (ERROR [16, 4] - [16, 5])
                                      right: (comma_expression [16, 5] - [16395, 57]
                                        left: (number_literal [16, 5] - [16, 15])
                                        (ERROR [16, 15] - [16, 16])
                                        (ERROR [16, 18] - [16, 19])
                                        right: (comma_expression [16, 19] - [16395, 57]
                                          left: (number_literal [16, 19] - [16, 29])
                                          (ERROR [16, 29] - [16, 30])
                                          (ERROR [16, 32] - [16, 33])
                                          right: (comma_expression [16, 33] - [16395, 57]
                                            left: (number_literal [16, 33] - [16, 43])
                                            (ERROR [16, 43] - [16, 44])
                                            (ERROR [16, 46] - [16, 47])
                                            right: (comma_expression [16, 47] - [16395, 57]
                                              left: (number_literal [16, 47] - [16, 57])
                                              (ERROR [16, 57] - [16, 58])
                                              (ERROR [17, 4] - [17, 5])
                                              right: (comma_expression [17, 5] - [16395, 57]
                                                left: (number_literal [17, 5] - [17, 15])
                                                (ERROR [17, 15] - [17, 16])
                                                (ERROR [17, 18] - [17, 19])
                                                right: (comma_expression [17, 19] - [16395, 57]
                                                  left: (number_literal [17, 19] - [17, 29])
                                                  (ERROR [17, 29] - [17, 30])
                                                  (ERROR [17, 32] - [17, 33])

Expected Behavior/Parse Tree

The parser should at least produce a flat AST here, rather than one whose nesting grows by a level with every element.
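To make "flat" concrete, here is a minimal Python sketch of the difference. It does not use tree-sitter: the tuple node shape (`("comma_expression", left, right)` and `("number_literal", text)`) and the helper names `build_right_nested` and `flatten_commas` are hypothetical, chosen only to mirror the `left:`/`right:` structure in the log above. The count of 65536 literals is also an assumption, taken from the log showing four literals per line between rows 12 and 16395.

```python
# Hypothetical node shape, not tree-sitter's API:
#   ("comma_expression", left, right)  or  ("number_literal", text)

def build_right_nested(literals):
    """Build the right-nested chain seen in the log, using a loop (no recursion)."""
    node = ("number_literal", literals[-1])
    for text in reversed(literals[:-1]):
        node = ("comma_expression", ("number_literal", text), node)
    return node

def flatten_commas(node):
    """Follow the `right` spine in a loop and collect the operands in source order."""
    items = []
    while node[0] == "comma_expression":
        items.append(node[1])   # the `left` operand
        node = node[2]          # descend into `right`
    items.append(node)          # the final, non-comma operand
    return items

# Assumed size: 4 literals per line over 16384 lines, as the log suggests.
literals = [f"0x{i:08x}" for i in range(65536)]
tree = build_right_nested(literals)
flat = flatten_commas(tree)
print(len(flat))  # 65536
```

The nested form is one level deeper per element, while the flat list has constant depth no matter how many elements the initializer holds.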

Repro

No response

amaanq commented 3 weeks ago

The file is not valid C code on its own, so error recovery kicks in and attempts to parse it as nested comma expressions (admittedly not a great recovery).
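A tree recovered this way is as deep as the number of elements, so any consumer that walks it recursively can run out of stack. The Python sketch below illustrates that general point only; it is not the Rust binding's code, and this thread does not establish where the reported overflow actually occurs. The dict-based node shape and the function names are hypothetical, and Python's `RecursionError` stands in for a native stack overflow.

```python
def build_chain(n):
    """Right-nested chain of n levels of hypothetical nodes, built in a loop."""
    node = {"kind": "number_literal", "children": []}
    for _ in range(n - 1):
        left = {"kind": "number_literal", "children": []}
        node = {"kind": "comma_expression", "children": [left, node]}
    return node

def depth_recursive(node):
    """Naive walk: one call frame per tree level."""
    best = 0
    for child in node["children"]:
        best = max(best, depth_recursive(child))
    return best + 1

def depth_iterative(root):
    """Same result using an explicit stack on the heap instead of call frames."""
    deepest = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in node["children"]:
            stack.append((child, depth + 1))
    return deepest

tree = build_chain(65536)  # assumed size, roughly matching the log's line range
print(depth_iterative(tree))
try:
    depth_recursive(tree)
except RecursionError:
    print("recursive walk exceeded the interpreter's recursion limit")
```

Walking with an explicit stack (or a cursor that tracks its own position) keeps memory use on the heap, so depth is bounded by available memory rather than by the call stack.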