tree-sitter / tree-sitter-haskell

Haskell grammar for tree-sitter.
MIT License
157 stars 36 forks source link

remove code duplication #9

Closed clojj closed 6 years ago

clojj commented 6 years ago

all tests are green

clojj commented 6 years ago

what about 'parser.c'... isn't it generated and should be in .gitignore ? (maybe you can resolve this by ignoring my version of parser.c ?)

clojj commented 6 years ago

didn't use any C in a while ;) for (const char c = match; c; c++) { ...

...nice optimization

clojj commented 6 years ago

not exactly less code now... but fewer duplications

maxbrunsfeld commented 6 years ago

@clojj Even though parser.c is generated, we check it into the repository for convenience. It allows people to use these parsers from any language without having to fetch them from npmjs.com or generating them using tree-sitter-cli.

I'm surprised you're getting merge conflicts - why do you have changes to parser.c? I thought you were just refactoring scanner.cc.

clojj commented 6 years ago

I just rebuilt everything for testing.. did not modify the grammar. Okay.. so I do know about parser.c now

Better create a new PR ?

maxbrunsfeld commented 6 years ago

No, let's just keep the discussion on this PR. No need to create a new one.

clojj commented 6 years ago

so yes, the only change is in scanner.cc I'm myself not yet very happy about this one advance_matching_chars_ends_not_by()

...but the others improve on in making the scanner's logic more comprehensible I think.

rewinfrey commented 6 years ago

@clojj thanks for taking a pass at this! The external scanner can be improved, and you've identified one way that can help it.

Over the past week I've had to make changes to the external scanner as I've added support for nested let expressions, and that prompted larger changes to the scanner. I think those changes (which are in part inspired by your work here), unfortunately subsume this effort. I opted for a simpler abstraction called isolated_sequence in the scanner that only checks for the case when a sequence of characters is followed by a whitespace (or end of file).

Thank you for your efforts here! I'm going to close this without merge as the scanner today has diverged enough, and I think the lighter weight single abstraction is all we need at this time.