tree-sitter / tree-sitter-haskell

Haskell grammar for tree-sitter.
MIT License
155 stars 36 forks source link

Make scanner state global. #67

Closed 414owen closed 2 years ago

414owen commented 2 years ago

You know what Haskellers love? Global mutable state.

@tek ready to merge

maxbrunsfeld commented 2 years ago

The scanner state should really all be created in tree_sitter_haskell_external_scanner_create and destroyed in tree_sitter_haskell_external_scanner_destroy. Otherwise, there will be race conditions if this parser is used on multiple threads simultaneously. Am I missing something?

414owen commented 2 years ago

Ah, I think that there are two states, there's the state that's created in tree_sitter_haskell_external_scanner_create, as you say, and is destroyed in tree_sitter_haskell_external_scanner_destroy, which is an indent_vec.

The state I made global is:

typedef struct {
  TSLexer *lexer;
  const bool *symbols;
  indent_vec *indents;
} State;

Which we (now) set globally in tree_sitter_haskell_external_scanner_scan, to avoid passing the scan() parameters around.

I think this should be okay, right?

maxbrunsfeld commented 2 years ago

I don't think any shared mutable state is ok, if we want the scanner to be capable of working in parallel on multiple threads. And I really would like to have that capability. All of the other Tree-sitter parsers are capable of that.

maxbrunsfeld commented 2 years ago

I think if it's about convenience of not having to pass parameters, I'd rather that we use C++98 (or c++03) and store some of these things as fields on the scanner object.

414owen commented 2 years ago

Huh, I really hadn't considered this use case to be honest.

C11 supports the _Thread_local keyword. I could easily just mark it as _Thread_local. C11 is supported by gcc 4.6 and clang 3.1. Is that portable enough?

Otherwise I'm happy for this PR to be closed :)

maxbrunsfeld commented 2 years ago

My preference would be to just use normal object fields and function parameters rather than relying on _Thread_local; it's just less "exotic", and I don't think thread locals are needed for this purpose. Feel free to push back on that though.

wenkokke commented 2 years ago

Huh, I really hadn't considered this use case to be honest.

C11 supports the _Thread_local keyword. I could easily just mark it as _Thread_local. C11 is supported by gcc 4.6 and clang 3.1. Is that portable enough?

Otherwise I'm happy for this PR to be closed :)

Wouldn't work on macOS, since that needs at most C++03.