goccmack / gocc

Parser / Scanner Generator

Feature/user context #116

Closed kfsone closed 3 years ago

kfsone commented 3 years ago

This change aims to let users carry stateful context information through the parser into the SDT via a new '$Context' token, and into tokens through the lexer via a new "Context" property on Pos.

In the simplest case, I modified "NewLexerFile" to use the lexer context to record which file a token originated from, which allows for easy File:Line:Column representation of errors (see https://github.com/goccmack/gocc/pull/115).

Many use cases can probably make do with the token Context, but more complex cases, such as languages that want to traverse multiple imports or DSLs, may need a separate context for the parser.

Idiomatically, the lexical context should be immutable: a frozen snapshot of stateful/contextual information from when the token was generated, while the parser context is dynamic during the parse phase.

type Lexical struct {
  ImportPath []string
}

type Parser struct {
  ImportPath []string
}

func main() {
  symbols, parseContext := parse("filename.txt")
  // symbols[n].ImportPath == ["A", "B", "C", "D"]
  // parseContext.ImportPath == []
}
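
To make the "frozen snapshot" idea concrete, here is a minimal, self-contained sketch. It is not code from this PR: the names LexContext, ParseContext, Snapshot, Push and Pop are hypothetical and exist only for illustration. It assumes the lexical context is produced by copying the parser's current state when a token is emitted.

```go
package main

import "fmt"

// LexContext is a hypothetical per-token context: a frozen copy of the
// import path at the moment the token was produced.
type LexContext struct {
	ImportPath []string
}

// ParseContext is a hypothetical mutable context owned by the parser.
type ParseContext struct {
	ImportPath []string
}

// Snapshot copies the current import path so that later pushes and pops on
// the parser context cannot change what an already-emitted token recorded.
func (p *ParseContext) Snapshot() *LexContext {
	frozen := make([]string, len(p.ImportPath))
	copy(frozen, p.ImportPath)
	return &LexContext{ImportPath: frozen}
}

// Push records that the parser entered an imported file.
func (p *ParseContext) Push(name string) {
	p.ImportPath = append(p.ImportPath, name)
}

// Pop records that the parser left the most recently entered file.
func (p *ParseContext) Pop() {
	p.ImportPath = p.ImportPath[:len(p.ImportPath)-1]
}

func main() {
	pc := &ParseContext{}
	for _, f := range []string{"A", "B", "C", "D"} {
		pc.Push(f)
	}
	tokCtx := pc.Snapshot() // taken while inside A -> B -> C -> D

	// Unwind the imports: the parser context empties out again.
	for len(pc.ImportPath) > 0 {
		pc.Pop()
	}

	fmt.Println(tokCtx.ImportPath, len(pc.ImportPath)) // prints: [A B C D] 0
}
```

The copy in Snapshot is the important part: sharing the slice would let a later Push overwrite entries that an earlier token still refers to.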

kfsone commented 3 years ago

@awalterschulze this replaces https://github.com/goccmack/gocc/pull/112

kfsone commented 3 years ago

Combined with https://github.com/goccmack/gocc/pull/115, it becomes possible to automatically include filenames for NewLexerFile parses, or any scenario where the tokens have a context that implements the SourceContext interface.
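
As a rough illustration of the mechanism (a sketch only, not the code in either PR), a position can check whether its context exposes a source name and prepend it when formatting. The Sourcer interface, FileContext type and Pos fields below are assumed names chosen for the example; the actual identifiers in gocc may differ.

```go
package main

import "fmt"

// Sourcer is a hypothetical interface: any context that can name its source.
type Sourcer interface {
	Source() string
}

// FileContext is a hypothetical context that carries a file path.
type FileContext struct {
	Filepath string
}

// Source returns the file path, satisfying Sourcer.
func (f *FileContext) Source() string { return f.Filepath }

// Pos mimics a token position with a user-supplied context attached.
// The field names are assumptions for this sketch.
type Pos struct {
	Line, Column int
	Context      interface{}
}

// String renders "file:line:column" when the context implements Sourcer,
// and falls back to "line:column" otherwise (including a nil context).
func (p Pos) String() string {
	if s, ok := p.Context.(Sourcer); ok {
		return fmt.Sprintf("%s:%d:%d", s.Source(), p.Line, p.Column)
	}
	return fmt.Sprintf("%d:%d", p.Line, p.Column)
}

func main() {
	p := Pos{Line: 16, Column: 5}
	fmt.Println(p) // prints: 16:5

	p.Context = &FileContext{Filepath: "input.txt"}
	fmt.Println(p) // prints: input.txt:16:5
}
```

Using a type assertion keeps the context optional: tokens without a context, or with a context that has no Source method, format exactly as before.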

kfsone commented 3 years ago

Also noting that I'm not the first to want somewhere to keep my own pointer: https://github.com/goccmack/gocc/commit/ebce4330a49a5d55906838a9a0e3cbc3eb0a16b1#diff-7f64cc64f92c6ef88fe70af2757c0b7e1f88008e6f5775f25afcd6076c3065f7R152-R155 (give it a minute to load)

kfsone commented 3 years ago

rebased

kfsone commented 3 years ago

@awalterschulze A side effect of this change is that it brings together the $T0 and Human Error changes, so gocc now actually reports File:Line:Column (FLC) when a Sourcer() context is used. That happens when you use NewLexerFile, or any Context object that has a 'Source()' method. I added a test for this to example/errormsg:

func TestErrors_ErrorConext(t *testing.T) {
    // Create an error with a token that has a SourceContext property, and
    // verify the filename is injected into the error message.
    err := &errors.Error{ErrorToken: mockToken(111, "moyles", 16, 5), ExpectedTokens: []string{"ant", "dec"}}
    assertEqual(t, `16:5: error: expected either ant or dec; got: "moyles"`, err.Error())

    err.ErrorToken.Context = &lexer.SourceContext{Filepath: "/addicted/to/plaice.lyrics"}
    assertEqual(t, `/addicted/to/plaice.lyrics:16:5: error: expected either ant or dec; got: "moyles"`, err.Error())
}

awalterschulze commented 3 years ago

I guess I can get behind making FilePath a default field of type *string in Lexer: the NewLexer function would just set it to nil, and when fpath is provided we can set it to that string.

As for adding the new Context type that the user can manipulate, I am still unsure, and I am deferring to goccmack on that one.

goccmack commented 3 years ago

@kfsone Thanks for the contribution.