Closed danderson closed 3 years ago
Code for my lexer is at https://github.com/danderson/go-kdl . Please excuse the state of the code, I've been getting it to lex correctly before tidying it up. go test -v -run=TestConformance/testdata/valid/unusual_chars_in_bare_id.kdl .
reproduces the lexing failure I listed above.
oh yeah, that test looks wrong. Would you mind updating it to exclude characters based on the actual spec? The identifier grammar changed a lot towards the end, and the tests and examples were clearly not updated for it.
(To be clear: /
is NOT a valid identifier char)
Thanks for the confirmation, I'll send a PR to adjust the tests.
While implementing a Go lexer for KDL, I hit a test failure that I think is an inconsistency between the spec and the test input https://github.com/kdl-org/kdl/blob/main/tests/test_cases/input/unusual_chars_in_bare_id.kdl .
The input:
The expected result is a valid bare identifier followed by a string, as indicated in https://github.com/kdl-org/kdl/blob/main/tests/test_cases/expected_kdl/unusual_chars_in_bare_id.kdl . What my lexer gets is:
This difference occurs because I lex bare identifiers according to https://github.com/kdl-org/kdl/blob/main/SPEC.md#identifier , which states that non-identifier characters cannot appear anywhere in a bare identifier, and https://github.com/kdl-org/kdl/blob/main/SPEC.md#non-identifier-characters says that
/
is a non-identifier character. Thus, my lexer emits an identifier when it sees the/
(since the identifier is clearly over), then tries to continue lexing from there as a comment and fails because/?
isn't a known comment form.(ignore for a moment that this set of lexer tokens isn't valid KDL, my parser would have picked that up, the lexer is less clever :) )
Am I misreading the spec, or does this test input indeed contain an invalid identifier?