JuliaLang / JuliaSyntax.jl

The Julia compiler frontend

`bump_glue` `next_byte` disregard `num_tokens` #331

Closed. sunxd3 closed this issue 1 year ago

sunxd3 commented 1 year ago

MWE:

julia> ps = ParseStream("a + b - c * d"); peek(ps);

julia> bump_glue(ps, K"Identifier", EMPTY_FLAGS, 4)
JuliaSyntax.ParseStreamPosition(0x00000002, 0x00000000)

julia> ps.tokens[end] |> dump
JuliaSyntax.SyntaxToken
  head: JuliaSyntax.SyntaxHead
    kind: JuliaSyntax.Kind K"Identifier"
    flags: UInt16 0x0000
  orig_kind: JuliaSyntax.Kind K"Identifier"
  preceding_whitespace: Bool false
  next_byte: UInt32 0x00000003

If I understand correctly, I would expect next_byte to be 0x00000006, since

julia> JuliaSyntax.peek_token(ps)
Identifier      |6

Maybe the line at https://github.com/JuliaLang/JuliaSyntax.jl/blob/fc572f95c250f802c6721f90061125b05422965e/src/parse_stream.jl#L738 should be

                                     stream.lookahead[i+num_tokens].next_byte))

and maybe add bounds checking?

This affects build_tree, as seen at https://github.com/JuliaLang/JuliaSyntax.jl/blob/fc572f95c250f802c6721f90061125b05422965e/src/parse_stream.jl#L1075
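
To make the report concrete, here is a minimal toy model of the behaviour in plain Julia. Everything in it (ToyToken, toy_lookahead, glue_fixed_offset, glue_counted) is hypothetical and is not JuliaSyntax's actual code. It assumes that next_byte means "index of the first byte after the token", that whitespace tokens sit in the lookahead buffer, and that every token in the MWE string is one byte long.

```julia
# Toy model only: hypothetical names, not JuliaSyntax internals.
struct ToyToken
    kind::Symbol
    next_byte::Int  # assumed meaning: index of the first byte after this token
end

toy_kind(c::Char) = isspace(c) ? :Whitespace : isletter(c) ? :Identifier : :Operator

# Every token in the MWE string is a single ASCII character, so the token
# starting at byte i ends just before byte i + 1.
toy_lookahead(src::AbstractString) =
    [ToyToken(toy_kind(c), i + 1) for (i, c) in enumerate(src)]

# Mirrors the reported behaviour: the end position is read at a fixed
# offset, so num_tokens has no influence on the result.
glue_fixed_offset(lookahead, i, num_tokens) =
    ToyToken(:Identifier, lookahead[i + 1].next_byte)

# Reads the end position from the last glued token, with a bounds check.
function glue_counted(lookahead, i, num_tokens)
    last_index = i + num_tokens - 1
    checkbounds(lookahead, last_index)
    return ToyToken(:Identifier, lookahead[last_index].next_byte)
end

lookahead = toy_lookahead("a + b - c * d")
buggy = glue_fixed_offset(lookahead, 1, 4)
counted = glue_counted(lookahead, 1, 4)

println(buggy.next_byte)    # 3, the same value as in the dump above
println(counted.next_byte)  # 5 under this toy model's conventions
```

With num_tokens == 2 the two functions agree, which would explain why the discrepancy never shows up in normal parsing. Whether the real fix should index at i+num_tokens or i+num_tokens-1 depends on how the real lookahead buffer is laid out, which this sketch does not verify.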

c42f commented 1 year ago

bump_glue is a bit of an unfortunate anomaly that I wish I could remove; it only needs to exist to resolve a lexing ambiguity with signed numbers:

Anyway, the easiest way forward here seems to be to remove the num_tokens parameter entirely, as we only ever use bump_glue with num_tokens == 2.
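
As an illustration of the signed-number ambiguity mentioned above, the following sketch uses only Base Julia's Meta.parse. Which exact case the original comment had in mind is an assumption; this is one plausible reading, where a sign and a number (two tokens) sometimes have to be glued into a single literal and sometimes must stay separate.

```julia
# Base Julia only. The sign and the digits are lexed as two tokens, so the
# parser has to decide whether they form one literal.
lit = Meta.parse("-2")    # sign glued to the number: a plain integer literal
pow = Meta.parse("-2^2")  # sign kept separate: unary minus applied to 2^2

println(lit isa Integer)                                           # true
println(pow isa Expr && pow.head == :call && pow.args[1] == :-)    # true
println(eval(pow))                                                 # -4
```

Both inputs start with the same two tokens, and only what follows decides the outcome, which is consistent with bump_glue only ever being needed for exactly two tokens.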

sunxd3 commented 1 year ago

That sounds reasonable to me. PR #338