lucid-crystal / compiler

String Interpolation #29

Open devnote-dev opened 13 hours ago

devnote-dev commented 13 hours ago

This is somewhat complex to implement because every approach needs some state to track string parts. The lexer already uses as little state as possible, so I'd like to keep string state minimal. I can see two ways to implement this:

  1. Lex the string parts individually, then piece them together in the parser
  2. Lex the string parts in a single method/stream, then return the lexed string parts as an array of tokens

This would also directly affect how we handle heredoc lexing/parsing. Both approaches have benefits; option 2 in particular would make lexing heredocs easier, but it is more complicated to implement. I'm also open to other ideas for handling this.

nobodywasishere commented 13 hours ago

This was the approach I was thinking about testing with larimar (similar to approach 1):

"hello #{there} my #{name}"

- `"hello #{` - INTERP_STRING_START
- `there`     - IDENT
- `} my #{`   - INTERP_STRING_PART
- `name`      - IDENT
- `}"`        - INTERP_STRING_END
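To make the token split above concrete, here is a minimal sketch of a lexer that emits it. Everything in it is hypothetical (`Kind`, `Token`, `Lexer`, `lex_string_part` are not taken from lucid or larimar), and it only understands double-quoted strings, identifiers, braces and whitespace. The one piece of state it keeps is a stack with one brace counter per open interpolation, which is an assumption about how a lexer could tell a `}` that resumes the string from a `}` that belongs to code:

```crystal
# Hypothetical token kinds, named after the listing above.
enum Kind
  PlainString
  InterpStringStart
  InterpStringPart
  InterpStringEnd
  Ident
  LBrace
  RBrace
end

record Token, kind : Kind, text : String

class Lexer
  @chars : Array(Char)

  def initialize(source : String)
    @chars = source.chars
    @pos = 0
    # One entry per open interpolation: unmatched `{` seen inside it so far.
    @depths = [] of Int32
  end

  def tokens : Array(Token)
    result = [] of Token
    while @pos < @chars.size
      char = @chars[@pos]
      if char == '"'
        result << lex_string_part(opening: true)
      elsif char == '}' && @depths.last? == 0
        # This `}` closes the innermost interpolation, so resume the string.
        @depths.pop
        result << lex_string_part(opening: false)
      elsif char == '{' || char == '}'
        # A brace that belongs to code; count it only inside an interpolation.
        @depths[-1] += (char == '{' ? 1 : -1) unless @depths.empty?
        result << Token.new(char == '{' ? Kind::LBrace : Kind::RBrace, char.to_s)
        @pos += 1
      elsif char.letter? || char == '_'
        start = @pos
        while (c = @chars[@pos]?) && (c.alphanumeric? || c == '_')
          @pos += 1
        end
        result << Token.new(Kind::Ident, text_from(start))
      elsif char.whitespace?
        @pos += 1
      else
        raise "unexpected character #{char.inspect}"
      end
    end
    result
  end

  # Lexes from an opening `"` (or a closing `}`) up to the next `#{` or `"`.
  private def lex_string_part(opening : Bool) : Token
    start = @pos
    @pos += 1 # skip the `"` or `}` that begins this part
    while @pos < @chars.size
      if @chars[@pos] == '\\'
        @pos += 2 # skip an escaped character
      elsif @chars[@pos] == '"'
        @pos += 1
        kind = opening ? Kind::PlainString : Kind::InterpStringEnd
        return Token.new(kind, text_from(start))
      elsif @chars[@pos] == '#' && @chars[@pos + 1]? == '{'
        @pos += 2
        @depths << 0
        kind = opening ? Kind::InterpStringStart : Kind::InterpStringPart
        return Token.new(kind, text_from(start))
      else
        @pos += 1
      end
    end
    raise "unterminated string"
  end

  private def text_from(start : Int32) : String
    @chars[start...@pos].join
  end
end

source = %q("hello #{there} my #{name}")
Lexer.new(source).tokens.each do |token|
  puts "#{token.kind}: #{token.text}"
end
```

A nested string inside an interpolation simply pushes a second counter, so the inner `}"` and the outer `}"` each come out as their own INTERP_STRING_END token.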

And similar for heredocs:

<<-TEXT
some text here
#{interpolation}
other text here
TEXT

- `<<-TEXT`              - HEREDOC_START
- `\nsome text here\n#{` - HEREDOC_BODY_PART
- `interpolation`        - IDENT
- `}\nother text here`   - HEREDOC_BODY_PART
- `TEXT`                 - HEREDOC_BODY_END

devnote-dev commented 13 hours ago

That's great! My only worry with this approach is nested interpolation. At the lexer level this is fine if we're not using state; in the parser, however, nested interpolation could be swallowed. Then again, I haven't tested this, so it could be an edge case that the interpolation logic rules out...

nobodywasishere commented 12 hours ago

Nested interpolation should be fine; the parsing logic would probably be something like:

start = consume(:string_interp_start)

expressions = Array(Node).new
expressions << start

loop do
  # If a string_interp_start occurs inside this expression, it is parsed
  # recursively, so the nested string consumes its own part/end tokens.
  expressions << parse_op_assign

  if current_token.kind.string_interp_part?
    expressions << consume(:string_interp_part)
  else
    expressions << consume(:string_interp_end)
    break
  end
end

StringInterp.new(expressions)
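
For anyone who wants to try the nesting case, the loop above can be fleshed out into a small self-contained sketch. The node classes, `Parser`, `parse_expression` (standing in for `parse_op_assign`) and `consume` here are all hypothetical stand-ins rather than the real compiler's API, and the token stream is written out by hand instead of coming from a lexer:

```crystal
enum Kind
  InterpStringStart
  InterpStringPart
  InterpStringEnd
  Ident
end

record Token, kind : Kind, text : String

abstract class Node
end

class StringPart < Node
  getter text : String

  def initialize(@text : String)
  end
end

class Ident < Node
  getter name : String

  def initialize(@name : String)
  end
end

class StringInterp < Node
  getter parts : Array(Node)

  def initialize(@parts : Array(Node))
  end
end

class Parser
  def initialize(@tokens : Array(Token))
    @pos = 0
  end

  # Stand-in for parse_op_assign: only identifiers and interpolated strings.
  def parse_expression : Node
    token = current
    if token.kind.interp_string_start?
      parse_string_interp
    elsif token.kind.ident?
      @pos += 1
      Ident.new(token.text)
    else
      raise "unexpected token #{token.kind}"
    end
  end

  private def parse_string_interp : Node
    parts = [] of Node
    parts << StringPart.new(consume(Kind::InterpStringStart).text)
    loop do
      # A nested start token recurses here and consumes its own end token.
      parts << parse_expression
      if current.kind.interp_string_part?
        parts << StringPart.new(consume(Kind::InterpStringPart).text)
      else
        parts << StringPart.new(consume(Kind::InterpStringEnd).text)
        break
      end
    end
    StringInterp.new(parts)
  end

  private def consume(kind : Kind) : Token
    token = current
    raise "expected #{kind}, got #{token.kind}" unless token.kind == kind
    @pos += 1
    token
  end

  private def current : Token
    @tokens[@pos]? || raise "unexpected end of input"
  end
end

# Hand-written tokens for: "a #{"b #{c}"}"
tokens = [
  Token.new(Kind::InterpStringStart, %q("a #{)),
  Token.new(Kind::InterpStringStart, %q("b #{)),
  Token.new(Kind::Ident, "c"),
  Token.new(Kind::InterpStringEnd, %q(}")),
  Token.new(Kind::InterpStringEnd, %q(}")),
]

tree = Parser.new(tokens).parse_expression.as(StringInterp)
puts tree.parts.size
puts tree.parts[1].class
```

Because the inner string's end token is consumed inside the recursive call, the outer loop only ever sees its own part/end tokens, which is why nothing gets swallowed in this sketch.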