AnyhowStep / sql-compiler

An experimental SQL compiler
MIT License

Notes for @@variable and dot-identifier #16

Open AnyhowStep opened 3 years ago

AnyhowStep commented 3 years ago

https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/gen_lex_token.cc#L286

AnyhowStep commented 3 years ago

Seems like m_append_space controls whether whitespace is allowed after the token

AnyhowStep commented 3 years ago

https://github.com/mysql/mysql-server/blob/3e90d07c3578e4da39dc1bce73559bbdf655c28c/sql/gen_lex_token.cc#L120

AnyhowStep commented 3 years ago

https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1606

Given SELECT T . SELECT,

  1. Lexer sees '.', goes to MY_LEX_IDENT_SEP state https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L2052
  2. Lexer does not see an identifier character and goes to MY_LEX_START state https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1612
  3. Lexer skips whitespace https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1400
  4. Lexer goes to MY_LEX_IDENT state https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1411 https://github.com/mysql/mysql-server/blob/3e90d07c3578e4da39dc1bce73559bbdf655c28c/mysys/sql_chars.cc#L88
  5. Lexer attempts to find a keyword by calling find_keyword() with function = false https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1568 https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L862
  6. Keyword symbol SELECT is found https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L866 https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L892
  7. Token ID for SELECT is returned https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1571

No clue where ER_PARSE_ERROR (1064) is thrown, though.

And given SELECT T.SELECT, we must have roughly the same flow as above, but with no errors.

AnyhowStep commented 3 years ago

Okay, problem solved (I think).

  1. Lexer sees '.' and goes to MY_LEX_IDENT_SEP state
  2. Lexer goes to MY_LEX_IDENT_START state (this is different from MY_LEX_IDENT!!!) https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1610
  3. Lexer sees ident_map[lip->yyPeek()] is true for .SELECT, false for . SELECT. https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1611

So, nowhere in the MY_LEX_IDENT_START case do we see a call to find_keyword()! https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1671-L1712

The lexer does, however, call it on the path that begins at MY_LEX_START, once it reaches MY_LEX_IDENT https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L1568
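
To check my understanding, here is a minimal Python sketch of the two paths. It is not the MySQL lexer: the `KEYWORDS` set, the `IDENT_CHARS` set, the token labels, and the `lex` function are all hypothetical names made up for this note. The only thing it tries to reproduce is that an identifier starting immediately after '.' skips the keyword lookup, while one reached after whitespace does not.

```python
import string

# Hypothetical, tiny keyword table; the real one is generated by MySQL.
KEYWORDS = {"SELECT", "FROM"}
# Assumption: ASCII letters, digits, '_' and '$' are identifier characters.
IDENT_CHARS = set(string.ascii_letters + string.digits + "_$")


def lex(sql):
    """Return a list of (kind, text) tokens for a toy subset of SQL."""
    tokens = []
    i = 0
    # True only when the previous character was '.' and an identifier
    # character follows it directly (the MY_LEX_IDENT_START path).
    after_dot = False
    while i < len(sql):
        c = sql[i]
        if c.isspace():
            i += 1
            continue
        if c == ".":
            tokens.append(("DOT", c))
            i += 1
            # Peek at the next character, like ident_map[lip->yyPeek()].
            after_dot = i < len(sql) and sql[i] in IDENT_CHARS
            continue
        if c in IDENT_CHARS:
            start = i
            while i < len(sql) and sql[i] in IDENT_CHARS:
                i += 1
            word = sql[start:i]
            # The keyword lookup is skipped on the after-dot path.
            if not after_dot and word.upper() in KEYWORDS:
                tokens.append(("KEYWORD", word))
            else:
                tokens.append(("IDENT", word))
            after_dot = False
            continue
        raise ValueError(f"unexpected character {c!r}")
    return tokens
```

With this sketch, `lex("SELECT T.SELECT")` ends in an `IDENT` token, while `lex("SELECT T . SELECT")` ends in a `KEYWORD` token, which is the kind of token a parser would reject in that position.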


ident_map is referenced here. It is populated here, https://github.com/mysql/mysql-server/blob/3e90d07c3578e4da39dc1bce73559bbdf655c28c/mysys/sql_chars.cc#L120

ident_map is basically an array of booleans, indexed by character.

The value is true if the character is a valid identifier character and false otherwise. Whitespace is not a valid identifier character, so its value is false.
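
A rough Python sketch of that lookup table and of the decision made after '.'. The exact character set here is an assumption on my part (ASCII letters, digits, '_' and '$'), and `next_state_after_dot` is a hypothetical helper name, not something from sql_lex.cc.

```python
import string

# Assumption: these are the identifier characters; the real table is
# derived from the charset and may include more.
IDENT_CHARS = string.ascii_letters + string.digits + "_$"

# One boolean per byte value, indexed by character code.
ident_map = [False] * 256
for ch in IDENT_CHARS:
    ident_map[ord(ch)] = True


def next_state_after_dot(peek):
    """Pick the state that follows '.', given the next character.

    `peek` is a one-character string, or '' at end of input.
    """
    if peek and ord(peek) < 256 and ident_map[ord(peek)]:
        return "MY_LEX_IDENT_START"
    return "MY_LEX_START"
```

So `next_state_after_dot("S")` picks MY_LEX_IDENT_START (the `.SELECT` case), and `next_state_after_dot(" ")` picks MY_LEX_START (the `. SELECT` case).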

AnyhowStep commented 3 years ago

In the case of @@variable,

  1. We start at MY_LEX_START
  2. We set state = state_map[c], which is MY_LEX_USER_END for '@' https://github.com/mysql/mysql-server/blob/3e90d07c3578e4da39dc1bce73559bbdf655c28c/mysys/sql_chars.cc#L112
  3. We check state_map[lip->yyPeek()] and see it is MY_LEX_USER_END again. We set state to MY_LEX_SYSTEM_VAR https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L2068
  4. If we don't see a backtick, we go to MY_LEX_IDENT_OR_KEYWORD https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L2090
  5. We attempt to read an identifier https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L2099
  6. The length is zero if we do not see an identifier https://github.com/mysql/mysql-server/blob/5c8c085ba96d30d697d0baa54d67b102c232116b/sql/sql_lex.cc#L2106
  7. This causes us to return ABORT_SYM

This is why @@variable is not allowed to have any whitespace between any of the three tokens.
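
The steps above can be sketched in Python like this. It only follows the seven steps listed and nothing else: `lex_system_var`, the `IDENT(...)` label, and the `NOT_SYSTEM_VAR` and `QUOTED_IDENT_NOT_MODELLED` labels are hypothetical names for this note, the identifier character set is an assumption, and whatever MySQL really does when the second character is not '@' is not modelled.

```python
import string

# Assumption: ASCII letters, digits, '_' and '$' are identifier characters.
IDENT_CHARS = set(string.ascii_letters + string.digits + "_$")


def lex_system_var(sql):
    """Lex a leading '@@name' and return a list of token labels."""
    if not sql.startswith("@"):
        raise ValueError("input must start with '@'")
    tokens = ["@"]  # first '@': state_map[c] is MY_LEX_USER_END

    # Step 3: the peeked character must be '@' again to reach
    # MY_LEX_SYSTEM_VAR. Anything else is outside this sketch.
    if sql[1:2] != "@":
        tokens.append("NOT_SYSTEM_VAR")  # hypothetical label
        return tokens
    tokens.append("@")

    rest = sql[2:]
    # Step 4: a backtick takes a different path, not modelled here.
    if rest.startswith("`"):
        tokens.append("QUOTED_IDENT_NOT_MODELLED")  # hypothetical label
        return tokens

    # Steps 5 and 6: read identifier characters and measure the length.
    length = 0
    while length < len(rest) and rest[length] in IDENT_CHARS:
        length += 1

    # Step 7: a zero-length identifier aborts.
    if length == 0:
        tokens.append("ABORT_SYM")
    else:
        tokens.append("IDENT(" + rest[:length] + ")")
    return tokens
```

In this sketch, `lex_system_var("@@version")` yields three tokens ending in `IDENT(version)`, while `lex_system_var("@@ version")` ends in `ABORT_SYM` because the identifier read stops at the space with a length of zero.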