neverchanje / chgcap-rs

A CDC library in Rust.
Apache License 2.0

feat: support handling ddl events (schema changes) #4

Open neverchanje opened 1 year ago

neverchanje commented 1 year ago

MySQL sends a QueryEvent through the binlog for each query, including CREATE TABLE and ALTER TABLE. The QueryEvent contains only the unparsed query text, so we have to parse it ourselves to get the details.

We'll only handle CREATE TABLE and ALTER TABLE at the beginning. Debezium handles more types, including DROP_TABLE, TRUNCATE_TABLE, CREATE_INDEX, DROP_INDEX, CREATE_DATABASE, ALTER_DATABASE, DROP_DATABASE, USE_DATABASE, and SET_VARIABLE (see io.debezium.relational.ddl.DdlChanges).
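As a rough illustration of that first step, the query text of a QueryEvent could be classified by its leading keywords before any real parsing happens. This is only a sketch: `DdlKind` and `classify_query` are hypothetical names, not part of chgcap-rs, and the helper assumes the statement starts directly with whitespace-separated keywords (no leading comments, no `CREATE TEMPORARY TABLE`).

```rust
/// Statement kinds we care about at first. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DdlKind {
    CreateTable,
    AlterTable,
    /// Anything else (DML, other DDL, transaction control, ...).
    Other,
}

/// Classifies raw query text by its first two whitespace-separated words.
/// Assumption: the statement does not start with a comment or extra modifiers.
pub fn classify_query(sql: &str) -> DdlKind {
    let mut words = sql.split_whitespace().map(|w| w.to_ascii_uppercase());
    let first = words.next();
    let second = words.next();
    match (first.as_deref(), second.as_deref()) {
        (Some("CREATE"), Some("TABLE")) => DdlKind::CreateTable,
        (Some("ALTER"), Some("TABLE")) => DdlKind::AlterTable,
        _ => DdlKind::Other,
    }
}
```

With this, `classify_query("alter table t add column c int")` yields `DdlKind::AlterTable`, while a statement such as `BEGIN` falls into `DdlKind::Other` and can be skipped.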

TableMapEvent is another event that contains schema data. It precedes every DML statement, but it is emitted only in row-based mode. To use this event, we would have to require users to enable row-based replication before CDC. Since it doesn't directly represent DDL, we won't use it to monitor schema changes.

neverchanje commented 9 months ago

The DDL parser should parse only the information that we are concerned with, including:

For key/table constraints, we will only extract primary keys for now, but we'll reconsider other types if a requirement comes up in the future.

The parser is more like an extractor than a general parser. It extracts the table schema and skips irrelevant parts. Therefore, its interface will be as follows:

impl Parser {
  /// Extracts the schema from `sql`, or returns `None` if the statement can be skipped.
  fn parse_statement(&mut self, sql: &str) -> Option<Statement> {
    todo!()
  }
}

It returns an Option rather than a Result because the upstream database must have validated the query already, so the query is known to be correct and a parse error is not an expected outcome. It returns None only if the query is skippable. On unexpected tokens, the parser should unwrap (panic) instead of returning an error.
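To make the contract concrete, here is a minimal sketch of an extractor with that signature. It is not the planned implementation: the `Statement` shape, the helpers `clean` and `split_top_level`, and the string-based scanning are all assumptions for illustration. It only covers the `CREATE TABLE name (definitions) options` form, ignores parentheses inside string literals or comments, and does not handle forms such as `CREATE TABLE ... LIKE`.

```rust
/// Hypothetical, simplified output type: only the table name and its primary key.
#[derive(Debug, PartialEq)]
pub enum Statement {
    CreateTable { name: String, primary_key: Vec<String> },
}

pub struct Parser;

/// Strips surrounding whitespace and MySQL backtick quoting from an identifier.
fn clean(ident: &str) -> String {
    ident.trim().replace('`', "")
}

/// Splits a definition list on commas that are not nested inside parentheses.
fn split_top_level(body: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let (mut depth, mut start) = (0usize, 0usize);
    for (i, c) in body.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth = depth.checked_sub(1).expect("unbalanced parentheses"),
            ',' if depth == 0 => {
                parts.push(body[start..i].trim());
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(body[start..].trim());
    parts
}

impl Parser {
    /// Returns `None` for skippable statements. Panics on a malformed
    /// CREATE TABLE, since the upstream database already validated the query.
    pub fn parse_statement(&mut self, sql: &str) -> Option<Statement> {
        let sql = sql.trim().trim_end_matches(';');
        let mut words = sql.split_whitespace();
        let is_create_table = matches!(
            (words.next(), words.next()),
            (Some(a), Some(b))
                if a.eq_ignore_ascii_case("CREATE") && b.eq_ignore_ascii_case("TABLE")
        );
        if !is_create_table {
            return None; // not a statement we extract from: skip it
        }

        // The table name is the last word before the definition list.
        let open = sql.find('(').expect("CREATE TABLE without a definition list");
        let name = clean(sql[..open].split_whitespace().last().expect("missing table name"));

        // Find the parenthesis that closes the definition list.
        let mut depth = 0usize;
        let close = sql[open..]
            .char_indices()
            .find(|&(_, c)| {
                match c {
                    '(' => depth += 1,
                    ')' => depth -= 1,
                    _ => {}
                }
                depth == 0
            })
            .map(|(i, _)| open + i)
            .expect("unterminated definition list");

        // Keep only primary-key information; every other definition is skipped.
        let mut primary_key = Vec::new();
        for item in split_top_level(&sql[open + 1..close]) {
            let upper = item.to_ascii_uppercase();
            if let Some(pos) = upper.find("PRIMARY KEY") {
                primary_key = match item[pos..].find('(') {
                    // Table-level constraint: PRIMARY KEY (a, b)
                    Some(rel) => {
                        let end = item.rfind(')').expect("unterminated PRIMARY KEY list");
                        item[pos + rel + 1..end].split(',').map(clean).collect()
                    }
                    // Column-level constraint: `id` INT PRIMARY KEY
                    None => vec![clean(
                        item.split_whitespace().next().expect("empty column definition"),
                    )],
                };
            }
        }
        Some(Statement::CreateTable { name, primary_key })
    }
}
```

Calling `Parser.parse_statement("INSERT INTO t VALUES (1)")` returns `None`, while a `CREATE TABLE` statement yields `Some(Statement::CreateTable { .. })` with the table name and primary key columns. The `expect` calls are where this sketch panics on unexpected input instead of returning an error.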