zendesk / maxwell

Maxwell's daemon, a mysql-to-json kafka producer
https://maxwells-daemon.io/

TRUNCATE DDL is Ignored #1731

Open t3link opened 3 years ago

t3link commented 3 years ago

As we can see, TRUNCATE statements are deliberately ignored, whereas DROP TABLE xx is passed through and stored. What is the reason for this implementation? https://github.com/zendesk/maxwell/blob/222712d58989b64c0cc8c9c9ba99cedb7a0a2236/src/main/java/com/zendesk/maxwell/schema/ddl/SchemaChange.java#L43

osheroff commented 3 years ago

There's nothing logical to output from maxwell when a table is truncated; we don't have a historical record of all the rows.

In theory I suppose we could output a "TRUNCATE" meta-event that signals stream consumers to reset their data. Does your application need something like that?
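To make the "TRUNCATE" meta-event idea concrete, here is a minimal sketch of how a stream consumer might react to one. It is an illustration only: the "truncate" event type does not exist in maxwell's output as described in this thread, and the `apply_event` helper, the in-memory `replica` dict, and the assumption that every row has an `id` primary-key column are all hypothetical.

```python
import json


def apply_event(replica, raw_event):
    """Apply one JSON event to an in-memory replica.

    `replica` maps (database, table) -> {primary key -> row}.
    """
    event = json.loads(raw_event)
    rows = replica.setdefault((event["database"], event["table"]), {})
    kind = event["type"]

    if kind in ("insert", "update"):
        # Assumption: every row carries an "id" column that is the primary key.
        rows[event["data"]["id"]] = event["data"]
    elif kind == "delete":
        rows.pop(event["data"]["id"], None)
    elif kind == "truncate":
        # Hypothetical meta-event: it carries no row data and only signals
        # that the consumer should reset its copy of the table.
        rows.clear()
    # Any other event type (for example DDL) is ignored in this sketch.
    return replica


events = [
    '{"database": "shop", "table": "orders", "type": "insert", "data": {"id": 1, "total": 10}}',
    '{"database": "shop", "table": "orders", "type": "insert", "data": {"id": 2, "total": 25}}',
    '{"database": "shop", "table": "orders", "type": "truncate"}',
    '{"database": "shop", "table": "orders", "type": "insert", "data": {"id": 3, "total": 7}}',
]

replica = {}
for raw in events:
    apply_event(replica, raw)

print(replica[("shop", "orders")])
```

After the four events, only the row inserted after the truncate remains in the replica. Without such an event the consumer would keep rows 1 and 2 indefinitely, because no per-row deletes are ever emitted for a truncated table.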

daledude commented 3 years ago

@osheroff I would find this useful. Actually, I would like all statements matching the SQL_BLACKLIST patterns to be output, so that diagnosis, auditing, and blaming can happen in a single tool. Is there a technical reason for SQL_BLACKLIST, or were there just no use cases for those statements? Is it even feasible to output everything SQL_BLACKLIST matches?

osheroff commented 3 years ago

The blacklist is there to simplify the life of the SQL parser; the intention is that it covers statements that maxwell can’t/doesn’t care much about.

I can definitely see the value of supporting:

1) A custom (and maybe optional?) “truncate” event that could be output along with the DML stream. There’s an odd mismatch here with partitioning schemes: it’s unclear what to do with a TRUNCATE event on a stream not partitioned by database or table, but that could probably be worked around.

2) DDL events output for unknown/unparsed SQL statements.
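The partitioning mismatch in point 1 can be illustrated with a small sketch. When a stream is partitioned by database or table, a truncate event has an obvious home: the same partition as that table's rows. When it is partitioned by row key, no single partition applies, and one conceivable workaround is to broadcast the event to every partition. This is only an assumption about how it might be worked around, not a design from this thread; the `target_partitions` function, the `partition_by` values, the `id` key column, and the use of CRC32 as the hash are all stand-ins chosen for illustration.

```python
import zlib


def target_partitions(event, num_partitions, partition_by):
    """Return the list of partitions an event should be written to."""
    if partition_by == "database":
        key = event["database"]
    elif partition_by == "table":
        key = event["database"] + "." + event["table"]
    elif partition_by == "primary_key":
        if event["type"] == "truncate":
            # Hypothetical workaround: a truncate has no row key, so send a
            # copy to every partition that may hold rows of this table.
            return list(range(num_partitions))
        # Assumption: the "id" column is the primary key of every table.
        key = event["database"] + "." + event["table"] + ":" + str(event["data"]["id"])
    else:
        raise ValueError("unsupported partition_by: " + partition_by)

    # CRC32 is a stand-in for whatever hash the producer really uses.
    return [zlib.crc32(key.encode("utf-8")) % num_partitions]


row = {"database": "shop", "table": "orders", "type": "insert", "data": {"id": 1}}
truncate = {"database": "shop", "table": "orders", "type": "truncate"}

# Partitioned by table: the truncate lands next to the table's rows.
print(target_partitions(truncate, 4, "table") == target_partitions(row, 4, "table"))

# Partitioned by row key: the truncate is broadcast to all partitions.
print(target_partitions(truncate, 4, "primary_key"))
```

Broadcasting keeps per-partition ordering intact, at the cost of each consumer seeing the truncate once per partition it reads, so the reset on the consumer side would need to be idempotent.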
