Altinity / clickhouse-sink-connector

Replicate data from MySQL, Postgres and MongoDB to ClickHouse
https://www.altinity.com
Apache License 2.0

All records received during initial snapshot must have zero Uint64(0) _version #800

Open R-omk opened 1 week ago

R-omk commented 1 week ago

Right now the snapshot _version is a millisecond timestamp plus a serial number, but this value may be greater than the value calculated during CDC event processing, especially when snowflake.id is disabled.

[screenshot not preserved]

In the example in the screenshot, the version is calculated from the GTID position. Clearly, the ReplacingMergeTree engine cannot work properly in this case, since the snapshot rows carry a larger _version than the CDC rows that arrive later.


context: version: 2.3.0 snowflake.id: "false"
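
To illustrate the problem described above, here is a minimal Python sketch of how a ReplacingMergeTree-style merge picks a winner: for each key it keeps the row with the highest version, and on a tie the last inserted row. The function `replacing_merge`, the row layout, and the version numbers are all hypothetical and only mimic the situation in this issue (a timestamp-like snapshot version versus a much smaller GTID-position-like CDC version); this is not the connector's code.

```python
from typing import Dict, Iterable, Tuple

# (primary_key, payload, _version); a hypothetical row layout for illustration
Row = Tuple[int, str, int]


def replacing_merge(rows: Iterable[Row]) -> Dict[int, Row]:
    """Keep, per key, the row with the highest _version (on a tie, the last inserted)."""
    merged: Dict[int, Row] = {}
    for row in rows:
        key, _, version = row
        if key not in merged or version >= merged[key][2]:
            merged[key] = row
    return merged


# Assumed values: a millisecond-timestamp-like snapshot version and a
# much smaller version derived from a GTID position.
snapshot_version = 1_700_000_000_000
cdc_version = 4_242

# Snapshot row arrives first, the CDC update for the same key arrives later.
broken = replacing_merge([(1, "snapshot", snapshot_version), (1, "cdc-update", cdc_version)])
fixed = replacing_merge([(1, "snapshot", 0), (1, "cdc-update", cdc_version)])

print(broken[1][1])  # the stale snapshot row survives the merge
print(fixed[1][1])   # with _version = 0 for the snapshot, the CDC update survives
```

With a zero snapshot version, any positive CDC version wins, regardless of which scheme produced it.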

R-omk commented 1 week ago

I would also like to note that snowflake ID values are only comparable with other snowflake IDs of the same type; they cannot be compared with an arbitrary timestamp.

... especially considering that a custom time offset is used for the snowflake ID generator.

https://github.com/Altinity/clickhouse-sink-connector/blob/1027f19af3e94dd1638d7dc986870b3b6842987a/sink-connector/src/main/java/com/altinity/clickhouse/sink/connector/common/SnowFlakeId.java#L16
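
A small Python sketch of why such IDs cannot be mixed with raw timestamps. It assumes a generic snowflake-style layout (timestamp minus a custom epoch, shifted left by 22 bits, with a 12-bit sequence in the low bits); `snowflake_like_id`, `TIMESTAMP_SHIFT`, and `CUSTOM_EPOCH_MS` are hypothetical names and values, and the connector's actual constants are in the file linked above.

```python
# Assumed layout: 22 low bits reserved below the timestamp part.
TIMESTAMP_SHIFT = 22
# Hypothetical custom epoch offset in milliseconds (illustrative value only).
CUSTOM_EPOCH_MS = 1_288_834_974_657


def snowflake_like_id(unix_ms: int, sequence: int = 0) -> int:
    """Build a snowflake-style ID from a Unix millisecond timestamp."""
    return ((unix_ms - CUSTOM_EPOCH_MS) << TIMESTAMP_SHIFT) | (sequence & 0xFFF)


earlier_ms = 1_700_000_000_000   # some event time
later_ms = earlier_ms + 60_000   # an event one minute later

id_earlier = snowflake_like_id(earlier_ms)
id_later = snowflake_like_id(later_ms)

# Within the same scheme the ordering follows event time.
print(id_earlier < id_later)

# Mixed schemes: the ID of the earlier event is larger than the raw
# timestamp of the later event, so the comparison orders them wrongly.
print(id_earlier > later_ms)
```

Both comparisons evaluate to true under these assumptions, which is the mismatch described in the comment: ordering is only meaningful when both versions come from the same generator.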

aadant commented 1 week ago

Yes, I agree it should be zero. You get that if you use the Python utilities to dump the data. It is not an issue as long as the _version is always increasing, which should always be the case.