Altinity / clickhouse-sink-connector

Replicate data from MySQL, Postgres and MongoDB to ClickHouse
https://www.altinity.com
Apache License 2.0
215 stars 48 forks source link

connector loses genarated columns #743

Open tellienne opened 1 month ago

tellienne commented 1 month ago

Hi! we use a connector to transfer data from postgres to clickhouse. version: altinityinfra/clickhouse-sink-connector: 716-f3f63a01328216e766609fa8280adbe93b7ea320-lt

noticed that after some time the connector loses some columns in the tables. logs entry:

2024-08-10 05:20:25.496 
2024-08-09 22:20:25.496 ERROR - ********** ERROR: Database(test), Table(supplier_product_cards), Setting column business_id to NULL might fail for non-nullable columns ************
2024-08-10 05:20:25.496 
2024-08-09 22:20:25.496 ERROR - ********** ERROR: Database(test), Table(supplier_product_cards), ClickHouse column business_id not present in source ************

looks like the column was removed from the table in postgres, however the table schema was not changed, the column exists. the column cannot be empty, but the connector thinks it doesn't exist and tries to insert null. Because of this, data transfer stops.

the columt business_id has ddl like

business_id varchar(255) NOT NULL GENERATED ALWAYS AS (((12::character varying(255)::text || lpad(number::character varying(255)::text, 6, 0::character varying(255)::text)))) STORED

the problem has been reproduced several times for tables where there is such a generated column like business_id

tellienne commented 1 month ago

UPD: we found this issues https://www.postgresql.org/message-id/283bd642-91f7-4b70-a331-7095e0dae13d%40app.fastmail.com

and email about patch https://www.postgresql.org/message-id/CAHv8RjKEBtFfUbmdHPYV1K76bb8PEm6R%3DiLZko4Dxy2Oh-_uUw%40mail.gmail.com

tellienne commented 1 month ago

we tried to exclude the column from replication and make it materialized on the clickhouse side:

column.exclude.list: "public.test.business_id"

and after that:

ALTER TABLE test.supplier_product_cards ON CLUSTER '{cluster}' modify COLUMN business_id String MATERIALIZED (((12::character varying(255)::text || lpad(number::character varying(255)::text, 6, 0::character varying(255)::text))))

but the connector is still trying to insert data into this column. seems to be related to this https://github.com/Altinity/clickhouse-sink-connector/issues/357