tabular-io / iceberg-kafka-connect

Apache License 2.0

Expected behaviour when dropping columns? #252

Closed aurany closed 1 month ago

aurany commented 1 month ago

Adding columns seems to work fine with iceberg.tables.evolve-schema-enabled set to true, but dropping columns has no effect on the resulting Iceberg table. Is this expected?

I'm using Cassandra and Debezium with CDC enabled. This is my sink connector configuration:

        "tasks.max": "1",
        "topics": "debezium.the_shop.customers,debezium.the_shop.products,debezium.the_shop.orders",
        "connector.class": "io.tabular.iceberg.connect.IcebergSinkConnector",
        "iceberg.catalog.s3.endpoint": "http://minio:9000",
        "iceberg.catalog.s3.secret-access-key": "password",
        "iceberg.catalog.s3.access-key-id": "admin",
        "iceberg.catalog.s3.path-style-access": "true",
        "iceberg.catalog.uri": "http://rest:8181",
        "iceberg.catalog.warehouse": "s3://warehouse/",
        "iceberg.catalog.client.region": "us-east-1",
        "iceberg.catalog.type": "rest",
        "iceberg.control.commit.interval-ms": "1000",
        "iceberg.tables.auto-create-enabled": "true",
        "iceberg.tables.evolve-schema-enabled": "true",
        "iceberg.tables.route-field": "__table",
        "iceberg.tables.cdc-field": "__op",
        "iceberg.tables.upsert-mode-enabled": "true",
        "iceberg.tables": "the_shop.customers,the_shop.products,the_shop.orders",
        "iceberg.table.the_shop.customers.route-regex": "customers",
        "iceberg.table.the_shop.products.route-regex": "products",
        "iceberg.table.the_shop.orders.route-regex": "orders",
        "iceberg.table.the_shop.customers.id-columns": "customer_id",
        "iceberg.table.the_shop.products.id-columns": "product_id",
        "iceberg.table.the_shop.orders.id-columns": "order_id",
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "key.converter.schemas.enable": "true",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter.schemas.enable": "true",
        "transforms": "unwrap",
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
        "transforms.unwrap.delete.tombstone.handling.mode": "rewrite",
        "transforms.unwrap.add.fields": "op,table,source.ts_ms"

tabmatfournier commented 1 month ago

The connector won't delete columns. The most it will do is make a column optional.

tabmatfournier commented 1 month ago

I was incorrect. There is a bug: the connector can throw when columns are deleted. We are in the process of fixing this and will link the PR when it is available.
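
For readers following along, here is a toy Python model of the behaviour described in the first reply: new record fields are added to the table, fields missing from the record are relaxed to optional, and nothing is ever dropped. It is only a sketch of that description, not the connector's code. The Column class and evolve function are hypothetical names, and per the follow-up above the real connector can currently throw in the dropped-column case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    # Hypothetical, simplified stand-in for a table column.
    name: str
    type: str
    required: bool = False


def evolve(table: list[Column], record_fields: dict[str, str]) -> list[Column]:
    """Toy model: add unseen columns, relax missing ones to optional, never drop."""
    evolved = []
    for col in table:
        # A column absent from the incoming record is kept, but made optional.
        if col.name not in record_fields and col.required:
            col = Column(col.name, col.type, required=False)
        evolved.append(col)
    known = {c.name for c in table}
    for name, typ in record_fields.items():
        # A field the table has not seen yet is appended as an optional column.
        if name not in known:
            evolved.append(Column(name, typ, required=False))
    return evolved


# Made-up example: the source dropped "email" and added "phone".
table = [Column("customer_id", "long", True), Column("email", "string", True)]
result = evolve(table, {"customer_id": "long", "phone": "string"})
for col in result:
    print(col)
```

In this model the email column survives the drop as an optional column, which matches the observation in the original question that dropping a column upstream leaves the Iceberg table unchanged.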