risingwavelabs / risingwave

SQL stream processing, analytics, and management. We decouple storage and compute to offer efficient joins, instant failover, dynamic scaling, speedy bootstrapping, and concurrent query serving.
https://www.risingwave.com/slack
Apache License 2.0

iceberg sink: struct type with metadata doesn't work #16545

Open xxchan opened 3 months ago

xxchan commented 3 months ago

Hi, I've recently been thinking about support for the Struct type in the Iceberg sink, since I'm testing whether I can utilise RisingWave at work and this functionality is a necessity. As of now, when someone tries to sink struct data to an Iceberg catalog, they receive this error:

Field response's type not compatible, risingwave converted data type Struct([Field { name: "responseStatus", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "statusCode", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), iceberg's data type: Struct([Field { name: "responseStatus", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {"PARQUET:field_id": "17", "column_id": "17"} }, Field { name: "statusCode", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {"PARQUET:field_id": "18", "column_id": "18"} }])

Looking at this, it seems to be only a matter of a mismatch in the metadata field of each Field: the RisingWave side has empty metadata, while the Iceberg side carries "PARQUET:field_id" and "column_id" entries. The code just does a left == right comparison: https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/sink/iceberg/mod.rs#L1046

Is Struct support in the Iceberg sink just a matter of a missing metadata-aware comparison, or is there more context to that?
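To make the mismatch concrete, here is a minimal sketch of a comparison that ignores per-field metadata on nested struct fields. It is not the RisingWave or arrow-rs code: `DataType`, `Field`, `types_compatible`, `risingwave_side`, and `iceberg_side` are hypothetical, simplified stand-ins that model only the parts visible in the error message (name, data type, nullability, metadata).

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the Arrow types in the error message.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int32,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: HashMap<String, String>,
}

impl Field {
    fn new(name: &str, data_type: DataType) -> Self {
        Field {
            name: name.to_string(),
            data_type,
            nullable: true,
            metadata: HashMap::new(),
        }
    }

    fn with_metadata(mut self, key: &str, value: &str) -> Self {
        self.metadata.insert(key.to_string(), value.to_string());
        self
    }
}

/// Structural comparison that deliberately skips `metadata` on nested fields.
/// Name, nullability, and (recursively) data type must still match.
fn types_compatible(left: &DataType, right: &DataType) -> bool {
    match (left, right) {
        (DataType::Struct(l), DataType::Struct(r)) => {
            l.len() == r.len()
                && l.iter().zip(r.iter()).all(|(a, b)| {
                    a.name == b.name
                        && a.nullable == b.nullable
                        && types_compatible(&a.data_type, &b.data_type)
                })
        }
        // Non-struct types (or a struct vs. a non-struct) fall back to plain equality.
        _ => left == right,
    }
}

/// The struct type as converted by RisingWave in the report: no field metadata.
fn risingwave_side() -> DataType {
    DataType::Struct(vec![
        Field::new("responseStatus", DataType::Utf8),
        Field::new("statusCode", DataType::Int32),
    ])
}

/// The struct type as loaded from Iceberg in the report: field IDs in metadata.
fn iceberg_side() -> DataType {
    DataType::Struct(vec![
        Field::new("responseStatus", DataType::Utf8)
            .with_metadata("PARQUET:field_id", "17")
            .with_metadata("column_id", "17"),
        Field::new("statusCode", DataType::Int32)
            .with_metadata("PARQUET:field_id", "18")
            .with_metadata("column_id", "18"),
    ])
}
```

With these definitions, `risingwave_side() == iceberg_side()` is false because the derived equality includes metadata, while `types_compatible(&risingwave_side(), &iceberg_side())` is true. Whether skipping metadata is sufficient for the real sink (for example, whether the writer also needs the field IDs to be propagated) is exactly the open question in this issue.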


fuyufjh commented 2 months ago

cc. @chenzl25

chenzl25 commented 2 months ago

@ZENOTME Could you please check whether we support the struct type in the iceberg sink? IIUC, after PR #16567, we could support it directly.

ZENOTME commented 2 months ago

> @ZENOTME Could you please check whether we support struct type in iceberg sink? IIUC, after this PR #16567 , we could support it directly.

Sure, I'll test it later. BTW, there is also no test for the struct type in icelake, so I am not sure whether it's supported.
