Hi, I'm working on a PoC that sinks logs from Kafka into Iceberg format. I want to partition the logs under year=YYYY/month=MM/day=DD, but I only have a timestamp inside each log record.
I couldn't find any configuration for partitioning logs by timestamp, so I'm wondering whether any workarounds already exist.
I think one workaround is to use an SMT to copy ts_ms into year, month, and day, extract the data into 3 separate fields, and set those fields in iceberg.tables.default-partition-by. However, that clutters the connector config and also requires coding a custom SMT function...
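To make the idea concrete, here is a minimal Python sketch of the derivation such an SMT would have to perform. It is not a Kafka Connect SMT (a real one would be written in Java against the Connect API). The function names add_partition_fields and partition_path are hypothetical, and I'm assuming ts_ms holds epoch milliseconds and that partitions should follow UTC dates.

```python
from datetime import datetime, timezone


def add_partition_fields(record: dict, ts_field: str = "ts_ms") -> dict:
    """Return a copy of `record` with year, month and day added.

    Assumes record[ts_field] is an integer count of milliseconds since the
    Unix epoch, and that partition boundaries follow UTC.
    """
    # Integer division drops the millisecond part, which the date does not need.
    ts = datetime.fromtimestamp(record[ts_field] // 1000, tz=timezone.utc)
    enriched = dict(record)  # leave the caller's record untouched
    enriched["year"] = ts.year
    enriched["month"] = ts.month
    enriched["day"] = ts.day
    return enriched


def partition_path(record: dict) -> str:
    """Format the derived fields as year=YYYY/month=MM/day=DD."""
    return f"year={record['year']:04d}/month={record['month']:02d}/day={record['day']:02d}"


if __name__ == "__main__":
    log = {"ts_ms": 1700000000000, "message": "example"}  # made-up record
    print(partition_path(add_partition_fields(log)))
```

The three derived fields are what I would then list in iceberg.tables.default-partition-by, which is exactly the part that makes the config feel cluttered.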
For a detailed example, my log format is like
Thanks.