Closed fqtab closed 7 months ago
We are already memory hungry. This would be very memory hungry in the cases where someone is fanning out to multiple tables at once and have a lot of partitions. Wouldn't be surprised if you 2x-5xd the memory usage in some cases.
We are already memory hungry. This would be very memory hungry in the cases where someone is fanning out to multiple tables at once and have a lot of partitions. Wouldn't be surprised if you 2x-5xd the memory usage in some cases.
+1, more concerning to me is moving to a model were we must create a file per partition per commit, which will result in many more files created than needed in many cases. This can have a negative impact on query planning performance, scan performance, storage size, and so on. Aggressive compaction can help mitigate that to some extent but that can drive up compute and storage costs also.
At my current company we will often partition a topic for future scalability using a highly composite number of partitions, often several thousand. This allows us to scale the number of processing tasks without introducing skew. I feel it is too limiting not being able to coalesce partitions when writing.
Cool, I did call this out explicitly as a potential issue in the PR description but admittedly I did not fully consider the impact of this. Appreciate the clear feedback and agree it doesn't make sense to move forward with the topic-partition-level processing model. Taken the rest of the useful changes from this PR and created a new PR.
This DRAFT is just to give interested stakeholders an idea of where/how I'm planning to refactor things to enable:
kafka-connect/src/main/java/io/tabular/iceberg/connect/api/Committer.java
kafka-connect/src/main/java/io/tabular/iceberg/connect/committer/v1/CommitterImpl.java
kafka-connect/src/main/java/io/tabular/iceberg/connect/committer/v2/CommitterImpl.java
kafka-connect-runtime/src/test/java/io/tabular/iceberg/connect/IntegrationControlClusterTest.java
The plan is to break this up into smaller PRs with wayyyy better code (not to mention better test coverage) Also, there are breaking changes as well as unnecessary changes in this PR (because this is just a POC).
The main downside of this refactor is we create a TaskWriter per topic partition per table per in a given Task now. Previously, we only create a TaskWriter per table in given Task. This is necessary to enable producer generation based fencing. However this likely means higher memory usage (think particularly about the table fanout usecase).
PRs planned: