Closed ivanyu closed 1 year ago
I assume this occurs for 3.3?
Reproduces on 3.3-2022-10-06-tiered-storage
, but works as expected on 3.0-li
.
Providing an update on whats going on here. The reason behind this change doesn't appear to be a regression that was introduced when cherry picking the changes from 3.0-li
but rather changes that came from upstream. This can be verified by the fact that there hasn't actually been any changes to LogCleaner.scala
in any of the cherry picked commits.
Rather this change https://github.com/aiven/kafka/commit/6c806430093c152b916d8e8f1fa542980c03a181 in upstream has effected how log cleanup in a pretty fundamental way and due to this I believe it is the root cause (or more accurately, additional changes are necessary in order to get this upstream change working with TS). I do have changes done locally to "fix" the problem but I am being cautious because of how large that change is.
Due to this the better course of action would be to make a change to kafka upstream and once its merged then backport it into kafka 3.3 -> 3.5.
@ivanyu @jeqo I have pushed a fix at https://github.com/aiven/kafka/commit/0adaef725acde77c5f7676dd3d4fb91b66dbba0e . I have tested it within Kafka as well as using https://github.com/ivanyu/kafka-tiered-storage-demo and it appears to work (the Kafka version with the fix is cleaning up much less records than the version without the fix).
I do have a nagging feeling that this "fix" is in fact just a workaround, at least from reading the code it seems that LogCleaner
needs to properly be adapted in order to handle TS rather than just forcing the old legacy behaviour which is what the fix does.
@ivanyu Can you confirm that this works on your end? If so I can close this ticket and make a new one to investigate a more principled solution.
Tried a couple of times, worked. Considering it reproduced rather consistently, let's declare it fixed (or "fixed enough" if it's a workaround :)) @mdedetrich.
Creating a topic with the config
and filling it with
results in local segments being deleted, but not uploaded to the remote tier. Probably, the log cleaner doesn't have a condition that prevents it from deleting a segment that is eligible for the remote tier, but hasn't been uploaded.