romanlv opened this issue 4 years ago
Looks like setting `log.retention` should help; will try:

```yaml
cp-kafka:
  configurationOverrides:
    "log.retention.hours": 24
```
The question is: why is it filling up space? This is happening with a default install, and nothing is actually using the cluster aside from itself.
Also experiencing this issue; the `log.retention.hours` override above doesn't seem to have worked.
Yep. I even tried changing all the topics to a 1GB retention as well, and it still fills up after a couple days.
I just deployed a cluster via the operator, which I guess is the same thing as deploying it via the charts. It ran out of disk immediately. I re-ran the deployment for Kafka only:

```yaml
# confluent-kafka-only.yml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
[...]
configOverrides:
  server:
```

to change the retention to something smaller; however, this won't clean up the existing storage, which is exhausted anyway.
Can I get any help, like some guidance on how to clean up the filled-up log space?
I may end up redeploying the whole thing with the overrides above, but it seems like others have the same problem even with this flag enabled.
Any help much appreciated.
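One way to reclaim space without redeploying is to tighten retention on the offending topics and let the broker's log cleaner delete old segments. A sketch using the standard `kafka-configs` CLI, run from inside a broker pod; the topic name `my-topic` and the bootstrap address are placeholders, not from this thread:

```shell
# Hypothetical topic/bootstrap values -- substitute your own.
# Temporarily shrink retention so old segments become eligible for deletion.
kafka-configs --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --add-config retention.ms=60000

# After the cleaner has run (it checks every log.retention.check.interval.ms,
# 5 minutes by default), drop the override to fall back to broker defaults.
kafka-configs --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --delete-config retention.ms
```

Note this only frees space on topics whose cleanup policy actually deletes segments, which is what the rest of the thread gets into.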
At pod boot time I get:

```
[ERROR] 2021-08-26 15:45:00,617 [pool-7-thread-1] kafka.server.LogDirFailureChannel error - Error while writing to checkpoint file /mnt/data/data0/logs/_confluent_balancer_broker_samples-13/leader-epoch-checkpoint
java.io.IOException: No space left on device
```
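To see which topic-partition directories are actually eating the disk, a quick check from inside the broker pod (the default path below is the log dir from the error message above; override `LOG_DIR` if yours differs):

```shell
# Rank topic-partition directories under the Kafka log dir by size.
LOG_DIR=${LOG_DIR:-/mnt/data/data0/logs}
du -sh "$LOG_DIR"/* 2>/dev/null | sort -rh | head -20
```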
We are facing the same issue using chart version: 0.6.1. Log retention is not addressing this issue. Increasing PVC size just delays the no space left on device.
I'm still seeing this issue. When I exec into the Kafka broker pod, the file in question (opt/kafka/data-0...) does not even exist. Why does it say out of space when the file in question is not even there? BTW, I have all the log retention settings correct, and they show up in Confluent Control Center as expected (i.e., 1-hour retention, 1 M size limit, etc.). It's like the Kafka log retention code is not working at all.
Seeing this same issue, has anyone made progress on this? We've tried overriding the log retention using both time and bytes size with no luck.
What you want to do is change the log cleanup policy to `delete`. That fixes the issue. I can drop my config file here if needed.
@BenM-Mycelium thanks for the response! Do you mean setting something like `"log.cleanup.policy": "delete"`?
Yes correct
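For anyone landing here, a sketch of what that looks like in the chart values, using the same `configurationOverrides` block as earlier in the thread (the retention values are illustrative, not from this thread):

```yaml
cp-kafka:
  configurationOverrides:
    "log.cleanup.policy": "delete"        # delete old segments rather than compacting
    "log.retention.hours": "1"
    "log.retention.bytes": "1073741824"   # 1 GiB -- enforced per partition
```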
Testing this out now, thanks again for the help! For anyone else peeking in, an additional setting we're using that I didn't fully understand is `log.retention.bytes`. When I looked at the documentation more closely, this limit is enforced at the partition level, not the topic level. For my project, we're using 8 partitions (so 8x the limit I anticipated), which left my disk woefully undersized. I'll let this run for a bit to see if the delete policy functions as expected.
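To make the per-partition math concrete, worst-case disk for one topic is roughly `partitions * retention_bytes * replication_factor`. With the 8 partitions and a 1 GiB per-partition limit from the comment above (replication factor 1 assumed for illustration):

```shell
# 8 partitions * 1 GiB per-partition limit * replication factor 1
echo $(( 8 * 1024 * 1024 * 1024 * 1 ))   # 8589934592 bytes, i.e. 8 GiB, not 1 GiB
```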
How did you go out of interest?
Hey @BenM-Mycelium, I'm just now seeing this reply, sorry about that. I ended up boosting the disk size quite a bit and setting a short expiration (10 minutes) with a 0.25 GB `log.retention.bytes` setting. At this point things are up and running, and I can see the topics level off at an appropriate size.
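For reference, a sketch of those settings as chart overrides, assuming the same `configurationOverrides` mechanism as above (`log.retention.minutes` takes precedence over `log.retention.hours` when both are set):

```yaml
cp-kafka:
  configurationOverrides:
    "log.retention.minutes": "10"
    "log.retention.bytes": "268435456"   # 0.25 GiB, enforced per partition
```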
Running the standard configuration in Google Cloud with `ksql` and `connect` disabled. It works fine for several days (3-4 days) with minimal usage (it's a dev environment), but eventually something occupies all available disk space.

It looks like the `TRACE` log level is active for the Kafka brokers, but I'm not sure how to change it. I tried with `KAFKA_LOG4J_ROOT_LOGLEVEL`, but it does not make any difference.

How do I change the log level or enable log rotation?
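On the log-level part: the `confluentinc/cp-kafka` image builds its `log4j.properties` from `KAFKA_LOG4J_*` environment variables, and the root level alone often isn't enough because several noisy loggers are configured explicitly. A sketch of the pod environment variables (how you inject them depends on your chart/operator version, so treat this as an assumption to verify against your chart's values):

```yaml
# Environment variables understood by the confluentinc/cp-kafka image.
- name: KAFKA_LOG4J_ROOT_LOGLEVEL
  value: "WARN"
- name: KAFKA_LOG4J_LOGGERS
  value: "kafka=WARN,kafka.controller=WARN,state.change.logger=WARN"
```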