Closed. JackCoulson closed this issue 3 years ago.
A large number of RecordAccumulator and BufferPool instances remain in memory until OutOfMemoryError
While looking at the source code of the producerRemoved method of MicrometerProducerListener, I found that spring-kafka's MicrometerProducerListener references the KafkaClientMetrics class, but the jar containing KafkaClientMetrics is not on my classpath. I don't know whether that matters.
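For context, here is a minimal sketch of what a per-producer metrics listener of this kind does, assuming the spring-kafka 2.5.x MicrometerProducerListener pattern and the micrometer-core KafkaClientMetrics binder. The class below is illustrative, not the actual spring-kafka implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.kafka.clients.producer.Producer;

import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.binder.kafka.KafkaClientMetrics;

// Illustrative listener: binds KafkaClientMetrics when a producer is created
// and closes it when the producer is removed from the factory's cache.
// If close() fails to deregister the meters (the pre-1.7 Micrometer behavior
// referenced later in this thread), the registry keeps strong references to
// the producer's internals (RecordAccumulator, BufferPool) and they can
// never be garbage-collected.
public class IllustrativeProducerMetricsListener<K, V> {

    private final MeterRegistry registry;

    private final Map<String, KafkaClientMetrics> metrics = new ConcurrentHashMap<>();

    public IllustrativeProducerMetricsListener(MeterRegistry registry) {
        this.registry = registry;
    }

    public void producerAdded(String id, Producer<K, V> producer) {
        KafkaClientMetrics clientMetrics = new KafkaClientMetrics(producer);
        clientMetrics.bindTo(this.registry);
        this.metrics.put(id, clientMetrics);
    }

    public void producerRemoved(String id, Producer<K, V> producer) {
        KafkaClientMetrics removed = this.metrics.remove(id);
        if (removed != null) {
            removed.close(); // must deregister the meters, or the producer leaks
        }
    }
}
```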
The following log entries are repeated extensively:
```
2021-08-27 22:50:00 546 ERROR [] org.springframework.kafka.support.LoggingProducerListener error:254|Exception thrown when sending a message with key='null' and payload='{"TimeStamp":"2021-08-27 22:50:00","requestId":"88d30aa2-f913-d718-cf09-a1f3c7da900e","spanId":"0","...' to topic iLogsTopic:
org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
2021-08-27 22:50:00 546 ERROR [] org.springframework.kafka.support.LoggingProducerListener error:254|Exception thrown when sending a message with key='null' and payload='{"TimeStamp":"2021-08-27 22:50:00","requestId":"ad9d1acf-5f09-b530-8407-317c6c665299","spanId":"0","...' to topic iLogsTopic:
org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
2021-08-27 22:50:00 682 ERROR [] org.springframework.kafka.support.LoggingProducerListener error:254|Exception thrown when sending a message with key='null' and payload='{"TimeStamp":"2021-08-27 22:50:00","requestId":"ddc005ed-a7e1-04ac-c8e6-d62d920dbf34","spanId":"0","...' to topic iLogsTopic:
org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker received an out of order sequence number.
```
The above repeats many more times, then:
```
2021-08-27 22:50:01 269 WARN [] org.springframework.kafka.core.DefaultKafkaProducerFactory warn:262|Error during some operation; producer removed from cache: CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@36539996]
2021-08-27 22:50:01 269 INFO [] org.apache.kafka.clients.producer.KafkaProducer close:1182|[Producer clientId=producer-7] Closing the Kafka producer with timeoutMillis = 30000 ms.
2021-08-27 22:50:01 269 WARN [] org.apache.kafka.clients.producer.KafkaProducer close:1189|[Producer clientId=producer-7] Overriding close timeout 30000 ms to 0 ms in order to prevent useless blocking due to self-join. This means you have incorrectly invoked close with a non-zero timeout from the producer call-back.
2021-08-27 22:50:01 269 INFO [] org.apache.kafka.clients.producer.KafkaProducer close:1208|[Producer clientId=producer-7] Proceeding to force close the producer since pending requests could not be completed within timeout 30000 ms.
```
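The "self-join" warning above appears when close() is invoked on the producer's own I/O thread, which is where send callbacks run. A minimal sketch of that anti-pattern, assuming a plain KafkaProducer (the topic name is taken from the log; the error handling is invented for illustration):

```java
import java.time.Duration;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CloseFromCallbackAntiPattern {

    public static void send(KafkaProducer<String, String> producer) {
        ProducerRecord<String, String> record =
                new ProducerRecord<>("iLogsTopic", "payload");
        producer.send(record, (metadata, exception) -> {
            if (exception != null) {
                // Anti-pattern: this callback runs on the producer's I/O
                // thread, so close(timeout) would have to join that same
                // thread. The client detects this, overrides the timeout to
                // 0, and logs the "self-join" warning seen above.
                producer.close(Duration.ofSeconds(30));
            }
        });
    }
}
```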
```
2021-08-27 22:50:01 270 INFO [] org.apache.kafka.clients.producer.ProducerConfig logAll:347|ProducerConfig values:
acks = 1
batch.size = 5242880
bootstrap.servers = [14.158.162.101:9082]
buffer.memory = 104857600
client.dns.lookup = default
client.id = producer-8
compression.type = none
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
interceptor.classes = []
key.serializer = class org.apache.kafka.common.serialization.StringSerializer
linger.ms = 1
max.block.ms = 10000
max.in.flight.requests.per.connection = 5
max.request.size = 1048576
metadata.max.age.ms = 300000
metadata.max.idle.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
retries = 5
retry.backoff.ms = 100
sasl.client.callback.handler.class = null
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.login.callback.handler.class = null
sasl.login.class = null
sasl.login.refresh.buffer.seconds = 300
sasl.login.refresh.min.period.seconds = 60
sasl.login.refresh.window.factor = 0.8
sasl.login.refresh.window.jitter = 0.05
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
security.providers = null
send.buffer.bytes = 131072
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2]
ssl.endpoint.identification.algorithm = https
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLSv1.2
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
transaction.timeout.ms = 60000
transactional.id = null
value.serializer = class org.apache.kafka.common.serialization.StringSerializer
2021-08-27 22:50:01 273 INFO [] org.apache.kafka.common.utils.AppInfoParser <init>:117|Kafka version: 2.5.0
2021-08-27 22:50:01 274 INFO [] org.apache.kafka.common.utils.AppInfoParser <init>:118|Kafka commitId: 66563e712b0b9f84
2021-08-27 22:50:01 274 INFO [] org.apache.kafka.common.utils.AppInfoParser <init>:119|Kafka startTimeMs: 1630075801273
```
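For reference, the non-default values in the dump above map to producer properties roughly as follows. This is a sketch only, with the broker address and serializers copied from the log and the rest being standard ProducerConfig keys:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import org.springframework.kafka.core.DefaultKafkaProducerFactory;

public class ProducerFactoryConfigSketch {

    public static DefaultKafkaProducerFactory<String, String> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        // Values copied from the ProducerConfig dump in the log above.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "14.158.162.101:9082");
        props.put(ProducerConfig.ACKS_CONFIG, "1");
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 5242880);
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 104857600L);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000L);
        props.put(ProducerConfig.RETRIES_CONFIG, 5);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(props);
    }
}
```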
Please read up on GitHub markdown to properly format code; you need three back-ticks on a separate line before and after the code block.
This was fixed in Micrometer 1.7.
https://github.com/micrometer-metrics/micrometer/issues/2018
@garyrussell Thank you very much. I was in a hurry and forgot to format the code. Sorry. I changed it.
Closing on the assumption that upgrading solved the problem; please reopen if you are still having problems.
Version: 2.5.6.RELEASE. Description: Last Friday our servers were restarted on a large scale. The heap dump showed that a large number of Producers had been created and shut down, but the Producers still held memory after being closed. To be precise, a large number of RecordAccumulator and BufferPool instances referenced from KafkaMetric were never released.
Why is the memory not released after the Producer is closed? I don't know what to do now and am asking for help. I only created a single KafkaTemplate.
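For what it's worth, a minimal sketch of the single-template setup described above, assuming one singleton DefaultKafkaProducerFactory; the class and method names here are illustrative:

```java
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;

public class SingleTemplateSketch {

    private final DefaultKafkaProducerFactory<String, String> factory;
    private final KafkaTemplate<String, String> template;

    public SingleTemplateSketch(DefaultKafkaProducerFactory<String, String> factory) {
        this.factory = factory;
        // One template over one singleton factory; the factory caches a
        // single physical KafkaProducer behind CloseSafeProducer.
        this.template = new KafkaTemplate<>(factory);
    }

    public void shutdown() {
        // Closes the cached physical producer(s). When the factory is a
        // Spring-managed bean, destroy() does the equivalent automatically
        // on context shutdown.
        this.factory.reset();
    }
}
```

If the closed producers still show up in heap dumps after this, whatever keeps referencing them is the leak; in this thread's case that was meters that were never deregistered, per the Micrometer issue linked above.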