I am hitting a runtime exception while load testing with 10 million records in my Kafka cluster (3 brokers), and a broker crashes every time this issue occurs. During my initial analysis I suspected the default value of vm.max_map_count (65K), so I raised it to 400K. On retesting the same load, the broker still crashes somewhere between 3 and 4 million records.
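For reference, this is how the limit was raised, persistently and for the running kernel (the drop-in file path is illustrative; the value mirrors the 400K mentioned above):

```shell
# Persist across reboots (file name is an assumption):
#   /etc/sysctl.d/99-kafka.conf  ->  vm.max_map_count = 400000
sudo sysctl -w vm.max_map_count=400000   # apply immediately
sysctl vm.max_map_count                  # verify the new value
```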
I have monitoring in place for process and OS sampling and see no performance anomalies during the load test. The data directories also have plenty of free disk space. Heap is set to 6 GB per broker (-Xmx and -Xms) on 60 GB RAM Linux servers.
I started seeing this after the major Kafka version upgrade to 3.x. Has anyone experienced this issue, and is any performance tuning needed on 3.x when testing such high load?
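For context on why I looked at vm.max_map_count first: each live log segment keeps its index files memory-mapped, so the broker's map count grows with partition and segment counts. A rough back-of-the-envelope estimate I used (the two-indexes-per-segment figure and the JVM baseline are assumptions, not measured values):

```python
def estimated_mmaps(partitions, segments_per_partition,
                    indexes_per_segment=2, jvm_baseline=5000):
    """Rough count of memory mappings a broker might hold.

    Assumes each live segment mmaps two index files (offset + time);
    jvm_baseline approximates mappings from the JVM itself, shared
    libraries, thread stacks, etc.
    """
    return partitions * segments_per_partition * indexes_per_segment + jvm_baseline

# e.g. 2000 partitions x 50 segments each:
print(estimated_mmaps(2000, 50))  # 205000 -- already past a ~200K limit
```

Comparing an estimate like this against `wc -l /proc/<broker-pid>/maps` during the test is how I convinced myself 400K should have been enough headroom.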
Error while appending records to LoadTestingTopic in dir /broder/log (org.apache.kafka.storage.internals.log.LogDirFailureChannel)
java.io.IOException: Map failed
at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1016)
at org.apache.kafka.storage.internals.log.AbstractIndex.createMappedBuffer(AbstractIndex.java:466)
at org.apache.kafka.storage.internals.log.AbstractIndex.createAndAssignMmap(AbstractIndex.java:104)
at org.apache.kafka.storage.internals.log.AbstractIndex.&lt;init&gt;(AbstractIndex.java:82)
at org.apache.kafka.storage.internals.log.OffsetIndex.&lt;init&gt;(OffsetIndex.java:69)
at org.apache.kafka.storage.internals.log.LazyIndex.loadIndex(LazyIndex.java:239)
at org.apache.kafka.storage.internals.log.LazyIndex.get(LazyIndex.java:179)
at kafka.log.LogSegment.offsetIndex(LogSegment.scala:67)
at kafka.log.LogSegment.canConvertToRelativeOffset(LogSegment.scala:130)
at kafka.log.LogSegment.ensureOffsetInRange(LogSegment.scala:177)
at kafka.log.LogSegment.append(LogSegment.scala:157)
at kafka.log.LocalLog.append(LocalLog.scala:439)
at kafka.log.UnifiedLog.append(UnifiedLog.scala:911)
at kafka.log.UnifiedLog.appendAsLeader(UnifiedLog.scala:719)
at kafka.cluster.Partition.$anonfun$appendRecordsToLeader$1(Partition.scala:1313)
at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:1301)
at kafka.server.ReplicaManager.$anonfun$appendToLocalLog$6(ReplicaManager.scala:1210)
at scala.collection.StrictOptimizedMapOps.map(StrictOptimizedMapOps.scala:28)
at scala.collection.StrictOptimizedMapOps.map$(StrictOptimizedMapOps.scala:27)
at scala.collection.mutable.HashMap.map(HashMap.scala:35)
at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:1198)
at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:754)
at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:686)
at kafka.server.KafkaApis.handle(KafkaApis.scala:180)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:149)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.OutOfMemoryError: Map failed
at java.base/sun.nio.ch.FileChannelImpl.map0(Native Method)
at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1013)
Java Version:
openjdk version "11.0.8" 2020-07-14 LTS OpenJDK Runtime Environment 18.9 (build 11.0.8+10-LTS)