Hmmm. That is interesting. I didn't observe that for our production systems. Debugging...
Hey @jimymodi I gave it a shot with a setup like that and I don't see the problem:
require 'kafka'
require 'memory_profiler'
require 'get_process_mem'

kafka = Kafka.new(
  'kafka://127.0.0.1'
)

consumer = kafka.consumer(
  offset_commit_threshold: 100,
  group_id: 'dadadada',
  fetcher_max_queue_size: 2
)

# It's possible to subscribe to multiple topics by calling `subscribe`
# repeatedly.
consumer.subscribe('test')

inv = 0

consumer.each_message do |message|
  if inv < 200000
    if inv % 10000 == 0
      mb = GetProcessMem.new.mb
      puts "#{ inv } - MEMORY USAGE(MB): #{ mb.round }"
    end
    inv += 1
  else
    break
  end
end
0 - MEMORY USAGE(MB): 34
10000 - MEMORY USAGE(MB): 41
20000 - MEMORY USAGE(MB): 40
30000 - MEMORY USAGE(MB): 40
40000 - MEMORY USAGE(MB): 40
50000 - MEMORY USAGE(MB): 40
60000 - MEMORY USAGE(MB): 40
70000 - MEMORY USAGE(MB): 40
80000 - MEMORY USAGE(MB): 42
90000 - MEMORY USAGE(MB): 42
100000 - MEMORY USAGE(MB): 42
110000 - MEMORY USAGE(MB): 42
120000 - MEMORY USAGE(MB): 42
130000 - MEMORY USAGE(MB): 42
140000 - MEMORY USAGE(MB): 42
It seems that the profiler is somehow aggregating a lot of data, causing memory to bloat. Also keep in mind that due to prefetching, ruby-kafka will load a significant amount of data upfront if you don't tune the fetcher_max_queue_size setting for your setup.
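For reference, a minimal consumer with a capped prefetch queue might look like this (the broker address, group id, and topic below are placeholders, not from this issue):

require 'kafka'

kafka = Kafka.new('kafka://127.0.0.1')  # placeholder broker address

# A small queue size bounds how many fetch responses the background
# fetcher thread may buffer ahead of the processing loop.
consumer = kafka.consumer(
  group_id: 'example-group',    # placeholder group id
  fetcher_max_queue_size: 1     # buffer at most one fetch response
)

consumer.subscribe('example-topic')  # placeholder topic

consumer.each_message do |message|
  # Process the message; resident memory stays roughly flat because
  # prefetching is bounded by fetcher_max_queue_size above.
end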
I feel that this issue can be safely closed as it is invalid.
Ah! Setting a lower fetcher_max_queue_size solved the problem. Thanks @mensfeld.
Shouldn't we change the default of 100 to something lower?
@jimymodi it is changed in the karafka framework: https://github.com/karafka/karafka/blob/master/lib/karafka/setup/config.rb#L86
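For anyone reading along, a rough sketch of overriding it in a karafka app, assuming karafka 1.x's setup DSL and that the ruby-kafka option is exposed under config.kafka (an assumption here, not verified against the linked file; the value is illustrative):

# A rough sketch, assuming a karafka 1.x application class.
class KarafkaApp < Karafka::App
  setup do |config|
    config.client_id = 'example_app'                   # placeholder
    config.kafka.seed_brokers = %w[kafka://127.0.0.1:9092]
    config.kafka.fetcher_max_queue_size = 10           # assumed key; illustrative value
  end
end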
@dasch seems to have a lot of memory available ;)
Steps to reproduce
Expected outcome
Memory should not be accumulating when consuming new messages.
Actual outcome
Memory consumption goes on increasing when consuming new messages.