oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

concurrent-ruby Fixed Thread Pool memory leak #3558

Open brauliobo opened 2 months ago

brauliobo commented 2 months ago

When running an update on a large dataset I'm seeing a big and steady memory leak with TruffleRuby. The code is proprietary but in essence it is based on a FixedThreadPool from concurrent-ruby:

def ds_waning_peach limit: 1_000, offset: nil
  ds = self.limit limit
  page = ds.all
  pool_run do |pool|
    begin 
      page.each{ |r| pool.post{ yield r } }
      sleep 1 while pool.queue_length > pool.max_length * 2
    end while (page = ds.offset(pool.queue_length).all).present?
  end
end

The above code runs on a large Sequel dataset with a PostgreSQL database.

TruffleRuby Native: steady and fairly fast memory increase (40 GB after 1h of CPU usage) [screenshot]

TruffleRuby JVM: slower but steady memory increase (25 GB after 1h of CPU usage) [screenshot]

In the screenshot above, Ruby 3.1.2 compiled with jemalloc (pid 1006066) is also running, with memory usage stabilized at around 1.2 GB.

So the same code is run with all three Rubies, using the same Gemfile.lock.

andrykonchin commented 2 months ago

Thank you for the report, we'll look into it.

eregon commented 2 months ago

Could you share a reproducer we can run? It's very difficult to diagnose this without being able to run it.

Note also that the JVM (and Native Image too, BTW) will use memory up to its Xmx (which defaults to 25% of total physical RAM with G1 GC if no Xmx is passed) if it thinks that is more efficient for the GC, allocations, etc. So one thing you could try is passing e.g. --vm.Xmx4g to tell it to use at most 4 GB. If it still uses far more memory than the Xmx, it might be an issue with native memory allocation, or maybe allocations by the JIT compiler.
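
To compare actual process memory against the Xmx limit, a small logger along these lines could be added to the script. This is only a sketch under some assumptions: the helper names rss_kb and start_rss_logger are made up for illustration, it reads /proc on Linux and falls back to `ps -o rss=` on other Unix-like systems, and it measures resident set size, which includes native allocations as well as the managed heap.

```ruby
# Resident set size (RSS) of a process in kilobytes.
# Assumption: Linux /proc is available, otherwise a POSIX-style `ps`.
def rss_kb(pid = Process.pid)
  status = "/proc/#{pid}/status"
  if File.readable?(status)
    # The VmRSS line looks like "VmRSS:     123456 kB"
    File.read(status)[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i
  else
    Integer(`ps -o rss= -p #{pid}`.strip)
  end
end

# Prints the current RSS every `interval` seconds from a background thread.
# Returns the thread so the caller can stop it with Thread#kill.
def start_rss_logger(interval: 10, io: $stderr)
  Thread.new do
    loop do
      io.puts format("%s rss=%.1f MB", Time.now.strftime("%H:%M:%S"), rss_kb / 1024.0)
      sleep interval
    end
  end
end
```

Calling start_rss_logger at the top of the script and running it with --vm.Xmx4g would show whether RSS levels off near 4 GB or keeps growing well past it; the latter would point toward memory outside the managed heap.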