heroku / l2meter

L2met-friendly log formatter
MIT License

Memory leak in ThreadSafe proxy #6

Closed stevenharman closed 5 years ago

stevenharman commented 5 years ago

I believe we have a memory leak in the L2meter::ThreadSafe proxy which is leaking instances of L2meter::Emitter into the Thread.current[].

The ThreadSafe proxy delegates most calls to its internal Emitter instance. It does so by forwarding methods through #receiver, which grabs the thread-local Emitter via #current_emitter. But #current_emitter builds its cache key from the wrapped Emitter instance, which can be different from the original Emitter instance. Further, every call to ThreadSafe#context returns a new ThreadSafe instance with an unset (nil) @thread_key. So the first method proxied through that new ThreadSafe adds yet another Emitter instance to Thread.current[] under a new key, because the wrapped Emitter is a clone.

Here is the relevant code from ThreadSafe; a minimal script to reproduce the leak follows below.

# lib/l2meter/thread_safe.rb
class ThreadSafe
  # Proxied methods go through #receiver, which resolves to #current_emitter
  def_delegators :receiver, *Emitter.instance_methods(false)

  def initialize(emitter)
    @emitter = emitter.freeze
  end

  def context(*args, &block)
    value = current_emitter.context(*args, &block)
    # When Emitter#context returns a new Emitter, wrap it in a fresh
    # ThreadSafe proxy whose @thread_key is still unset
    Emitter === value ? clone_with_emitter(value) : value
  end

  private

  def clone_with_emitter(emitter)
    self.class.new(emitter)
  end

  def current_emitter
    Thread.current[thread_key] ||= @emitter.clone
  end

  def thread_key
    # Keyed on the *wrapped* emitter's object_id, so every cloned proxy
    # registers a brand-new thread-local entry that is never cleaned up
    @thread_key ||= "_l2meter_emitter_#{@emitter.object_id}".freeze
  end
end
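
A quick way to see the key mismatch (illustrative only, separate from the full reproduction script below): every contextual proxy wraps a freshly cloned Emitter, so the key derived from its object_id never matches an entry already stored in Thread.current[].

log = L2meter.build
log.silence!

first  = log.context(:foo)  # ThreadSafe around Emitter clone A
second = log.context(:foo)  # ThreadSafe around Emitter clone B

first.log("hi")   # registers _l2meter_emitter_<clone A object_id>
second.log("hi")  # registers _l2meter_emitter_<clone B object_id>

Thread.current.keys.count { |k| String(k).start_with?("_l2meter_emitter_") }
# => 3: one entry for the original emitter plus one per contextual proxy that logged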

A minimal script to reproduce, named leak_emitter_in_thread:

#!/usr/bin/env ruby

require "get_process_mem"
require "2meter"

def count_thread_keys
  Thread.current.keys.count { |k| String(k).start_with?('_l2meter_emitter_') }
end

log = L2meter.build
log.silence! # keep the output quiet; leak happens either way

puts "Cold boot: #{GetProcessMem.new.mb}; Thread Keys: #{count_thread_keys}"

5.times do
  10_000.times do
    # Set a new context, which will clone the emitter (a few times)
    # and return new instance of `L2meter::ThreadSafe`, proxying to cloned emitter
    contextual_log = log.context(:bar)

    # Call a method that's proxied to the wrapped `Emitter` instance, causing
    # `ThreadSafe` to make the following calls:
    #   -> `#receiver` -> `#current_emitter`
    # which will add a new key and value to `Thread.current[]` b/c the `#emitter`
    # instance used to generate the `#thread_key` is the clone (thus different `#object_id`)
    # rather than the original emitter used on this thread.
    contextual_log.log('hi')
  end
  puts "Thread Keys: #{count_thread_keys}; Memory: #{GetProcessMem.new.mb}"
end

puts "Final: #{GetProcessMem.new.mb}"

Running the script:

$ leak_emitter_in_thread
Cold boot: 17.65625; Thread Keys: 1
Thread Keys: 4968; Memory: 23.83203125
Thread Keys: 9645; Memory: 30.1875
Thread Keys: 13181; Memory: 35.76171875
Thread Keys: 16185; Memory: 37.9765625
Thread Keys: 20779; Memory: 44.46484375
Final: 44.46484375

I'm not sure what the intended/desired behavior of this library is here: should we be reusing the same #thread_key and then updating Thread.current[]= with the new emitter, or should we be delegating differently?
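
For illustration only, the first option might look roughly like the sketch below (not a proposed patch; the cloned proxy reuses its parent's key and the shared slot is overwritten whenever a new contextual emitter is produced):

def context(*args, &block)
  value = current_emitter.context(*args, &block)
  if Emitter === value
    # Overwrite the shared thread-local slot with the contextual emitter
    Thread.current[thread_key] = value
    clone_with_emitter(value)
  else
    value
  end
end

def clone_with_emitter(emitter)
  self.class.new(emitter).tap do |proxy|
    # Reuse the parent's key instead of deriving a new one from the clone
    proxy.instance_variable_set(:@thread_key, thread_key)
  end
end

One catch with a shared slot is that the parent proxy would then also see the new context, so this is probably not a drop-in change.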

Any advice, @rwz?

rwz commented 5 years ago

I'm not sure what the intended/desired behavior of this library is here: should we be reusing the same #thread_key and then updating Thread.current[]= with the new emitter, or should we be delegating differently?

The goal is to prevent a situation where the same Emitter is used from two different threads and ends up with a mixed context. Like this:

logger = L2meter.build

thread_a = Thread.new do
  1_000.times do
    logger.context :bar do
      logger.log :hi
    end
  end
end

thread_b = Thread.new do
  1_000.times do
    logger.log :bye
  end
end

[thread_a, thread_b].each(&:join)

So, in this situation we need to make sure that bye is never mixed into the bar context. It might actually make sense to add a test for this, so we can play with this code without accidentally breaking things.
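
A rough sketch of what such a spec could look like (only checking that each thread ends up with its own thread-local Emitter, rather than asserting on formatted output; the key prefix is taken from the excerpt above):

# spec/thread_safe_leak_spec.rb (hypothetical name)
require "l2meter"

RSpec.describe L2meter::ThreadSafe do
  it "gives each thread its own emitter instance" do
    logger = L2meter.build
    logger.silence!

    find_emitter_key = -> do
      Thread.current.keys.find { |k| String(k).start_with?("_l2meter_emitter_") }
    end

    emitters = Array.new(2) do
      Thread.new do
        logger.log :hi                        # forces #current_emitter to run
        Thread.current[find_emitter_key.call] # this thread's registered clone
      end
    end.map(&:value)

    expect(emitters.first).not_to equal(emitters.last)
  end
end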

Honestly, at this point I have no idea how to fix this memory leak without changing the external API or making the usage ergonomics worse.

rwz commented 5 years ago

I have a spec for this ^^^, but I can't push it since I lost access to the repo when I was kicked out of the heroku org. Do you mind adding me back as a collab on this repo? :)

rwz commented 5 years ago

Should be addressed by #7