Open FLemon opened 6 years ago
this is found on a delayed_job worker, working off massive queue of jobs. and 10,000 jobs will cause 10,000 threads leaks
@FLemon We have run into this as well, which in my development environment on macOS actually causes the process to run out of Threads (default max threads seems to be much lower in macOS than linux, but haven't found exactly where that's set).
What's worked for us is adding reaping_frequency: 0
to our database.yml. We also needed to add it to our config.tenant_names
hash in the apartment initializer, because we're not using schemas. Here's the part of ActiveRecord responsible for the reaper:
As you can see, setting the frequency to 0 causes the threaded part of the code to be skipped.
This is the first time I've really dug into ActiveRecord internals, so I may have some of these details wrong, but this is the explanation I've come up with so far.
Anytime Apartment establishes a connection, a new connection pool seems to be created, and the connection pool creates this Reaper thread. There appears to also be a memory leak of connection pools, at least when reaping_frequency
is not set to 0. I'm currently trying to figure out if this is also the case when disabling the reaper, but if it is still leaking memory, it is happening much, much more slowly.
Another interesting thing to note is that if you have excluded models, another connection pool is created for each one. In our case, we have 15 excluded models, which means every request was creating 17 leaked threads (15 for excluded models, 1 for the elevator switching to the desired tenant, and 1 more when the elevator switched back to the default tenant). That causes the thread count to max out very quickly. Considering these excluded models are all accessing the same "global" database, I would have expected them to share the connection pool, but this does not seem to be the case.
The excluded model issue, however, only appears to happen in development
(or at least it exhibits the most extreme bad behavior in development). Apartment is using the config.to_prepare
block in its Railtie:
https://github.com/influitive/apartment/blob/development/lib/apartment/railtie.rb#L29
I looked this config block up, and it says:
Run after the initializers are run for all Railties (including the application itself), but before eager loading and the middleware stack is built. More importantly, will run upon every request in development, but only once (during boot-up) in production and test.
https://guides.rubyonrails.org/configuring.html#initialization-events
So, assuming I'm correct in that a connection pool is created for every excluded model, and that there is a memory leak of those connection pools, then every request/job in the development environment is leaking an additional N connection pools, where N is the number of excluded models. When reaping is enabled, this also results in N leaked threads.
@adamkrone although i'm using this gem with config.use_schemas = false, I also experienced the issue with excluded models using one connection for each model.
It can also cause issues with associations since Rails tries to create associated records in different transactions/connections and they will fail.
Here is one way to fix this, in case it helps someone:
class BaseSharedModel < ActiveRecord::Base
self.abstract_class = true
establish_connection Rails.configuration.database_configuration[Rails.env]
end
class User < BaseSharedModel
end
class AnotherModel < BaseSharedModel
end
At least with mysql2 + use_schemas = false, this will make those models share the same connection.
we end up switch to using schemas, that stops AR from creating new thread for a DB switch, as all schemas are inside the same DB
just to clarify, is this only an issue with use_schemas = false
? your original issue report states use_schema = true
.
this is caused by a rails bug which causes a thread leak in the Reaper (as some of you had already understood), when a new connection pool is created.
Steps to reproduce
top -Hc
Apartment::Tenant.switch('tenant_name') { p '' }
Expected behavior
cache or somehow reuse already established connection when doing
#switch
, to avoid threads leakActual behavior
observe threads increased by 2, one for switching to, one for switching back, regardless the value of switching to and switching back are the same.
System configuration
Database: postgresSQL 9.6
Apartment version: 2.1
Apartment config (in
config/initializers/apartment.rb
or so):use_schemas
:true
Rails (or ActiveRecord) version: 5.1.6
Ruby version: 2.5