toptal / chewy

High-level Elasticsearch Ruby framework based on the official elasticsearch-ruby client
MIT License

Closing open connections to Elasticsearch #951

Open ClearlyClaire opened 3 months ago

ClearlyClaire commented 3 months ago

We use Chewy and import documents in a custom Sidekiq worker. We also use Sidekiq for other purposes, and it is expected that some unrelated jobs will sometimes fail.

This means a single Sidekiq worker process will regularly terminate threads and start new ones (as Sidekiq does on job failure), which results in new Chewy.client instances.

One thing we noticed, though, is that the underlying Elasticsearch connections are only closed when Ruby's garbage collector collects the dead thread's Chewy.client instance, which seems to be the cause of a file descriptor leak in our application.

We believe we have found a way to close these connections by adding the following code to the error handler in a custom Sidekiq middleware:

```ruby
Chewy.client.transport.transport.connections.each do |connection|
  # This bit of code is tailored for the HTTPClient Faraday adapter
  connection.connection.app.instance_variable_get(:@client)&.reset_all
end
```
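
For completeness, here is roughly how this could be wired into a Sidekiq server middleware. The class name is just for illustration, and the reset logic still assumes the faraday-httpclient adapter, so treat it as a sketch rather than a recommendation:

```ruby
# Hypothetical server middleware: on job failure, close the Elasticsearch
# connections held by this thread's Chewy.client before Sidekiq discards
# the thread, then re-raise so the failure is handled as usual.
class ChewyConnectionCleanup
  def call(_worker, _job, _queue)
    yield
  rescue StandardError
    Chewy.client.transport.transport.connections.each do |connection|
      # Tailored to faraday-httpclient; @client is not a public API.
      connection.connection.app.instance_variable_get(:@client)&.reset_all
    end
    raise
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add ChewyConnectionCleanup
  end
end
```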

However, this piece of code breaks through multiple layers of abstraction, reaching across chewy, elasticsearch, elasticsearch-transport, faraday, and faraday-httpclient, and at one point even accesses an otherwise unexposed instance variable.

Is there a better way to manually close Chewy's connections to Elasticsearch? Are we missing something obvious about their lifecycle?


Digging into it, my understanding of the issue is that neither chewy, elasticsearch, nor elasticsearch-transport provides a method to close connections.

It looks like faraday has Faraday::Connection#close, but that appears not to be implemented in most adapters, and in particular not in the faraday-httpclient adapter that ends up being used in our app.
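
If the adapter did implement it, something along these lines would let us stay within Faraday's public API instead of poking at instance variables (a sketch only, since with faraday-httpclient the call does not actually close the sockets today):

```ruby
Chewy.client.transport.transport.connections.each do |connection|
  # connection.connection is the underlying Faraday::Connection; #close is
  # part of Faraday's public API, but it only helps if the adapter in use
  # actually implements it (faraday-httpclient currently does not).
  connection.connection.close
end
```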

I opened a similar issue for the elasticsearch gem: https://github.com/elastic/elasticsearch-ruby/issues/2389

Of course, I may have missed something, and would be glad to hear what it is if that's the case!