fluent / fluent-plugin-prometheus

A fluent plugin that collects metrics and exposes for Prometheus.
Apache License 2.0
258 stars 79 forks source link

Fix resource leak in async wrapper #171

Closed eagletmt closed 4 years ago

eagletmt commented 4 years ago

Signed-off-by: Kohei Suzuki eagletmt@gmail.com


This patch fixes EMFILE error described in #168 and includes the same fix with #169 . It's also related to #170 (~not confirmed yet~ confirmed in our environment https://github.com/fluent/fluent-plugin-prometheus/issues/170#issuecomment-678782555).

ioquatix commented 4 years ago

I got asked about this issue.

response.read will read all chunks from the response into one big string. response.body.read will read one chunk from the response.

They are completely different. If you are expecting a large response body (e.g. multi-GiB) then you would want to avoid response.read.

Life cycle is very important in Async.

If you run inside a reactor, resources will persist because it's more efficient (by a significant amount). Thinks like HTTP/2 multiplexing and persistent connections should be used by default, unlike most (all?) other clients.

If you write client = Client.new(...) and you do not close it, and you do not leave async scope, and you hold onto a connected response, it will leave the connection open until you close the resource.

Most libraries will do something like:

string_buffer = client(...).get(url)

But async is streaming the response in real time, so the latency is reduced, and it looks more like

response_and_active_connection = client(...).get(url)

Therefore, you must be aware of this and close the response, or in the case of this PR, close the entire client (forcefully terminates the any active connections).

Depending on your use case, you can see significant benefits from using multiplexed connections. In one case, I saw sequential requests of 20s go to multiplexed requests taking only around 1s.

ioquatix commented 4 years ago

The current pool logic is implemented here but the default implementation doesn't do anything.

https://github.com/socketry/async-pool/blob/c5cdfe56db91680c561ffb070a6815cdafe06717/lib/async/pool/controller.rb#L153-L160

Ideally, we have a low water mark and high water mark, and we try to keep idle connections within some limit.

This method #prune would serve as the basis for the implementation/behaviour.

https://github.com/socketry/async-pool/blob/c5cdfe56db91680c561ffb070a6815cdafe06717/lib/async/pool/controller.rb#L115-L136

I welcome any ideas/feedback.

repeatedly commented 4 years ago

Reelased v1.8.3

ioquatix commented 4 years ago

Can you give me an update about memory usage?

eagletmt commented 4 years ago

In our environment, I've confirmed that upgrading fluent-plugin-prometheus to v1.8.3 has the same effect as "w/ async-http patched" version of https://github.com/fluent/fluent-plugin-prometheus/issues/170#issuecomment-678782555 .