elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[filebeat] memory leak when embedded into user application #39865

Open blanche789 opened 3 weeks ago

blanche789 commented 3 weeks ago

Using the Filebeat source code, I integrated Filebeat into my own application, and it collects logs without problems. However, over time the memory usage keeps growing. I suspect a memory leak, but I have not been able to pinpoint the problem through pprof analysis.

The Filebeat source I integrated is at commit 6cb79c0b061f8663cfe1f143aacc602a923ee9e1.

My filebeat.yaml configuration is as follows: [screenshot attached]

My sub-filebeat.yaml configuration is as follows: [screenshot attached]

This is the memory trend graph of the log collection program: [screenshot attached]

Below is my pprof analysis chart: [image: profile001]
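
For reference, a minimal way to expose live profiles from the embedding application is the standard net/http/pprof handler, so heap, alloc_space, and goroutine profiles can be pulled over time. The sketch below is illustrative and not taken from the reporter's code; the port and structure are assumptions.

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Serve pprof on a side port so profiles can be collected periodically
	// while the embedded Filebeat keeps running.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the embedded Filebeat and the rest of the application here ...
	select {}
}
```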

elasticmachine commented 3 weeks ago

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

cmacknz commented 3 weeks ago

The fact that you have embedded Filebeat in another application complicates this a bit. This use case falls outside of what we'd typically support or investigate, but we can do a quick investigation, as memory leaks are a severe enough problem that I want to double-check this isn't going to come back to haunt us.

What is the time scale of the graph you posted? How long did it take to grow to the 200+ MB mark and does it stabilize?

The inuse_space profile you attached is not that interesting. Does anything jump out on the alloc_space profile as allocating excessively frequently?
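
One way to get that comparison: a single heap snapshot written with runtime/pprof carries both alloc_space and inuse_space sample types, so the same file can be viewed with either index. A minimal sketch, assuming the snapshot is taken from inside the embedding process; the file name is arbitrary.

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// dumpHeapProfile writes a heap snapshot to path. The resulting file contains
// both alloc_space and inuse_space samples, so the same snapshot can be viewed
// either way, e.g.:
//
//	go tool pprof -sample_index=alloc_space heap.pprof
func dumpHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	runtime.GC() // flush recently freed objects so the in-use numbers are current
	return pprof.WriteHeapProfile(f)
}

func main() {
	if err := dumpHeapProfile("heap.pprof"); err != nil {
		log.Fatal(err)
	}
}
```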

It is also possible the leak is somewhere the Go runtime can't see, for example if you are leaking resources or failing to close handles at the OS level.
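
A cheap way to check for that kind of leak is to log the goroutine count and open file descriptors alongside the Go heap over time. A rough sketch; the interval and the Linux-specific /proc/self/fd check are assumptions, not something the reporter has set up.

```go
package main

import (
	"log"
	"os"
	"runtime"
	"time"
)

// logResourceUsage periodically prints counters that grow when resources leak
// outside the Go heap: goroutines (stuck readers/writers) and open file
// descriptors (unclosed files or sockets). /proc/self/fd is Linux-only.
func logResourceUsage(interval time.Duration) {
	for range time.Tick(interval) {
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms)

		fds, _ := os.ReadDir("/proc/self/fd")
		log.Printf("goroutines=%d open_fds=%d heap_inuse=%d MiB sys=%d MiB",
			runtime.NumGoroutine(), len(fds),
			ms.HeapInuse>>20, ms.Sys>>20)
	}
}

func main() {
	go logResourceUsage(time.Minute)
	select {} // stand-in for the real application
}
```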

blanche789 commented 3 weeks ago

Thank you for your answer. Due to business requirements, we need to embed Filebeat into our program. Below is the alloc_space analysis diagram. I implemented a custom receiver and use pgzip to compress the messages. [image: profile002]
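
Speculative note, not a confirmed diagnosis: with github.com/klauspost/pgzip, a writer that is created per message and never closed keeps its block buffers (and any compression workers) alive, which can look like steady growth. Below is a hedged sketch of the reuse pattern, assuming pgzip's gzip-compatible API; handleMessage is an illustrative name, not the reporter's code.

```go
package main

import (
	"bytes"
	"log"

	"github.com/klauspost/pgzip"
)

// handleMessage compresses one event payload. Reusing a single writer via
// Reset, and always calling Close, matters: resources held by a pgzip writer
// are only released on Close, so a new, unclosed writer per message
// accumulates memory over time.
func handleMessage(zw *pgzip.Writer, buf *bytes.Buffer, payload []byte) ([]byte, error) {
	buf.Reset()
	zw.Reset(buf)
	if _, err := zw.Write(payload); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // flushes output and releases the writer's resources
		return nil, err
	}
	// Copy the result out, since buf is reused for the next message.
	return append([]byte(nil), buf.Bytes()...), nil
}

func main() {
	var buf bytes.Buffer
	zw := pgzip.NewWriter(&buf)

	out, err := handleMessage(zw, &buf, []byte("example event"))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("compressed %d bytes", len(out))
}
```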

Memory increased to 200+ MB between June 7th and June 12th. [screenshot attached]