Describe the bug
I am infrequently observing high memory usage of otelcol.exe on Windows systems. The memory is, however, within the configured bounds of memory_limiter (see config below). I am not sure whether this memory usage is intended behavior, since there is negligible load on the systems. I am not experienced in Go, so please excuse my lack of knowledge.
My planned actions to counteract this behavior are to reduce the configured limit_mib and to fix the memory_ballast configuration.
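Roughly, the fix I have in mind looks like this — a sketch only; the concrete numbers are placeholders I still need to tune, not recommendations:

```yaml
extensions:
  # Pre-allocates a stable block of heap so the Go GC runs less often.
  memory_ballast:
    size_mib: 165  # placeholder, roughly 1/3 of limit_mib

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 500        # placeholder hard limit, reduced from the current value
    spike_limit_mib: 100  # soft limit = limit_mib - spike_limit_mib

service:
  # This was my config error: the extension must be listed here to be active.
  extensions: [memory_ballast]
```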
Steps to reproduce
Unclear how to reproduce. Full config: see below.
What did you expect to see?
otelcol should not consume "much" memory given that the workload is small, and/or memory should eventually be garbage collected.
What did you see instead?
The otelcol process uses up to 2 GB of memory. Currently, on my local machine, it sits at ~1.5 GB.
Last stdout log messages from memory_limiter after the machine was started from hibernation:
{"level":"info","ts":1652033885.9450645,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":197}
{"level":"info","ts":1652033885.8394392,"caller":"memorylimiterprocessor/memorylimiter.go:310","msg":"Memory usage is above soft limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1623}
Is this the intended behavior of memory usage of the process?
On my machine, the prometheus exporter currently emits ~ 19362 lines of metrics, less than 3 MB in size.
On our biggest instance, the prometheus exporter emits 37654 metric lines, ~ 8 MB.
Is this a lot?
The documentation on that page uses limit_mib: 4000, which seems quite large for this kind of application?
After re-reading those docs just now, this line caught my attention:
limit_mib (default = 0): Maximum amount of memory, in MiB, targeted to be allocated by the process heap. Note that typically the total memory usage of process will be about 50MiB higher than this value. This defines the hard limit.
So it actually is the expected behavior of otelcol to stay around limit_mib indefinitely?
My trace queue size is currently at zero:
# HELP otelcol_exporter_queue_size Current size of the retry queue (in batches)
# TYPE otelcol_exporter_queue_size gauge
otelcol_exporter_queue_size{exporter="jaeger",service_instance_id="fc4ff71e-1a39-4b13-9990-db1ec97e9c9b",service_version="latest"} 0
I had the suspicion that the trace buffer was filling memory while the VPN was disconnected and the jaeger server was not reachable, since the default queue size of 5000 seemed rather high.
Looking at logs from last night, it was the traces pipeline that caused the memory increase:
2022-05-07T01:22:01+02:00 {"level":"info","ts":1651879321.2295015,"caller":"memorylimiterprocessor/memorylimiter.go:303","msg":"Memory usage back within limits. Resuming normal operation.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":182}
2022-05-07T01:22:01+02:00 {"level":"info","ts":1651879321.2295015,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":182}
2022-05-07T01:20:19+02:00 {"level":"info","ts":1651879219.4642153,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1665}
2022-05-07T01:18:37+02:00 {"level":"info","ts":1651879117.4491055,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1661}
2022-05-07T01:16:55+02:00 {"level":"info","ts":1651879015.4828515,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1664}
2022-05-07T01:15:30+02:00 {"level":"info","ts":1651878930.5143886,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1683}
2022-05-07T01:14:05+02:00 {"level":"info","ts":1651878845.4196262,"caller":"memorylimiterprocessor/memorylimiter.go:281","msg":"Memory usage after GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":1683}
2022-05-07T01:22:01+02:00 {"level":"warn","ts":1651879321.100273,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2059}
2022-05-07T01:20:19+02:00 {"level":"warn","ts":1651879219.098343,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2049}
2022-05-07T01:18:37+02:00 {"level":"warn","ts":1651879117.0985374,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2048}
2022-05-07T01:16:55+02:00 {"level":"warn","ts":1651879015.103772,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2012}
2022-05-07T01:15:30+02:00 {"level":"warn","ts":1651878930.103912,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2005}
2022-05-07T01:14:05+02:00 {"level":"warn","ts":1651878845.099606,"caller":"memorylimiterprocessor/memorylimiter.go:291","msg":"Memory usage is above hard limit. Forcing a GC.","kind":"processor","name":"memory_limiter","pipeline":"traces","cur_mem_mib":2061}
If the sending_queue queue_size of 5000 does not fit within the memory limits, what is going to happen?
Are the oldest trace spans going to get dropped from the queue?
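If the default queue really is the culprit, shrinking it on the jaeger exporter should cap the worst case. A sketch with placeholder values (the endpoint, queue_size, and retry window are illustrative assumptions, not my actual settings):

```yaml
exporters:
  jaeger:
    endpoint: jaeger.example.local:14250  # placeholder endpoint
    sending_queue:
      enabled: true
      queue_size: 500  # placeholder, down from the default of 5000 batches
    retry_on_failure:
      enabled: true
      max_elapsed_time: 300s  # stop retrying after 5 minutes, e.g. while the VPN is down
```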
I did, however, also have one machine where otelcol suddenly used 2 GB of memory while no traces were being queued.
There were no warnings at the time; unfortunately I have no info logs from that date (otelcol was still logging to the Windows application event log).
There are no apps emitting traces on that machine yet, so I have no idea what happened there.
What are recommended values for memory_limiter? The documentation is a little vague.
I have collected and attached various debug information from http://localhost:1777/debug/pprof/, in case the memory usage is not okay.
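For completeness, the pprof extension behind that endpoint is configured roughly like this (a sketch; localhost:1777 is the default port, matching the URL above):

```yaml
extensions:
  pprof:
    endpoint: localhost:1777  # serves /debug/pprof/ on this address

service:
  extensions: [pprof]
```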
What version did you use?
What config did you use?
I am using two config files (currently three, to debug) consisting of the following parts:
ConfigFile 1 (common and metrics):
I just noticed an error in my config... the memory_ballast extension is not used, it seems.
Second file, used optionally (if traces are required):
Third config file, currently used to add pprof:
Environment
OS: Windows 10 21H2, Windows Server 2019