pablitovicente closed this issue 11 months ago.
The NATS server is written in Go, and the Go runtime needs to be told about memory limits explicitly; it does not see container limits on its own. This is handled in the new Helm chart, AFAIK.
If that is not set, and the machine has, say, 64G while the container is limited to 4G, the NATS server still sees 64G, and that is what the runtime uses to decide when to run the GC, etc.
So setting the GOMEMLIMIT env variable can make the GC kick in earlier, which will return memory to the kernel (depending on your kernel version).
https://weaviate.io/blog/gomemlimit-a-game-changer-for-high-memory-applications
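For illustration, here is a minimal Go sketch of the mechanism (the 3GiB figure is just an example value, not a recommendation): since Go 1.19 the runtime reads GOMEMLIMIT from the environment at startup, and debug.SetMemoryLimit is the programmatic equivalent.

```go
package main

import (
	"os"
	"runtime/debug"
)

func main() {
	// The Go 1.19+ runtime reads GOMEMLIMIT (e.g. "3GiB") from the
	// environment at startup; SetMemoryLimit is the programmatic
	// equivalent. It is a soft limit: the GC runs more aggressively
	// as the heap approaches it, instead of sizing itself against the
	// machine's total RAM.
	if os.Getenv("GOMEMLIMIT") == "" {
		debug.SetMemoryLimit(3 << 30) // 3 GiB, an illustrative value
	}
}
```

In a container you would normally just set the env variable to something comfortably below the container's memory limit rather than calling this in code.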
Thanks @derekcollison, I will look into that.
A pprof capture in case it is useful.
go tool pprof "http://localhost:50001/debug/pprof/heap"
Fetching profile over HTTP from http://localhost:50001/debug/pprof/heap
Saved profile in /home/ronin/pprof/pprof.nats-server.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
File: nats-server
Type: inuse_space
Time: Aug 15, 2023 at 7:16pm (CEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 449.52MB, 99.56% of 451.52MB total
Dropped 22 nodes (cum <= 2.26MB)
Showing top 10 nodes out of 30
      flat  flat%   sum%        cum   cum%
  265.90MB 58.89% 58.89%   265.90MB 58.89%  github.com/nats-io/nats-server/v2/server.subjFromBytes
  171.50MB 37.98% 96.87%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*fileStore).populateGlobalPerSubjectInfo
    5.51MB  1.22% 98.09%     5.51MB  1.22%  runtime.allocm
    3.61MB   0.8% 98.89%   272.51MB 60.35%  github.com/nats-io/nats-server/v2/server.(*msgBlock).readPerSubjectInfo
    3.01MB  0.67% 99.56%     3.01MB  0.67%  os.ReadFile
         0     0% 99.56%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*Account).EnableJetStream
         0     0% 99.56%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*Account).addStream
         0     0% 99.56%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*Account).addStreamWithAssignment
         0     0% 99.56%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*Server).EnableJetStream
         0     0% 99.56%   444.02MB 98.34%  github.com/nats-io/nats-server/v2/server.(*Server).Start
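(Aside for anyone reproducing this: the endpoint above is the standard Go net/http/pprof handler, which nats-server exposes when a profiling port is configured; prof_port in the server config, if I am not mistaken. A minimal sketch of serving the same endpoints from any Go program:)

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
	// Serving DefaultServeMux exposes /debug/pprof/heap and friends,
	// the same endpoints queried with `go tool pprof` above.
	log.Fatal(http.ListenAndServe("localhost:50001", nil))
}
```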
This led me to suspect subjFromBytes was related to subjects, and in fact the stream seems to have a subject per key?
nats stream info KV_status
Information for Stream KV_status created 2023-08-15 12:59:40
Subjects: $KV.status.>
Replicas: 1
Storage: File
Options:
Retention: Limits
Acknowledgements: true
Discard Policy: New
Duplicate Window: 2m0s
Direct Get: true
Allows Msg Delete: true
Allows Purge: true
Allows Rollups: true
Limits:
Maximum Messages: unlimited
Maximum Per Subject: 5
Maximum Bytes: unlimited
Maximum Age: unlimited
Maximum Message Size: unlimited
Maximum Consumers: unlimited
Cluster Information:
Name: IIOT
Leader: NATS-1
State:
Messages: 5,090,414
Bytes: 860 MiB
FirstSeq: 1 @ 2023-08-15T10:59:40 UTC
LastSeq: 5,090,414 @ 2023-08-15T11:05:42 UTC
Active Consumers: 0
Number of Subjects: 5,090,414
So you have 5M+ keys. We keep per-subject info in memory and do not drop it in 2.9.x, but we are looking at getting better at dropping some of this for idle assets in 2.10.
Yeah, I am kind of stress testing, but I would expect to have a couple of million keys for my use case. I am considering using the K/V store for status tracking of objects that are state machines reporting their state transitions, and I need to keep some history for them.
The NATS K/V store seems like a very good choice for my use case, and the flexibility of using either a native NATS connection or WebSockets for transport matches it very well.
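To make the intended usage concrete, here is a rough sketch with the nats.go client; the bucket name matches the stream above, while the key and state values are made up. History: 5 is what produces the "Maximum Per Subject: 5" limit seen earlier, and each key gets its own subject ($KV.status.<key>), which matches the subject-per-key count in the stream info.

```go
package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// History: 5 keeps the last five revisions of every key, i.e. the
	// "Maximum Per Subject: 5" limit on the backing KV_status stream.
	kv, err := js.CreateKeyValue(&nats.KeyValueConfig{
		Bucket:  "status",
		History: 5,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical key and state value; each Put is stored on the
	// subject $KV.status.machine.42.
	if _, err := kv.Put("machine.42", []byte("RUNNING")); err != nil {
		log.Fatal(err)
	}

	// Read back the retained revisions for this key.
	entries, err := kv.History("machine.42")
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range entries {
		fmt.Printf("rev %d: %s\n", e.Revision(), string(e.Value()))
	}
}
```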
So I followed the linked article and set GOMEMLIMIT in my docker-compose. That seems to keep memory at bay, and performance seems quite similar; it is reasonable to expect somewhat higher latency from the more frequent GC.
I guess we can close this one then, as it is expected behavior.
Thank you very much for your help.
We should have some improvements in 2.10 as well.
Defect
I am testing NATS JetStream and the Key/Value Store, and I am seeing that memory consumption does not go down after all activity against the NATS server stops, not even after waiting up to half an hour.
nats-server -DV output:
Versions of nats-server and affected client libraries used:
OS/Container environment:
Steps or code to reproduce the issue:
Expected result:
Actual result: