Closed AMarti96 closed 11 months ago
To provide more detail, I was able to extract the number of connections on our current NATS server (running inside a Kubernetes cluster, with Streams/Consumers populated via NACK). Using nats-top I was able to get the following:
As you can see, all of them (the 95 connections in this specific screenshot) are from jetstream-controller, which is NACK creating the Streams/Consumers and never disconnecting.
The problem seems to occur only when the connection to NATS is defined in the account CRD. In the code, that means when crdConnect is set to true.
When a single NATS connection is set in the overall server settings (crdConnect set to false), it doesn't matter how many objects I create or how many times the connection is retried; only 1 connection is reported:
With that in mind, I think the error may come in this part of the code:
Thanks for reporting! We should be able to put a connection pooler into NACK to prevent this. There is already an implementation in the nats-surveyor repo.
We'll port it over and should be able to get that done next week.
Connection pooling reference from nats-surveyor: https://github.com/nats-io/nats-surveyor/blob/main/surveyor/conn_pool.go
Connection pool added in v0.14.0
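For readers unfamiliar with the idea, here is a minimal sketch of a reference-counted connection pool keyed by account. It is not the nats-surveyor code linked above, nor the pool that shipped in v0.14.0; `Conn`, `Dialer`, `Pool`, and `fakeConn` are hypothetical names introduced only for illustration, and a real version would wrap an actual NATS connection and build the key from the full set of connection options.

```go
package main

import "sync"

// Conn is a hypothetical stand-in for a real NATS connection.
type Conn interface {
	Close()
}

// Dialer opens a new connection for a key (assumption: one key per account).
type Dialer func(key string) (Conn, error)

type entry struct {
	conn Conn
	refs int // number of callers currently holding this connection
}

// Pool shares one connection per key and closes it once nobody uses it.
type Pool struct {
	mu    sync.Mutex
	dial  Dialer
	conns map[string]*entry
}

func NewPool(dial Dialer) *Pool {
	return &Pool{dial: dial, conns: make(map[string]*entry)}
}

// Acquire returns the shared connection for key, dialing only when the pool
// has none. The lock is held while dialing to keep the sketch simple.
func (p *Pool) Acquire(key string) (Conn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if e, ok := p.conns[key]; ok {
		e.refs++
		return e.conn, nil
	}
	c, err := p.dial(key)
	if err != nil {
		return nil, err
	}
	p.conns[key] = &entry{conn: c, refs: 1}
	return c, nil
}

// Release drops one reference and closes the connection when the last
// reference is gone. Releasing an unknown key is a no-op.
func (p *Pool) Release(key string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	e, ok := p.conns[key]
	if !ok {
		return
	}
	e.refs--
	if e.refs <= 0 {
		e.conn.Close()
		delete(p.conns, key)
	}
}

// Len reports how many distinct connections are currently open.
func (p *Pool) Len() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.conns)
}

// fakeConn is a test double that counts Close calls.
type fakeConn struct{ closed *int }

func (f fakeConn) Close() { *f.closed++ }
```

With a pool like this, each reconcile pass would call `Acquire` with the account's key and `Release` when done, so any number of Streams/Consumers pointing at the same account share a single connection.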
Hello team! We have been using NACK for a while against a NATS server located in Kubernetes, but today we started the migration to Synadia Cloud, as it will spare us the maintenance of the NATS cluster.
But when trying to integrate our current NACK CRDs (creating tens of subjects and consumers for 10 different accounts), we started to receive errors from our instance. After a bit of debugging, we realized the problem seems to be in how NACK handles the connections to the server.
Any suggestion or workaround, other than killing the NACK instance each time to reset the connection count, is appreciated!
What version were you using?
Running NACK v0.13.0
Using the following image: natsio/jetstream-controller:0.13.0
What environment was the server running in?
Using a Synadia Cloud instance to host the NATS server, where connections are limited to a certain amount per account.
Is this defect reproducible?
Yes, it is
Create a new account in Synadia Cloud (the free tier is enough). Then start a NACK instance connected to that account and try to create one stream and one consumer.
For the NACK creation
Then create the following resources for NACK to process
Once applied, in the logs from NACK I can see them correctly created
The same goes for Synadia UI, I can see them.
But the Connections count stays at 2 and never goes down (I waited for more than 1h and nothing changed).
Similarly, if the Stream/Consumer has any kind of typo in the spec, NACK keeps opening new connections without limit during the retries while trying to reconcile, which makes the Synadia account stop processing connections.
Given the capability you are leveraging, describe your expectation?
I would expect NACK to use only 1 connection to NATS for a set of resources that all point to the same account, and not to create a new connection each time the reconcile loop runs.
Given the expectation, what is the defect you are observing?
More connections than necessary are created in NACK and old connections are never closed.
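The defect described above can be illustrated with a small simulation. This is not NACK's code; `server`, `conn`, `reconcileLeaky`, and `reconcileClosing` are hypothetical names, and the "server" only counts open connections. The sketch contrasts a reconcile pass that opens a connection and never closes it with one that closes it before returning.

```go
package main

// server is a hypothetical stand-in that only tracks open connections.
type server struct {
	open int
}

// conn is a hypothetical connection handle tied to a server.
type conn struct {
	s      *server
	closed bool
}

// connect registers a new open connection on the server.
func (s *server) connect() *conn {
	s.open++
	return &conn{s: s}
}

// Close releases the connection; calling it twice has no further effect.
func (c *conn) Close() {
	if c.closed {
		return
	}
	c.closed = true
	c.s.open--
}

// reconcileLeaky mimics the reported symptom: every pass opens a new
// connection and returns without closing it.
func reconcileLeaky(s *server) {
	c := s.connect()
	_ = c // stream/consumer work would happen here
}

// reconcileClosing opens a connection per pass but always closes it,
// so the open count returns to its previous value after each pass.
func reconcileClosing(s *server) {
	c := s.connect()
	defer c.Close()
	// stream/consumer work would happen here
}
```

Running the leaky variant N times leaves N connections open on the simulated server, which matches the growth seen during retries; the closing variant leaves none.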