Open amit12cool opened 9 months ago
I think you may have to bump the default tls { timeout = X }
and authorization { timeout = X }
for your deployment, have you tried tuning those?
@wallyqs where can I bump this in the https://github.com/nats-io/k8s/blob/main/helm/charts/nats/values.yaml Helm chart?
Also, I didn't find a timeout option here: https://docs.nats.io/running-a-nats-service/configuration/securing_nats/authorization
@wallyqs with timeout = 50 in tls, the timeout error went away. But I now see route lookup errors and a TLS handshake error, like below:
[7] 2024/03/13 19:50:46.143685 [ERR] Error trying to connect to route (attempt 1): lookup for host "nats-1.nats-headless": lookup nats-1.nats-headless on 10.0.0.10:53: no such host
[7] 2024/03/13 19:50:46.143905 [ERR] Error trying to connect to route (attempt 1): lookup for host "nats-0.nats-headless": lookup nats-0.nats-headless on 10.0.0.10:53: no such host
[7] 2024/03/13 19:50:46.172442 [ERR] Error trying to connect to route (attempt 1): lookup for host "nats-0.nats-headless": lookup nats-0.nats-headless on 10.0.0.10:53: no such host
[7] 2024/03/13 19:50:46.173120 [ERR] Error trying to connect to route (attempt 1): lookup for host "nats-1.nats-headless": lookup nats-1.nats-headless on 10.0.0.10:53: no such host
[7] 2024/03/12 18:49:37.957961 [ERR] Error trying to connect to route (attempt 1): lookup for host "nats-0.nats-headless": lookup nats-0.nats-headless on 10.0.0.10:53: no such host
[7] 2024/03/12 17:40:23.567594 [ERR] 10.240.0.4:59217 - cid:44 - TLS handshake error: read tcp 10.244.2.14:4222->10.240.0.4:59217: read: connection reset by peer
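To separate the two kinds of errors in the log above, a small script can group them. This is only a sketch: the regular expressions are written against the lines quoted here, not against any documented nats-server log format, and `summarize` is a made-up helper name.

```python
import re
from collections import Counter

# Prefix of a log line as quoted above: "[pid] date time [LEVEL] message".
# Assumption: derived from the sample lines only, not from a documented format.
LINE_RE = re.compile(r"^\[\d+\] (\S+ \S+) \[(\w+)\] (.*)$")
# Host that failed DNS lookup while the server was dialing a route.
ROUTE_RE = re.compile(r'lookup for host "([^"]+)"')
# Remote peer of a connection that failed the TLS handshake.
HANDSHAKE_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+):\d+ - cid:\d+ - TLS handshake error")


def summarize(lines):
    """Count route DNS failures per host and TLS handshake errors per remote IP."""
    dns_failures = Counter()
    handshake_peers = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group(2) != "ERR":
            continue
        msg = m.group(3)
        route = ROUTE_RE.search(msg)
        if route:
            dns_failures[route.group(1)] += 1
            continue
        peer = HANDSHAKE_RE.match(msg)
        if peer:
            handshake_peers[peer.group(1)] += 1
    return dns_failures, handshake_peers
```

Feeding it the output of `kubectl logs` for each pod should show whether the handshake errors all come from one remote IP, and whether the lookup failures are limited to the short `*.nats-headless` names.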
Points to note:
kubectl exec -it nats-0 -- nslookup nats-1.nats-headless.nats-playground-3.svc.cluster.local
gives this output:
Server: 10.0.0.10
Address: 10.0.0.10:53
Name: nats-1.nats-headless.nats-playground-3.svc.cluster.local
Address: 10.244.3.198
So the name above resolves to the address `10.244.3.198` from outside the pod, but not from inside the pod.
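One way to narrow this down is to check which form of the name resolves, since the server log shows the short name `nats-1.nats-headless` failing while the nslookup used the fully qualified one. A minimal sketch, assuming Python is available in some debug container in the same namespace; `candidate_names` and `check_names` are hypothetical helpers and `cluster.local` is assumed to be the cluster domain:

```python
import socket


def candidate_names(host, namespace, cluster_domain="cluster.local"):
    """Return the short name plus progressively more qualified forms of it."""
    return [
        host,
        f"{host}.{namespace}",
        f"{host}.{namespace}.svc",
        f"{host}.{namespace}.svc.{cluster_domain}",
    ]


def check_names(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def main():
    # Host and namespace taken from the nslookup command in this thread.
    names = candidate_names("nats-1.nats-headless", "nats-playground-3")
    for name, addr in check_names(names).items():
        print(f"{name} -> {addr or 'no such host'}")
```

If only the longer forms resolve, that would point at the pod's DNS search path rather than at the headless service itself.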
- I have no client connected to the NATS server, i.e. to any of the NATS pods in k8s.
- `10.244.2.14:4222` is my NATS server pod IP, but I don't know whose IP `10.240.0.4:59217` is. Any idea who the NATS server is making this TCP connection with?

I have the following nats.conf:
{
  "cluster": {
    "name": "nats",
    "no_advertise": true,
    "port": 6222,
    "routes": [
      "nats://nats-0.nats-headless:6222",
      "nats://nats-1.nats-headless:6222",
      "nats://nats-2.nats-headless:6222"
    ]
  },
  "http_port": 8222,
  "jetstream": {
    "max_file_store": 10Gi,
    "max_memory_store": 0,
    "store_dir": "/data"
  },
  "lame_duck_duration": "30s",
  "lame_duck_grace_period": "10s",
  "pid_file": "/var/run/nats/nats.pid",
  "port": 4222,
  "server_name": $SERVER_NAME,
  "tls": {
    "ca_file": "/mnt/nats-certificate/rootCA-playground.crt",
    "cert_file": "/mnt/nats-certificate/nats-playground-server.crt",
    "key_file": "/mnt/nats-certificate/nats-playground-server.key",
    "timeout": 50,
    "verify": true
  }
}
Here is my helm chart
config:
  cluster:
    enabled: true
    replicas: 3
    port: 6222
  jetstream:
    enabled: true
    fileStore:
      pvc:
        size: 10Gi
  nats:
    tls:
      enabled: true
      merge: {
        verify: true,
        cert_file: '/mnt/nats-certificate/nats-playground-server.crt',
        key_file: '/mnt/nats-certificate/nats-playground-server.key',
        ca_file: '/mnt/nats-certificate/rootCA-playground.crt',
        timeout: 50
      }
podTemplate:
  topologySpreadConstraints:
    kubernetes.io/hostname:
      maxSkew: 1
      whenUnsatisfiable: DoNotSchedule
  patch:
service:
  merge:
    spec:
      type: LoadBalancer
promExporter:
  enabled: true
  podMonitor:
    enabled: true
container:
  image:
    repository: nats
    image: 2.10.11-alpine
  patch:
reloader:
  patch:
@neilalexander @wallyqs please check above. This is critical for me.
Observed behavior
I have a publisher that publishes messages to the NATS server. The publisher is an Azure Function that publishes at regular intervals.
I see errors in the NATS server containers, as below.
Expected behavior
No TLS errors should be there.
Server and client version
NATS version: 2.10.4-alpine
Client: nats-py==2.6.0
Host environment
Kubernetes (K8s)
Steps to reproduce
TLS auth
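For reference, a rough sketch of what a TLS publisher with nats-py could look like when the server has `verify: true`, which requires the client to present a certificate. The file paths, server URL, and subject are placeholders, not values from this deployment, and `build_tls_context` and `publish_once` are illustrative helper names:

```python
import asyncio
import ssl


def build_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context; all file paths are placeholders."""
    ctx = ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH)
    if ca_file:
        # Trust the private root CA that signed the server certificate.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file and key_file:
        # The server config sets `verify: true`, so a client certificate is needed.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


async def publish_once(url, subject, payload, ctx):
    """Connect over TLS, publish one message, then close the connection."""
    import nats  # nats-py; imported lazily so the helper above works without it

    nc = await nats.connect(servers=[url], tls=ctx)
    try:
        await nc.publish(subject, payload)
        await nc.flush()
    finally:
        await nc.close()


def main():
    # Hypothetical paths, URL, and subject: replace with real values.
    ctx = build_tls_context("rootCA.crt", "client.crt", "client.key")
    asyncio.run(publish_once("tls://nats.example.internal:4222", "demo.subject", b"hello", ctx))
```

A client that connects this way should not by itself produce `TLS handshake error ... connection reset by peer` lines, so comparing the log while only this publisher is running may help tell real client traffic apart from whatever else is connecting to port 4222.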