Closed — YoninL closed this issue 5 years ago
I noticed that the nats-operator pod restarted a few times:
nats-operator-bb6856fbd-tm2fl 0/1 CrashLoopBackOff 5 39m
The logs are:
time="2018-08-02T07:06:29Z" level=info msg="nats-operator Version: 0.2.3-v1alpha2+git"
time="2018-08-02T07:06:29Z" level=info msg="Git SHA: d88048a"
time="2018-08-02T07:06:29Z" level=info msg="Go Version: go1.9"
time="2018-08-02T07:06:29Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-08-02T07:06:29Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"nats-io\", Name:\"nats-operator\", UID:\"4677238c-961b-11e8-9e22-0e8d5bae827c\", APIVersion:\"v1\", ResourceVersion:\"47875736\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' nats-operator-bb6856fbd-tm2fl became leader"
time="2018-08-02T07:06:29Z" level=info msg="finding existing clusters..." pkg=controller
time="2018-08-02T07:06:29Z" level=info msg="starts running from watch version: 47875737" pkg=controller
time="2018-08-02T07:06:29Z" level=info msg="start running..." cluster-name=example-nats pkg=cluster
time="2018-08-02T07:06:29Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:07:45Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
time="2018-08-02T07:07:50Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:09:35Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
time="2018-08-02T07:09:40Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:11:34Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
time="2018-08-02T07:11:39Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:12:42Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
time="2018-08-02T07:12:47Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:13:58Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
time="2018-08-02T07:14:03Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:15:05Z" level=info msg="apiserver closed watch stream, retrying after 5s..." pkg=controller
E0802 07:15:05.734121 1 leaderelection.go:240] error retrieving resource lock nats-io/nats-operator: Get https://10.235.0.1:443/api/v1/namespaces/nats-io/endpoints/nats-operator: unexpected EOF
time="2018-08-02T07:15:10Z" level=info msg="start watching at 47875737" pkg=controller
time="2018-08-02T07:15:10Z" level=info msg="start watching at 47883860" pkg=controller
E0802 07:15:23.001413 1 leaderelection.go:274] Failed to update lock: Put https://10.235.0.1:443/api/v1/namespaces/nats-io/endpoints/nats-operator: read tcp 10.235.67.50:56802->10.235.0.1:443: read: connection reset by peer
time="2018-08-02T07:15:23Z" level=info msg="received signal: broken pipe"
Thanks, could you post some of the logs and the kubectl describe output from the nats-streaming-operator if possible? I'll take a look.
NATS Streaming Operator logs:
time="2018-08-02T06:33:49Z" level=info msg="Starting NATS Streaming Operator v0.1.0"
time="2018-08-02T06:33:49Z" level=info msg="Go Version: go1.9"
time="2018-08-02T06:33:49Z" level=info msg="operator-sdk Version: 0.0.5+git"
time="2018-08-02T06:33:49Z" level=info msg="Watching streaming.nats.io/v1alpha1, NatsStreamingCluster, nats-io, 5"
time="2018-08-02T06:34:06Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:11Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:16Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:21Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:26Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:31Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:36Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:41Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:46Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:51Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:34:56Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:35:01Z" level=info msg="Missing pods for \"example-stan\" cluster (1/3), creating 2 pods..."
time="2018-08-02T06:35:06Z" level=info msg="Missing pods for \"example-stan\" cluster (1/3), creating 2 pods..."
time="2018-08-02T06:38:07Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:38:11Z" level=info msg="Missing pods for \"example-stan\" cluster (1/3), creating 2 pods..."
time="2018-08-02T06:39:45Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:39:46Z" level=info msg="Missing pods for \"example-stan\" cluster (1/3), creating 2 pods..."
time="2018-08-02T06:41:41Z" level=info msg="Missing pods for \"example-stan\" cluster (1/3), creating 2 pods..."
time="2018-08-02T06:42:51Z" level=info msg="Missing pods for \"example-stan\" cluster (2/3), creating 1 pods..."
time="2018-08-02T06:48:46Z" level=info msg="Too many pods for \"example-stan\" cluster (3/2), removing 1 pods..."
time="2018-08-02T06:49:12Z" level=info msg="Too many pods for \"example-stan\" cluster (2/1), removing 1 pods..."
time="2018-08-02T06:49:26Z" level=info msg="Missing pods for \"example-stan\" cluster (1/2), creating 1 pods..."
time="2018-08-02T06:50:09Z" level=info msg="Missing pods for \"example-stan\" cluster (2/3), creating 1 pods..."
time="2018-08-02T06:50:26Z" level=info msg="Missing pods for \"example-stan\" cluster (2/3), creating 1 pods..."
time="2018-08-02T06:52:46Z" level=info msg="Missing pods for \"example-stan\" cluster (0/3), creating 3 pods..."
time="2018-08-02T06:52:53Z" level=info msg="Missing pods for \"example-stan\" cluster (2/3), creating 1 pods..."
time="2018-08-02T06:54:01Z" level=info msg="Missing pods for \"example-stan\" cluster (2/3), creating 1 pods..."
kubectl describe result:
Name: nats-streaming-operator-7698769f5f-psthg
Namespace: nats-io
Node: ip-10-250-81-178.ec2.internal/10.250.81.178
Start Time: Thu, 02 Aug 2018 06:33:47 +0000
Labels: name=nats-streaming-operator
pod-template-hash=3254325919
Annotations: cilium.io/identity=46229
Status: Running
IP: 10.235.70.74
Controlled By: ReplicaSet/nats-streaming-operator-7698769f5f
Containers:
nats-streaming-operator:
Container ID: docker://5d0c3ac2b35f23871394a7f3ad41661cce1ce2874d9c877dc218d9f8c6843121
Image: synadia/nats-streaming-operator:v0.1.0-v1alpha1
Image ID: docker-pullable://synadia/nats-streaming-operator@sha256:c514ca725db9a3f09b8a23b4d5ddcce298d8c81132088be8d1898d216442e93e
Port: <none>
State: Running
Started: Thu, 02 Aug 2018 06:33:49 +0000
Ready: True
Restart Count: 0
Environment:
MY_POD_NAMESPACE: nats-io (v1:metadata.namespace)
WATCH_NAMESPACE: nats-io (v1:metadata.namespace)
MY_POD_NAME: nats-streaming-operator-7698769f5f-psthg (v1:metadata.name)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from nats-streaming-operator-token-669hd (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
nats-streaming-operator-token-669hd:
Type: Secret (a volume populated by a Secret)
SecretName: nats-streaming-operator-token-669hd
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events: <none>
Kubernetes: v1.9.5
nats-operator: latest
nats-streaming-operator: latest
After starting the NATS cluster as described in the README, everything seems fine, but when I start 3 STAN pods using the command below:
none of the 3 pods can connect to the NATS cluster; the error message looks like:
Sometimes one pod has errors:
If I start the STAN pods one by one, sometimes the first one can connect, sometimes two of them.
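For anyone hitting the same symptom, one way to check whether the STAN pods can reach the NATS service at all is to probe the client port from inside the cluster. This is only a hedged diagnostic sketch: the service name "nats", the namespace "nats-io", and the use of the synadia/nats-box image are assumptions based on the example manifests, not confirmed details of this deployment.

```shell
# Launch a throwaway debug pod in the same namespace (synadia/nats-box is
# assumed here; any image with a shell and busybox nc would do).
kubectl run nats-debug --rm -i --tty \
  --namespace nats-io \
  --image=synadia/nats-box \
  --restart=Never -- /bin/sh

# Inside the pod: probe the assumed service DNS name on the default NATS
# client port 4222.
nc -zv nats.nats-io.svc.cluster.local 4222
```

If the DNS lookup or port probe fails intermittently, the operator restarts are more likely a symptom of general apiserver/network flakiness (matching the "connection reset by peer" and "unexpected EOF" lines in the logs above) than of a NATS misconfiguration.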