Open vishnuitta opened 4 years ago
Another observation: if I restart the nsmgr pod by deleting it (so that it comes up again), all three containers come to the Running state, but the pod gets into 'CrashLoopBackOff' state after some time.
Below are the logs of the nsmd-k8s container when it crashed the first time:
vitta@vitta-laptop:~/gocode/src/github.com/networkservicemesh/examples$ watch kubectl get pods -A --kubeconfig /home/vitta/eks_config
vitta@vitta-laptop:~/gocode/src/github.com/networkservicemesh/examples$ kubectl logs -f nsmgr-7hnkk -c nsmd-k8s -n nsm-system --kubeconfig /home/vitta/eks_config
time="2019-12-13T10:04:31Z" level=info msg="Starting nsmd-k8s..."
time="2019-12-13T10:04:31Z" level=info msg="Version: 75502fff"
time="2019-12-13T10:04:31Z" level=info msg="Creating logger from config: &{nsmd-k8s@nsmgr-7hnkk false false [] 0xc000213400 0xc0002356e0 <nil> <nil> <nil>}"
2019/12/13 10:04:31 Initializing logging reporter
time="2019-12-13T10:04:31Z" level=info msg="Starting NSMD Kubernetes on 0.0.0.0:5000 with NsmName ip-192-168-19-7.ap-south-1.compute.internal"
time="2019-12-13T10:04:31Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network services with: [{{NetworkService networkservicemesh.io/v1alpha1} {vl3-service default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservices/vl3-service 2b09ad3b-1b71-11ea-bd2e-0281e9d940ee 46526 1 2019-12-10 17:19:05 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {IP []} {}}]"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkService"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceEndpoint networkservicemesh.io/v1alpha1} {vl3-servicez7c4w vl3-service default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkserviceendpoints/vl3-servicez7c4w 2b0b22db-1b71-11ea-bd2e-0281e9d940ee 46527 1 2019-12-10 17:19:05 +0000 UTC <nil> <nil> map[app:vl3-nse-ucnf networkservicename:vl3-service] map[] [] nil [] []} {vl3-service IP ip-192-168-19-7.ap-south-1.compute.internal} {RUNNING}}]"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceEndpoint"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}]"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="RegistryCache started"
time="2019-12-13T10:04:31Z" level=info msg="nsmd-k8s initialized and waiting for connection"
time="2019-12-13T10:04:31Z" level=info msg="Start apply filter by namespace default for &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceManager"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=error msg="configmaps \"kubeadm-config\" not found"
time="2019-12-13T10:04:31Z" level=info msg="Start monitoring prefixes to exclude"
time="2019-12-13T10:04:31Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-13T10:04:31Z" level=info msg="Waiting for liveness probe: unix:/var/lib/networkservicemesh/plugins/registry.sock"
2019/12/13 10:04:31 Reporting span 4bd4d8aa00a613af:4bd4d8aa00a613af:0:1
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name vl3-mysql-master, subnet 10.100.49.144/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name kubernetes, subnet 10.100.0.1/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.49.144/32 to 10.100.0.0/18"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name kube-dns, subnet 10.100.0.10/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name nsm-admission-webhook-svc, subnet 10.100.70.222/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.0.0/18 to 10.100.0.0/17"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name nsmgr, subnet 10.100.48.148/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name pnsmgr-svc, subnet 10.100.126.223/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name jaeger, subnet 10.100.66.217/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name skydive-analyzer, subnet 10.100.129.110/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.0.0/17 to 10.100.0.0/16"
time="2019-12-13T10:04:31Z" level=info msg="Received RegisterNSM(url:\"192.168.16.12:30501\" )"
time="2019-12-13T10:04:31Z" level=info msg="CreateOrUpdateNSM attempt 0: "
time="2019-12-13T10:04:31Z" level=info msg="Updating existing NSM: &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}} with &{{ } {ip-192-168-19-7.ap-south-1.compute.internal 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31.883287342 +0000 UTC m=+0.103448872 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Start apply filter by namespace default for &{{ } {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Update from k8s-registry: *v1alpha1.NetworkServiceManager"
time="2019-12-13T10:04:31Z" level=info msg="Old NSM: &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="New NSM: &{{ } {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.resourceUpdated(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Update(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="RegisterNSM return: name:\"ip-192-168-19-7.ap-south-1.compute.internal\" url:\"192.168.16.12:30501\" last_seen:<seconds:1576231471 > state:\"RUNNING\" "
2019/12/13 10:04:31 Reporting span 15d4c6a2d8fcaf73:48201d993c2966ab:15d4c6a2d8fcaf73:1
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.resourceUpdated(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil [] []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="Received GetEndpoints"
time="2019-12-13T10:04:31Z" level=info msg="GetEndpoints return: [name:\"vl3-servicez7c4w\" payload:\"IP\" network_service_name:\"vl3-service\" network_service_manager_name:\"ip-192-168-19-7.ap-south-1.compute.internal\" labels:<key:\"app\" value:\"vl3-nse-ucnf\" > labels:<key:\"networkservicename\" value:\"vl3-service\" > state:\"RUNNING\" ]"
2019/12/13 10:04:31 Reporting span 5ec8d4aed2bbc52b:4fa6ab861d89f0e5:5ec8d4aed2bbc52b:1
time="2019-12-13T10:04:31Z" level=info msg="NSE found 1, retrieve time: 8.937µs"
time="2019-12-13T10:04:31Z" level=info msg="FindNetworkService done: time 55.132µs [vl3-servicez7c4w]"
2019/12/13 10:04:31 Reporting span 3118b516b25ec572:31bbfa7901e0af07:3118b516b25ec572:1
time="2019-12-13T10:04:31Z" level=info msg="NSE found 1, retrieve time: 4.621µs"
time="2019-12-13T10:04:31Z" level=info msg="FindNetworkService done: time 108.987µs [vl3-servicez7c4w]"
2019/12/13 10:04:31 Reporting span 3c9cd2127d8bdb88:69f28a424f8cebee:3c9cd2127d8bdb88:1
2019/12/13 10:04:31 Reporting span 604771d6673a63a1:6af6c3790a8e4433:1ede413ff19c1e84:1
panic: close of closed channel
goroutine 86 [running]:
github.com/networkservicemesh/networkservicemesh/k8s/pkg/prefixcollector.watchSubnet.func1(0xc000089b00, 0x16aa200, 0xc00007bf80, 0xc0000ac480, 0xc00003bb40, 0xc00003bb30, 0xc0002439b0, 0xc0000aa2b8)
/root/networkservicemesh/k8s/pkg/prefixcollector/collector_server.go:181 +0x686
created by github.com/networkservicemesh/networkservicemesh/k8s/pkg/prefixcollector.watchSubnet
/root/networkservicemesh/k8s/pkg/prefixcollector/collector_server.go:173 +0xf6
rpc error: code = Unknown desc = Error: No such container: cbdd8b937becd3067404484b650c525bf939c83fedc51a34c14b1c8b8f2a3319
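For anyone reading the trace above: `panic: close of closed channel` is what the Go runtime raises when `close` is called a second time on the same channel. Below is a minimal sketch of that failure mode and one common guard (`sync.Once`). It is not the actual `watchSubnet` code from `collector_server.go`; the `subnetWatcher` type and its `Stop` method are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// subnetWatcher is a hypothetical stand-in for a component that owns a channel.
// It is NOT the real NSM prefix collector.
type subnetWatcher struct {
	ch   chan string
	once sync.Once
}

func newSubnetWatcher() *subnetWatcher {
	return &subnetWatcher{ch: make(chan string, 1)}
}

// Stop closes the channel at most once, however many times it is called.
func (w *subnetWatcher) Stop() {
	w.once.Do(func() { close(w.ch) })
}

// closeTwice reproduces the failure: the second close panics.
// The panic is recovered so its message can be returned.
func closeTwice() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // panics here
	return "no panic"
}

func main() {
	fmt.Println("unguarded double close:", closeTwice())

	w := newSubnetWatcher()
	w.Stop()
	w.Stop() // safe: sync.Once skips the second close
	fmt.Println("guarded double Stop: no panic")
}
```

Whether a guard like this is the right fix for nsmd-k8s depends on why the channel is closed twice there, which the trace alone does not show.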
Hope this helps
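As a side note on the `Subnet extended from ... to ...` lines in the log: they are consistent with widening a prefix until it covers every service IP seen so far. The sketch below is an assumption about that behaviour, not the project's implementation; `commonSubnet` and `mustCIDR` are hypothetical helpers, and only IPv4 is handled.

```go
package main

import (
	"fmt"
	"net"
)

// mustCIDR parses a CIDR string and panics on malformed input (illustration only).
func mustCIDR(s string) *net.IPNet {
	_, n, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return n
}

// commonSubnet returns the smallest IPv4 network that contains both a and b.
// Hypothetical helper: it assumes both arguments are IPv4 networks.
func commonSubnet(a, b *net.IPNet) *net.IPNet {
	ipA, ipB := a.IP.To4(), b.IP.To4()
	onesA, _ := a.Mask.Size()
	onesB, _ := b.Mask.Size()
	limit := onesA
	if onesB < limit {
		limit = onesB
	}
	// Count leading bits shared by both addresses, up to the shorter prefix.
	prefix := 0
	for prefix < limit {
		byteIdx, bit := prefix/8, uint(7-prefix%8)
		if (ipA[byteIdx]>>bit)&1 != (ipB[byteIdx]>>bit)&1 {
			break
		}
		prefix++
	}
	mask := net.CIDRMask(prefix, 32)
	return &net.IPNet{IP: ipA.Mask(mask), Mask: mask}
}

func main() {
	// Service subnets in the order they appear in the log.
	services := []string{
		"10.100.49.144/32", "10.100.0.1/32", "10.100.0.10/32",
		"10.100.70.222/32", "10.100.48.148/32", "10.100.126.223/32",
		"10.100.66.217/32", "10.100.129.110/32",
	}
	current := mustCIDR(services[0])
	for _, s := range services[1:] {
		next := commonSubnet(current, mustCIDR(s))
		if next.String() != current.String() {
			fmt.Printf("Subnet extended from %s to %s\n", current, next)
		}
		current = next
	}
}
```

Fed the addresses from the log, this widens the prefix in the same three steps (/32 to /18, /18 to /17, /17 to /16).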
Are there any other steps I need to perform to try out the interdomain feature, @tiswanso?
Did you create the firewall rules for each public cloud cluster's VPC as described in this section: https://github.com/tiswanso/examples/tree/demo_vl3/examples/vl3_basic#public-cloud-setup ?
Regarding the nsmgr crashloop after a while ... I've also seen that but haven't done troubleshooting to get to the bottom of it. Been focusing on moving the virtual L3 to the latest master.
That makes sense to get the feature moving, @tiswanso. Regarding the steps given at public-cloud-setup, I hadn't followed them; I will do that. This may be the reason why the slave is NOT able to connect to the master even after restarting the nsmgr pod.
Getting this error while running `aws-start`:
$ make aws-start
~/gocode/src/github.com/networkservicemesh/networkservicemesh/scripts/aws ~/gocode/src/github.com/networkservicemesh/networkservicemesh
2019/12/14 05:48:13 Creating EKS service role "nsm-role"...
2019/12/14 05:48:15 Role "nsm-role"(arn:aws:iam::936055414837:role/nsm-role) successfully created!
2019/12/14 05:48:15 Creating Amazon EKS Cluster VPC "nsm-srv"...
2019/12/14 05:48:23 Error: Unexpected stack status: CREATE_FAILED
exit status 1
.mk/aws.mk:25: recipe for target 'aws-start' failed
make: *** [aws-start] Error 1
I will fix this and let you know.
BTW, a few questions:
@tiswanso Created clusters as per the steps given at public-cloud-setup. There is NO connectivity across pods. A few observations:
$ kubectl logs mysql-slave-b9744844f-dsjph --kubeconfig /home/vitta/.kube/gke-dev.kubeconfig -c nsm-init-container
time="2019-12-14T14:37:56Z" level=info msg="Starting nsm-init..."
time="2019-12-14T14:37:56Z" level=info msg="Version: 300a5cf3"
time="2019-12-14T14:37:56Z" level=info msg="nsmServerSocket: /var/lib/networkservicemesh/nsm.server.io.sock"
time="2019-12-14T14:37:56Z" level=info msg="nsmClientSocket: /var/lib/networkservicemesh/nsm.client.io.sock"
time="2019-12-14T14:37:56Z" level=info msg="workspace: /var/lib/networkservicemesh/"
time="2019-12-14T14:37:56Z" level=info msg="ADVERTISE_NSE_NAME not found."
time="2019-12-14T14:37:56Z" level=info msg="ADVERTISE_NSE_LABELS not found."
time="2019-12-14T14:37:56Z" level=info msg="OUTGOING_NSC_LABELS not found."
time="2019-12-14T14:37:56Z" level=info msg="TRACER_ENABLED not found."
time="2019-12-14T14:37:56Z" level=info msg="MECHANISM_TYPE not found."
time="2019-12-14T14:37:56Z" level=info msg="IP_ADDRESS not found."
time="2019-12-14T14:37:56Z" level=info msg="ROUTES not found."
time="2019-12-14T14:37:56Z" level=info msg="nsm: connection to nsm server on socket: /var/lib/networkservicemesh/nsm.server.io.sock succeeded."
time="2019-12-14T14:37:56Z" level=info msg="Initiating an outgoing connection." description="Primary interface" destEndpointManager= destEndpointName= mechanism=kernel mechanismName=nsm0 remoteIp=
time="2019-12-14T14:37:56Z" level=info msg="Selected mechanism: type:KERNEL_INTERFACE parameters:<key:\"description\" value:\"Primary interface\" > parameters:<key:\"name\" value:\"nsm0\" > parameters:<key:\"netnsInode\" value:\"4026532981\" > parameters:<key:\"socketfile\" value:\"nsm0/memif.sock\" > "
`vl3-service` got created, but it doesn't have any `route` in `matches`, as below:
apiVersion: v1
items:
diff --git a/examples/vl3_basic/helm/mysql-master/values.yaml b/examples/vl3_basic/helm/mysql-master/values.yaml
index b8dc4b4..5c7a994 100644
--- a/examples/vl3_basic/helm/mysql-master/values.yaml
+++ b/examples/vl3_basic/helm/mysql-master/values.yaml
@@ -19,8 +19,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "vl3-mysql-master"
+fullnameOverride: "vl3-mysql-master"

 service:
   type: ClusterIP
diff --git a/examples/vl3_basic/helm/mysql-slave/values.yaml b/examples/vl3_basic/helm/mysql-slave/values.yaml
index aed6c80..f16a940 100644
--- a/examples/vl3_basic/helm/mysql-slave/values.yaml
+++ b/examples/vl3_basic/helm/mysql-slave/values.yaml
@@ -20,8 +20,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-slave"
+fullnameOverride: "mysql-slave"

 service:
   type: ClusterIP
Can you please provide next steps to try for pod-to-pod connectivity across clouds?
I will also document the issues that I faced while running `make` for the setup, and the reasons for them.
And one more: I don't have the environment variable `NSM_AWS_SERVICE_SUFFIX` set during the setup `make`, so it's empty.
I'm not able to understand how the empty networkservice got created.
ok.. got to see that pe
function itself doing the deployment of networkservice as well
@tiswanso Tried a couple of times again. Almost the same state, except that `mysql-slave` comes to `Running` state, but the slave status shows as `Connecting`:
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Connecting to master
Master_Host: 172.31.239.1
Last_IO_Error: error connecting to master 'demo@172.31.239.1:3306' - retry-time: 60 retries: 144
Observations:
- The `vl3-service` networkservice CR gets deployed as empty. Re-applying the CR and restarting all NSM pods also didn't help.
diff --git a/examples/vl3_basic/helm/mysql-master/values.yaml b/examples/vl3_basic/helm/mysql-master/values.yaml
index b8dc4b4..1790994 100644
--- a/examples/vl3_basic/helm/mysql-master/values.yaml
+++ b/examples/vl3_basic/helm/mysql-master/values.yaml
@@ -19,8 +19,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-master"
+fullnameOverride: "vl3-mysql-master"

 service:
   type: ClusterIP
diff --git a/examples/vl3_basic/helm/mysql-slave/values.yaml b/examples/vl3_basic/helm/mysql-slave/values.yaml
index aed6c80..06bd712 100644
--- a/examples/vl3_basic/helm/mysql-slave/values.yaml
+++ b/examples/vl3_basic/helm/mysql-slave/values.yaml
@@ -20,8 +20,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-slave"
+fullnameOverride: "vl3-mysql-slave"

 service:
   type: ClusterIP
- Restarting the NSM and mysql-slave related pods moved mysql-slave to the `Init:0/1` state.
This is related to the issue at: https://github.com/networkservicemesh/networkservicemesh/issues/1972
Tried the `interdomain` feature as per the demo at NSMCon; nsmgr is in CrashLoopBackOff state:
nsm-system nsmgr-chcct 2/3 CrashLoopBackOff 208 18h
logs of nsmd-k8s container looks like:
Steps performed:
At this point, the mysql installations failed, probably due to variables in helm, but a container in nsmgr is also in CrashLoopBackOff state, and its logs are as shown above.