tiswanso / examples

Network Service Mesh examples repo
Apache License 2.0

nsmgr pod in CrashLoopBackOff state #1

Open vishnuitta opened 4 years ago

vishnuitta commented 4 years ago

This is related to the issue at: https://github.com/networkservicemesh/networkservicemesh/issues/1972

I tried the interdomain feature as per the demo at NSMCon, and nsmgr is in CrashLoopBackOff state:

nsm-system nsmgr-chcct 2/3 CrashLoopBackOff 208 18h

The logs of the nsmd-k8s container look like:

time="2019-12-11T14:24:08Z" level=info msg="Version: 75502fff"
time="2019-12-11T14:24:08Z" level=info msg="Creating logger from config: &{nsmd-k8s@nsmgr-chcct false false [] 0xc000388000 0xc00010e120 <nil> <nil> <nil>}"
2019/12/11 14:24:08 Initializing logging reporter
time="2019-12-11T14:24:08Z" level=info msg="Starting NSMD Kubernetes on 0.0.0.0:5000 with NsmName gke-vitta1-default-pool-7302dc57-81jg"
time="2019-12-11T14:24:08Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-11T14:24:08Z" level=info msg="Replacing Network services with: [{{NetworkService networkservicemesh.io/v1alpha1} {vl3-service  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservices/vl3-service 38162033-1b71-11ea-a92e-42010a800217 77050 1 2019-12-10 17:19:27 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {IP []} {}}]"
time="2019-12-11T14:24:08Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceEndpoint networkservicemesh.io/v1alpha1} {vl3-service7vjlx vl3-service default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkserviceendpoints/vl3-service7vjlx 381bb7c9-1b71-11ea-a92e-42010a800217 77051 1 2019-12-10 17:19:27 +0000 UTC <nil> <nil> map[app:vl3-nse-ucnf networkservicename:vl3-service] map[] [] nil []  []} {vl3-service IP gke-vitta1-default-pool-7302dc57-81jg} {RUNNING}}]"
time="2019-12-11T14:24:08Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkService"
time="2019-12-11T14:24:08Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceEndpoint"
time="2019-12-11T14:24:08Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceManager networkservicemesh.io/v1alpha1} {gke-vitta1-default-pool-7302dc57-81jg  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/gke-vitta1-default-pool-7302dc57-81jg fa2c4f36-1b70-11ea-a92e-42010a800217 76692 1 2019-12-10 17:17:43 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-10 17:17:43 +0000 UTC 10.44.0.10:30501 RUNNING}}]"
time="2019-12-11T14:24:08Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {gke-vitta1-default-pool-7302dc57-81jg  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/gke-vitta1-default-pool-7302dc57-81jg fa2c4f36-1b70-11ea-a92e-42010a800217 76692 1 2019-12-10 17:17:43 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-10 17:17:43 +0000 UTC 10.44.0.10:30501 RUNNING}})"
time="2019-12-11T14:24:08Z" level=info msg="RegistryCache started"
time="2019-12-11T14:24:08Z" level=info msg="nsmd-k8s initialized and waiting for connection"
time="2019-12-11T14:24:08Z" level=error msg="configmaps \"kubeadm-config\" not found"
time="2019-12-11T14:24:08Z" level=info msg="Start monitoring prefixes to exclude"
time="2019-12-11T14:24:08Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-11T14:24:08Z" level=info msg="Waiting for liveness probe: unix:/var/lib/networkservicemesh/plugins/registry.sock"
time="2019-12-11T14:24:08Z" level=info msg="Start apply filter by namespace default for &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {gke-vitta1-default-pool-7302dc57-81jg  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/gke-vitta1-default-pool-7302dc57-81jg fa2c4f36-1b70-11ea-a92e-42010a800217 76692 1 2019-12-10 17:17:43 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-10 17:17:43 +0000 UTC 10.44.0.10:30501 RUNNING}}"
time="2019-12-11T14:24:08Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceManager"
time="2019-12-11T14:24:08Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {gke-vitta1-default-pool-7302dc57-81jg  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/gke-vitta1-default-pool-7302dc57-81jg fa2c4f36-1b70-11ea-a92e-42010a800217 76692 1 2019-12-10 17:17:43 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-10 17:17:43 +0000 UTC 10.44.0.10:30501 RUNNING}})"
2019/12/11 14:24:08 Reporting span 68eecd9a07b3fa91:68eecd9a07b3fa91:0:1
time="2019-12-11T14:24:08Z" level=fatal msg="Failed to start K8s Plugin rpc error: code = Unknown desc = already have a plugin with the same name"

Steps performed:

  1. Created single-node clusters using the GKE and EKS consoles
  2. Obtained the kubeconfig of the above clusters
  3. Followed steps 1 and 2 of https://github.com/tiswanso/examples/tree/demo_vl3/examples/vl3_basic (so checked out tiswanso/networkservicemesh and tiswanso/examples at the branches specified in README.md)
  4. Then followed the steps of the Mysql Demo example on the same page

At this point, the mysql installations failed, probably due to variables in helm, but a container in nsmgr is also in CrashLoopBackOff state, and its logs are as shown above.

vishnuitta commented 4 years ago

Another observation: if I restart the nsmgr pod by deleting it (so that it comes up again), all three containers come to Running state, but the pod gets into CrashLoopBackOff state after some time.

vishnuitta commented 4 years ago

Below are the logs of the nsmd-k8s container when it crashed the first time:

vitta@vitta-laptop:~/gocode/src/github.com/networkservicemesh/examples$ watch kubectl get pods -A --kubeconfig /home/vitta/eks_config 
vitta@vitta-laptop:~/gocode/src/github.com/networkservicemesh/examples$ kubectl logs -f nsmgr-7hnkk -c nsmd-k8s -n nsm-system --kubeconfig /home/vitta/eks_config 
time="2019-12-13T10:04:31Z" level=info msg="Starting nsmd-k8s..."
time="2019-12-13T10:04:31Z" level=info msg="Version: 75502fff"
time="2019-12-13T10:04:31Z" level=info msg="Creating logger from config: &{nsmd-k8s@nsmgr-7hnkk false false [] 0xc000213400 0xc0002356e0 <nil> <nil> <nil>}"
2019/12/13 10:04:31 Initializing logging reporter
time="2019-12-13T10:04:31Z" level=info msg="Starting NSMD Kubernetes on 0.0.0.0:5000 with NsmName ip-192-168-19-7.ap-south-1.compute.internal"
time="2019-12-13T10:04:31Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network services with: [{{NetworkService networkservicemesh.io/v1alpha1} {vl3-service  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservices/vl3-service 2b09ad3b-1b71-11ea-bd2e-0281e9d940ee 46526 1 2019-12-10 17:19:05 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {IP []} {}}]"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkService"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceEndpoint networkservicemesh.io/v1alpha1} {vl3-servicez7c4w vl3-service default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkserviceendpoints/vl3-servicez7c4w 2b0b22db-1b71-11ea-bd2e-0281e9d940ee 46527 1 2019-12-10 17:19:05 +0000 UTC <nil> <nil> map[app:vl3-nse-ucnf networkservicename:vl3-service] map[] [] nil []  []} {vl3-service IP ip-192-168-19-7.ap-south-1.compute.internal} {RUNNING}}]"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceEndpoint"
time="2019-12-13T10:04:31Z" level=info msg="Replacing Network service endpoints with: [{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}]"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="RegistryCache started"
time="2019-12-13T10:04:31Z" level=info msg="nsmd-k8s initialized and waiting for connection"
time="2019-12-13T10:04:31Z" level=info msg="Start apply filter by namespace default for &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Add from k8s-registry: *v1alpha1.NetworkServiceManager"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Added(&{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=error msg="configmaps \"kubeadm-config\" not found"
time="2019-12-13T10:04:31Z" level=info msg="Start monitoring prefixes to exclude"
time="2019-12-13T10:04:31Z" level=info msg="GRPC.NewServer with open tracing enabled"
time="2019-12-13T10:04:31Z" level=info msg="Waiting for liveness probe: unix:/var/lib/networkservicemesh/plugins/registry.sock"
2019/12/13 10:04:31 Reporting span 4bd4d8aa00a613af:4bd4d8aa00a613af:0:1
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name vl3-mysql-master, subnet 10.100.49.144/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name kubernetes, subnet 10.100.0.1/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.49.144/32 to 10.100.0.0/18"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name kube-dns, subnet 10.100.0.10/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name nsm-admission-webhook-svc, subnet 10.100.70.222/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.0.0/18 to 10.100.0.0/17"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name nsmgr, subnet 10.100.48.148/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name pnsmgr-svc, subnet 10.100.126.223/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name jaeger, subnet 10.100.66.217/32"
time="2019-12-13T10:04:31Z" level=info msg="Receive resource: name skydive-analyzer, subnet 10.100.129.110/32"
time="2019-12-13T10:04:31Z" level=info msg="Subnet extended from 10.100.0.0/17 to 10.100.0.0/16"
time="2019-12-13T10:04:31Z" level=info msg="Received RegisterNSM(url:\"192.168.16.12:30501\" )"
time="2019-12-13T10:04:31Z" level=info msg="CreateOrUpdateNSM attempt 0: "
time="2019-12-13T10:04:31Z" level=info msg="Updating existing NSM: &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}} with &{{ } {ip-192-168-19-7.ap-south-1.compute.internal      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31.883287342 +0000 UTC m=+0.103448872 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Start apply filter by namespace default for &{{ } {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="Update from k8s-registry: *v1alpha1.NetworkServiceManager"
time="2019-12-13T10:04:31Z" level=info msg="Old NSM: &{{NetworkServiceManager networkservicemesh.io/v1alpha1} {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 353903 3 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 00:48:36 +0000 UTC 192.168.17.168:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="New NSM: &{{ } {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}}"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.resourceUpdated(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.Update(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="RegisterNSM return: name:\"ip-192-168-19-7.ap-south-1.compute.internal\" url:\"192.168.16.12:30501\" last_seen:<seconds:1576231471 > state:\"RUNNING\" "
2019/12/13 10:04:31 Reporting span 15d4c6a2d8fcaf73:48201d993c2966ab:15d4c6a2d8fcaf73:1
time="2019-12-13T10:04:31Z" level=info msg="NetworkServiceManagerCache.resourceUpdated(&{{ } {ip-192-168-19-7.ap-south-1.compute.internal  default /apis/networkservicemesh.io/v1alpha1/namespaces/default/networkservicemanagers/ip-192-168-19-7.ap-south-1.compute.internal 16eb6b04-1b71-11ea-bd2e-0281e9d940ee 405101 4 2019-12-10 17:18:31 +0000 UTC <nil> <nil> map[] map[] [] nil []  []} {} {2019-12-13 10:04:31 +0000 UTC 192.168.16.12:30501 RUNNING}})"
time="2019-12-13T10:04:31Z" level=info msg="Received GetEndpoints"
time="2019-12-13T10:04:31Z" level=info msg="GetEndpoints return: [name:\"vl3-servicez7c4w\" payload:\"IP\" network_service_name:\"vl3-service\" network_service_manager_name:\"ip-192-168-19-7.ap-south-1.compute.internal\" labels:<key:\"app\" value:\"vl3-nse-ucnf\" > labels:<key:\"networkservicename\" value:\"vl3-service\" > state:\"RUNNING\" ]"
2019/12/13 10:04:31 Reporting span 5ec8d4aed2bbc52b:4fa6ab861d89f0e5:5ec8d4aed2bbc52b:1
time="2019-12-13T10:04:31Z" level=info msg="NSE found 1, retrieve time: 8.937µs"
time="2019-12-13T10:04:31Z" level=info msg="FindNetworkService done: time 55.132µs [vl3-servicez7c4w]"
2019/12/13 10:04:31 Reporting span 3118b516b25ec572:31bbfa7901e0af07:3118b516b25ec572:1
time="2019-12-13T10:04:31Z" level=info msg="NSE found 1, retrieve time: 4.621µs"
time="2019-12-13T10:04:31Z" level=info msg="FindNetworkService done: time 108.987µs [vl3-servicez7c4w]"
2019/12/13 10:04:31 Reporting span 3c9cd2127d8bdb88:69f28a424f8cebee:3c9cd2127d8bdb88:1
2019/12/13 10:04:31 Reporting span 604771d6673a63a1:6af6c3790a8e4433:1ede413ff19c1e84:1
panic: close of closed channel

goroutine 86 [running]:
github.com/networkservicemesh/networkservicemesh/k8s/pkg/prefixcollector.watchSubnet.func1(0xc000089b00, 0x16aa200, 0xc00007bf80, 0xc0000ac480, 0xc00003bb40, 0xc00003bb30, 0xc0002439b0, 0xc0000aa2b8)
    /root/networkservicemesh/k8s/pkg/prefixcollector/collector_server.go:181 +0x686
created by github.com/networkservicemesh/networkservicemesh/k8s/pkg/prefixcollector.watchSubnet
    /root/networkservicemesh/k8s/pkg/prefixcollector/collector_server.go:173 +0xf6
rpc error: code = Unknown desc = Error: No such container: cbdd8b937becd3067404484b650c525bf939c83fedc51a34c14b1c8b8f2a3319

Hope this helps
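
The `Subnet extended from ... to ...` lines in the log above are consistent with a collector that widens one CIDR until it covers every service IP it receives. The following is a minimal sketch under that assumption; it is not taken from the NSM sources, and `widen` is a hypothetical helper that handles IPv4 only.

```go
package main

import (
	"fmt"
	"net"
)

// widen returns the smallest IPv4 CIDR that contains both inputs.
// Hypothetical helper for illustration; not the NSM implementation.
func widen(a, b string) (string, error) {
	_, na, err := net.ParseCIDR(a)
	if err != nil {
		return "", err
	}
	_, nb, err := net.ParseCIDR(b)
	if err != nil {
		return "", err
	}
	ipa, ipb := na.IP.To4(), nb.IP.To4()
	if ipa == nil || ipb == nil {
		return "", fmt.Errorf("only IPv4 is handled in this sketch")
	}
	// Start from the shorter of the two prefixes, then shorten it
	// one bit at a time until both network addresses agree.
	ones, _ := na.Mask.Size()
	if o, _ := nb.Mask.Size(); o < ones {
		ones = o
	}
	for ; ones >= 0; ones-- {
		mask := net.CIDRMask(ones, 32)
		if ipa.Mask(mask).Equal(ipb.Mask(mask)) {
			return (&net.IPNet{IP: ipa.Mask(mask), Mask: mask}).String(), nil
		}
	}
	return "", fmt.Errorf("no common prefix found")
}

func main() {
	// Service IPs in the order they appear in the log above.
	ips := []string{
		"10.100.49.144/32", "10.100.0.1/32", "10.100.0.10/32",
		"10.100.70.222/32", "10.100.48.148/32", "10.100.126.223/32",
		"10.100.66.217/32", "10.100.129.110/32",
	}
	cur := ips[0]
	for _, ip := range ips[1:] {
		next, err := widen(cur, ip)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		if next != cur {
			fmt.Printf("Subnet extended from %s to %s\n", cur, next)
		}
		cur = next
	}
}
```

Fed the eight service IPs from the log, the sketch reports the same three extensions (/32 to /18, /18 to /17, /17 to /16), which suggests the prefix collection itself behaved as expected before the panic.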

vishnuitta commented 4 years ago

Are there any other steps I need to do to try out the interdomain feature, @tiswanso?

tiswanso commented 4 years ago

Did you create the firewall rules for each public cloud cluster's VPC as described in this section: https://github.com/tiswanso/examples/tree/demo_vl3/examples/vl3_basic#public-cloud-setup ?

Regarding the nsmgr crashloop after a while ... I've also seen that but haven't done troubleshooting to get to the bottom of it. Been focusing on moving the virtual L3 to the latest master.
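
For reference, the `panic: close of closed channel` in the trace above is what the Go runtime raises when `close` is called a second time on the same channel. The sketch below shows the failure and one common guard, `sync.Once`. It is an illustration only: the `stopper` type and `closeTwice` function are hypothetical, and nothing here is taken from `collector_server.go` or claims to be the actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// stopper wraps a channel so that Stop may be called repeatedly,
// from any goroutine, without a "close of closed channel" panic.
// Hypothetical type for illustration; not taken from the NSM sources.
type stopper struct {
	once sync.Once
	done chan struct{}
}

func newStopper() *stopper {
	return &stopper{done: make(chan struct{})}
}

// Stop closes done at most once, however many times it is called.
func (s *stopper) Stop() {
	s.once.Do(func() { close(s.done) })
}

// closeTwice closes a plain channel twice and reports whether
// the second close panicked.
func closeTwice() (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	ch := make(chan struct{})
	close(ch)
	close(ch) // the second close panics
	return false
}

func main() {
	fmt.Println("plain double close panics:", closeTwice())

	s := newStopper()
	s.Stop()
	s.Stop() // safe: sync.Once ignores the repeated call
	<-s.done // returns immediately because done is closed
	fmt.Println("guarded double close is safe")
}
```

Whether a guard like this is appropriate for `watchSubnet` depends on why the channel is closed twice there, which the trace alone does not show.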

vishnuitta commented 4 years ago

That makes sense, to get the feature moving, @tiswanso. Regarding the steps given at public-cloud-setup, I hadn't followed them. I will do that. This may be the reason why the slave is NOT able to connect to the master even after restarting the nsmgr pod.

vishnuitta commented 4 years ago

Getting this error while running aws-start:

$ make aws-start 
~/gocode/src/github.com/networkservicemesh/networkservicemesh/scripts/aws ~/gocode/src/github.com/networkservicemesh/networkservicemesh
2019/12/14 05:48:13 Creating EKS service role "nsm-role"...
2019/12/14 05:48:15 Role "nsm-role"(arn:aws:iam::936055414837:role/nsm-role) successfully created!
2019/12/14 05:48:15 Creating Amazon EKS Cluster VPC "nsm-srv"...
2019/12/14 05:48:23 Error: Unexpected stack status: CREATE_FAILED
exit status 1
.mk/aws.mk:25: recipe for target 'aws-start' failed
make: *** [aws-start] Error 1

I will fix this and let you know.

BTW, a few questions:

vishnuitta commented 4 years ago

@tiswanso I created the clusters as per the steps given at public-cloud-setup. There is NO connectivity across pods. A few observations:

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "vl3-mysql-master"
+fullnameOverride: "vl3-mysql-master"
 
 service:
   type: ClusterIP
diff --git a/examples/vl3_basic/helm/mysql-slave/values.yaml b/examples/vl3_basic/helm/mysql-slave/values.yaml
index aed6c80..f16a940 100644
--- a/examples/vl3_basic/helm/mysql-slave/values.yaml
+++ b/examples/vl3_basic/helm/mysql-slave/values.yaml
@@ -20,8 +20,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent
 
-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-slave"
+fullnameOverride: "mysql-slave"
 
 service:
   type: ClusterIP



Can you please provide the next steps to try for pod-to-pod connectivity across clouds?

I will also document the issues that I faced while making the setup and the reasons for them.

vishnuitta commented 4 years ago

And one more: I don't have the NSM_AWS_SERVICE_SUFFIX environment variable set while making the setup, so it's empty.

I am not able to understand how an empty networkservice got created.

vishnuitta commented 4 years ago

OK, I got to see that the pe function itself does the deployment of the networkservice as well.

vishnuitta commented 4 years ago

@tiswanso I tried a couple of times again. Almost the same state, except that mysql-slave comes to Running state, but the slave status shows as Connecting:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: 172.31.239.1
                Last_IO_Error: error connecting to master 'demo@172.31.239.1:3306' - retry-time: 60  retries: 144
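
One way to separate a network problem from a MySQL credentials problem is to test a plain TCP connection to the master address from inside the slave pod. The sketch below is a minimal example of that check; `reachable` is a hypothetical helper, the default address is simply the `Master_Host` and port from the status output above, and it assumes a Go binary can be run in (or copied into) the pod.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// reachable tries to open a TCP connection to addr (host:port) and
// returns nil on success. timeoutMs is the dial timeout in milliseconds.
// Hypothetical helper for illustration.
func reachable(addr string, timeoutMs int) error {
	conn, err := net.DialTimeout("tcp", addr, time.Duration(timeoutMs)*time.Millisecond)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	// Default taken from the slave status above; override with argv[1].
	addr := "172.31.239.1:3306"
	if len(os.Args) > 1 {
		addr = os.Args[1]
	}
	if err := reachable(addr, 2000); err != nil {
		fmt.Println("cannot reach", addr, "-", err)
		return
	}
	fmt.Println("TCP connect to", addr, "succeeded")
}
```

If the dial fails with a timeout or "no route to host", the vL3 path between the clusters is the likely culprit; if it succeeds, the replication user or MySQL configuration is the more likely place to look.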

Observations:

-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-master"
+fullnameOverride: "vl3-mysql-master"
 
 service:
   type: ClusterIP
diff --git a/examples/vl3_basic/helm/mysql-slave/values.yaml b/examples/vl3_basic/helm/mysql-slave/values.yaml
index aed6c80..06bd712 100644
--- a/examples/vl3_basic/helm/mysql-slave/values.yaml
+++ b/examples/vl3_basic/helm/mysql-slave/values.yaml
@@ -20,8 +20,8 @@ image:
   tag: test
   pullPolicy: IfNotPresent
 
-nameOverride: ""
-fullnameOverride: ""
+nameOverride: "mysql-slave"
+fullnameOverride: "vl3-mysql-slave"
 
 service:
   type: ClusterIP


- Restarting the NSM and mysql-slave related pods put mysql-slave into the `Init:0/1` state