Closed pavanats closed 4 years ago
Hi, Can you please describe the steps you took when this error occurred? Was this while building the sample app? Thanks
Yes this was while deploying the sample app. I followed the instructions provided on the openness site.
I will try to reproduce this bug and I will get back to you.
Hi, I wasn't able to reproduce the error you describe. Are you following this guide?
Hi, Yes, I have been following the link you shared. I was able to move forward and got to the point of deploying the sampleApp. My issue now is that the producer and consumer pods are in the Pending state. On further investigation, I see that the edge node is not ready. I am still debugging this issue. Let me know if there are any steps I can try to make the edge node ready.
I have already tried restarting the controller and edge nodes, but to no avail. Pavan
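The "edge node not ready" check described above can be scripted. The following is a minimal sketch, assuming it runs on the controller where `kubectl` is configured; the function names `not_ready_nodes` and `fetch_node_list` are hypothetical and not part of OpenNESS. It reads the standard `status.conditions` list of each node from `kubectl get nodes -o json` and reports the nodes whose Ready condition is anything other than "True".

```python
import json
import subprocess


def not_ready_nodes(node_list):
    """Return {node_name: reason} for nodes whose Ready condition is not "True".

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    """
    result = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            result[name] = "no Ready condition reported"
        elif ready.get("status") != "True":
            # Status is "False" or "Unknown"; keep the reason and message for debugging.
            result[name] = f'{ready.get("reason", "")}: {ready.get("message", "")}'
    return result


def fetch_node_list():
    """Run kubectl and parse its JSON output (assumes kubectl is on PATH)."""
    out = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Usage on the controller:
#   for name, why in not_ready_nodes(fetch_node_list()).items():
#       print(f"{name} is not Ready ({why})")
```

A node that shows up here with an Unknown status usually means the control plane has stopped hearing from its kubelet, so the next place to look is the kubelet service on that node.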
From: tomaszwesolowski notifications@github.com Sent: Monday, July 20, 2020 6:33 PM To: open-ness/openness-experience-kits openness-experience-kits@noreply.github.com Cc: Pavan Gupta pavan.gupta@atsgen.com; Author author@noreply.github.com Subject: Re: [open-ness/openness-experience-kits] Error in deploying SampleApp (#37)
Hi, I wasn't able to reproduce error you describe. Are you following this guide (https://github.com/open-ness/specs/blob/master/doc/applications-onboard/network-edge-applications-onboarding.md#deploying-consumer-and-producer-sample-application)?
Can you provide the output of the command kubectl describe node node_name?
You can also delete the node from the cluster and redeploy it with ./deploy_ne.sh node.
Hi, Here's the output of 'kubectl describe node node01':
[root@controller ~]# kubectl describe node node01
Name: node01
Roles: worker
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
feature.node.kubernetes.io/cpu-cpuid.AESNI=true
feature.node.kubernetes.io/cpu-cpuid.AVX=true
feature.node.kubernetes.io/cpu-cpuid.IBPB=true
feature.node.kubernetes.io/cpu-cpuid.STIBP=true
feature.node.kubernetes.io/cpu-cpuid.VMX=true
feature.node.kubernetes.io/cpu-hardware_multithreading=true
feature.node.kubernetes.io/iommu-enabled=true
feature.node.kubernetes.io/kernel-config.NO_HZ=true
feature.node.kubernetes.io/kernel-config.NO_HZ_FULL=true
feature.node.kubernetes.io/kernel-config.PREEMPT=true
feature.node.kubernetes.io/kernel-version.full=3.10.0-1062.12.1.rt56.1042.el7.x86_64
feature.node.kubernetes.io/kernel-version.major=3
feature.node.kubernetes.io/kernel-version.minor=10
feature.node.kubernetes.io/kernel-version.revision=0
feature.node.kubernetes.io/memory-numa=true
feature.node.kubernetes.io/network-sriov.capable=true
feature.node.kubernetes.io/pci-0300_102b.present=true
feature.node.kubernetes.io/system-os_release.ID=centos
feature.node.kubernetes.io/system-os_release.VERSION_ID=7
feature.node.kubernetes.io/system-os_release.VERSION_ID.major=7
feature.node.kubernetes.io/system-os_release.VERSION_ID.minor=
kubernetes.io/arch=amd64
kubernetes.io/hostname=node01
kubernetes.io/os=linux
kubevirt.io/schedulable=true
node-role.kubernetes.io/worker=worker
Annotations: node.alpha.kubernetes.io/ttl: 0
ovn.kubernetes.io/cidr: 100.64.0.0/16
ovn.kubernetes.io/gateway: 100.64.0.1
ovn.kubernetes.io/ip_address: 100.64.0.3
ovn.kubernetes.io/logical_switch: join
ovn.kubernetes.io/mac_address: de:fd:f8:40:00:04
ovn.kubernetes.io/port_name: node-node01
CreationTimestamp: Thu, 16 Jul 2020 13:27:22 +0200
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: node01
AcquireTime:
MemoryPressure Unknown Fri, 17 Jul 2020 17:48:06 +0200 Fri, 17 Jul 2020 17:48:46 +0200 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Fri, 17 Jul 2020 17:48:06 +0200 Fri, 17 Jul 2020 17:48:46 +0200 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Fri, 17 Jul 2020 17:48:06 +0200 Fri, 17 Jul 2020 17:48:46 +0200 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Fri, 17 Jul 2020 17:48:06 +0200 Fri, 17 Jul 2020 17:48:46 +0200 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 134.119.205.185
Hostname: node01
Capacity:
cpu: 48
devices.kubevirt.io/kvm: 110
devices.kubevirt.io/tun: 110
devices.kubevirt.io/vhost-net: 110
ephemeral-storage: 51175Mi
hugepages-1Gi: 0
hugepages-2Mi: 4Gi
memory: 131886780Ki
pods: 110
Allocatable:
cpu: 47
devices.kubevirt.io/kvm: 110
devices.kubevirt.io/tun: 110
devices.kubevirt.io/vhost-net: 110
ephemeral-storage: 48294789041
hugepages-1Gi: 0
hugepages-2Mi: 4Gi
memory: 127590076Ki
pods: 110
System Info:
Machine ID: 238da3fd0f5b4a968758a13684c78869
System UUID: 00000000-0000-0000-0000-002590F53B8A
Boot ID: 7090118d-2c09-4262-9674-48cb8fe941b9
Kernel Version: 3.10.0-1062.12.1.rt56.1042.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://Unknown
Kubelet Version: v1.18.4
Kube-Proxy Version: v1.18.4
Non-terminated Pods: (26 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
cdi cdi-apiserver-885758cc4-cr4ss 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
cdi cdi-deployment-5bdcc85d54-4ns74 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
cdi cdi-operator-76b6694845-zl925 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d6h
cdi cdi-uploadproxy-89cf96777-qbfps 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kube-system descheduler-cronjob-1594999440-s7q6h 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d
kube-system kube-ovn-cni-7sms5 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d4h
kube-system kube-ovn-controller-96f89c68b-jbnxj 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kube-system kube-proxy-m9d8g 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d4h
kube-system ovs-ovn-vbj99 200m (0%) 1 (2%) 1Gi (0%) 1Gi (0%) 4d4h
kubevirt virt-api-f94f8b959-2cjrk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kubevirt virt-api-f94f8b959-g42xv 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kubevirt virt-controller-64766f7cbf-2xztk 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kubevirt virt-controller-64766f7cbf-k8t4c 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kubevirt virt-handler-6wcvn 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
kubevirt virt-operator-79c97797-jwdtf 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d6h
kubevirt virt-operator-79c97797-wxsxj 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d6h
openness eaa-6f8b94c9d7-nvbbs 100m (0%) 1 (2%) 128Mi (0%) 128Mi (0%) 4d6h
openness edgedns-kzslw 100m (0%) 1 (2%) 128Mi (0%) 128Mi (0%) 4d3h
openness interfaceservice-r42pj 100m (0%) 1 (2%) 128Mi (0%) 128Mi (0%) 4d3h
openness nfd-release-node-feature-discovery-worker-bnx9g 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
openness syslog-ng-l289n 100m (0%) 500m (1%) 128Mi (0%) 128Mi (0%) 4d3h
telemetry cadvisor-bp5ns 100m (0%) 1 (2%) 2Gi (1%) 2Gi (1%) 4d3h
telemetry collectd-dwttf 100m (0%) 1 (2%) 2Gi (1%) 2Gi (1%) 4d3h
telemetry otel-collector-7d5b75bbdf-8btkm 200m (0%) 1 (2%) 400Mi (0%) 2Gi (1%) 4d6h
telemetry prometheus-node-exporter-fcgnh 0 (0%) 0 (0%) 0 (0%) 0 (0%) 4d3h
telemetry telemetry-node-certs-qqdkh 100m (0%) 100m (0%) 128Mi (0%) 128Mi (0%) 4d3h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
cpu 1100m (2%) 7600m (16%)
memory 6160Mi (4%) 7808Mi (6%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 1Gi (25%) 1Gi (25%)
devices.kubevirt.io/kvm 0 0
devices.kubevirt.io/tun 0 0
devices.kubevirt.io/vhost-net 0 0
Events:
Everything looks fine here. Did you try deleting the node and deploying it again?
Hi, I am in the process of doing that. Will update you once done. Pavan
Hi, I finally managed to deploy both the controller and the network edge node on physical machines. For the network edge, I had to retry a few times and came across the following issues. In general, I think it would help to document the steps users can take when they hit common failures.
"ansible_loop_var": "item",
"changed": false,
"cmd": "set -o pipefail && kubectl logs -n kube-system $(kubectl get pods -n kube-system -o custom-columns=NAME:.metadata.name | grep ovn-central)\n",
"delta": "0:00:00.225693",
"end": "2020-07-21 10:46:18.626494",
"item": "ovn-central",
"rc": 1,
"start": "2020-07-21 10:46:18.400801"
}
STDERR:
Error from server (BadRequest): container "ovn-central" in pod "ovn-central-74986486f9-fvq5z" is waiting to start: ContainerCreating
================================================================================
"ansible_loop_var": "item",
"attempts": 20,
"changed": true,
"cmd": [
"ovn-nbctl",
"--may-exist",
"lsp-add",
"node01-local",
"node01-ovs-phy"
],
"delta": "0:00:00.010396",
"end": "2020-07-21 11:07:41.912971",
"item": "node01",
"rc": 1,
"start": "2020-07-21 11:07:41.902575"
}
STDERR:
ovn-nbctl: node01-local: switch name not found
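The first failure above comes from reading logs while the ovn-central container was still in ContainerCreating, and the second task had already used 20 attempts. A possible way to handle this kind of race in a helper script is to poll until the pod is running before moving on. The sketch below is only an illustration, not what the OpenNESS playbooks do; `wait_until` and `pod_is_running` are hypothetical names, and the pod object is assumed to be the parsed output of `kubectl get pod <name> -o json`.

```python
import time


def wait_until(check, attempts=20, delay=5.0, sleep=time.sleep):
    """Call check() up to `attempts` times; return True as soon as it succeeds.

    `sleep` is injectable so the helper can be exercised without real waiting.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay)  # No need to wait after the final attempt.
    return False


def pod_is_running(pod):
    """True when a parsed pod object reports phase "Running".

    A pod whose containers are still in ContainerCreating reports phase "Pending".
    """
    return pod.get("status", {}).get("phase") == "Running"
```

With a caller-supplied function that fetches the current pod object, `wait_until(lambda: pod_is_running(get_pod()))` returns False if the pod never starts, which lets the caller report a clear timeout instead of the BadRequest error from `kubectl logs`; `get_pod` here is a placeholder the caller would have to write.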
Hi, I was trying to deploy a SampleApp and came across the following error in the producer pod:
Reason: UnexpectedAdmissionError
Message: Pod Allocate failed due to failed to write checkpoint file "kubelet_internal_checkpoint": mkdir /var: file exists, which is unexpected