omec-project / sdcore-helm-charts

Helm charts used for SD-Core packaging

nrf and upf issue #34

Open Sivanesh1992 opened 3 weeks ago

Sivanesh1992 commented 3 weeks ago

Hi team,

I am able to deploy SD-Core in a Kubernetes cluster, but nrf and upf are not running. Please help to resolve the issue.

Logs are attached below.

```
sivanesh@sivanesh:sdcore-helm-charts$ helm install -n sdcore-5g --create-namespace -f values.yaml sdcore-5g ~/sdcore-helm-charts/sdcore-helm-charts
NAME: sdcore-5g
LAST DEPLOYED: Tue Jun 11 15:11:53 2024
NAMESPACE: sdcore-5g
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Notes - Instructions to use SD-Core application helm charts

sivanesh@sivanesh:sdcore-helm-charts$ helm -n sdcore-5g ls
NAME       NAMESPACE  REVISION  UPDATED                                  STATUS    CHART          APP VERSION
sdcore-5g  sdcore-5g  1         2024-06-11 15:11:53.622497827 +0530 IST  deployed  sd-core-1.0.4
```

```
sivanesh@sivanesh:sdcore-helm-charts$ kubectl get pods -n sdcore-5g
NAME                      READY  STATUS            RESTARTS     AGE
amf-84d4fdffc4-q8t9j      1/1    Running           0            16m
ausf-77d44557b7-5dq4c     1/1    Running           0            16m
init-net-pxvcn            1/1    Running           0            16m
nrf-7954f6b76b-5lkpw      0/1    CrashLoopBackOff  7 (94s ago)  16m
nssf-577cb97596-xjnmq     1/1    Running           0            16m
pcf-784fcff776-xdp72      1/1    Running           0            16m
simapp-77dd644df5-2qwqv   1/1    Running           0            16m
smf-99c65dddc-4kx8j       1/1    Running           0            16m
udm-bf79b4865-h6zv8       1/1    Running           0            16m
udr-86c84b99d5-d2kwx      1/1    Running           0            16m
upf-0                     0/5    Pending           0            16m
webui-7c87966f44-f4wt2    1/1    Running           0            16m
```

```
$ kubectl logs upf-0 --namespace=sdcore-5g
Defaulted container "bessd" out of: bessd, routectl, web, pfcp-agent, arping, bess-init (init)
```

nrf.log

gab-arrobo commented 3 weeks ago

The nrf error seems to be related to the 5g-control-plane dependencies. Are the Charts for mongodb and kafka downloaded/deployed? Regarding the upf, can you please share the describe for the upf and the log for the bessd and bess-init containers?
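To check whether those control-plane dependencies are actually present, commands along these lines can help (a sketch only; the namespace matches this thread, but the mongodb/kafka pod names depend on how the subcharts were deployed):

```shell
# Look for the mongodb and kafka pods the 5g-control-plane chart depends on
kubectl get pods -n sdcore-5g | grep -Ei 'mongo|kafka'

# The nrf log itself usually names the dependency it cannot reach
kubectl logs -n sdcore-5g deploy/nrf --tail=50
```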

Sivanesh1992 commented 2 weeks ago

Hi @gab-arrobo, thank you. Kindly find the upf describe output below.

```
sdcore-helm-charts$ kubectl describe pod upf-0 --namespace=sdcore-5g
Name:             upf-0
Namespace:        sdcore-5g
Priority:         0
Service Account:  default
Node:             <none>
Labels:           app=upf
                  apps.kubernetes.io/pod-index=0
                  controller-revision-hash=upf-7f8f55f48c
                  release=sdcore-5g
                  statefulset.kubernetes.io/pod-name=upf-0
Annotations:      k8s.v1.cni.cncf.io/networks:
                    [ { "name": "access-net", "interface": "access", "ips": ["192.168.252.3/24"] }, { "name": "core-net", "interface": "core", "ips": ["192.16...
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/upf
Init Containers:
  bess-init:
    Image:      omecproject/upf-epc-bess:rel-1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:    sh -xec
    Args:
      ip route replace 192.168.251.0/24 via 192.168.252.1;
      ip route replace default via 192.168.2.151 metric 110;
      iptables -I OUTPUT -p icmp --icmp-type port-unreachable -j DROP;
    Limits:
      cpu:     128m
      memory:  64Mi
    Requests:
      cpu:     128m
      memory:  64Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
Containers:
  bessd:
    Image:      omecproject/upf-epc-bess:rel-1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:    /bin/bash -xc
    Args:
      bessd -f --allow="$PCIDEVICE_INTEL_COM_INTEL_SRIOV_DPDK" --grpc_url=0.0.0.0:10514
    Limits:
      cpu:                         2
      hugepages-1Gi:               2Gi
      intel.com/intel_sriov_vfio:  2
      memory:                      1Gi
    Requests:
      cpu:                         2
      hugepages-1Gi:               2Gi
      intel.com/intel_sriov_vfio:  2
      memory:                      1Gi
    Liveness:  tcp-socket :10514 delay=15s timeout=1s period=20s #success=1 #failure=3
    Environment:
      CONF_FILE:  /etc/bess/conf/upf.jsonc
    Mounts:
      /dev/hugepages from hugepages (rw)
      /etc/bess/conf from configs (rw)
      /pod-share from shared-app (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
  routectl:
    Image:      omecproject/upf-epc-bess:rel-1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:    /opt/bess/bessctl/conf/route_control.py
    Args:       -i access core
    Limits:
      cpu:     256m
      memory:  128Mi
    Requests:
      cpu:     256m
      memory:  128Mi
    Environment:
      PYTHONUNBUFFERED:  1
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
  web:
    Image:      omecproject/upf-epc-bess:rel-1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:    /bin/bash -xc bessctl http 0.0.0.0 8000
    Limits:
      cpu:     256m
      memory:  128Mi
    Requests:
      cpu:     256m
      memory:  128Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
  pfcp-agent:
    Image:      omecproject/upf-epc-pfcpiface:rel-1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:    pfcpiface
    Args:       -config /tmp/conf/upf.jsonc
    Limits:
      cpu:     256m
      memory:  128Mi
    Requests:
      cpu:     256m
      memory:  128Mi
    Environment:  <none>
    Mounts:
      /pod-share from shared-app (rw)
      /tmp/conf from configs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
  arping:
    Image:      busybox:stable
    Port:       <none>
    Host Port:  <none>
    Command:    sh -xc
    Args:
      while true; do
        # arping does not work - BESS graph is still disconnected
        #arping -c 2 -I access 192.168.252.1
        #arping -c 2 -I core 192.168.2.151
        ping -c 2 192.168.252.1
        ping -c 2 192.168.2.151
        sleep 10
      done
    Limits:
      cpu:     128m
      memory:  64Mi
    Requests:
      cpu:     128m
      memory:  64Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gxgj8 (ro)
Conditions:
  Type          Status
  PodScheduled  False
Volumes:
  configs:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      upf
    Optional:  false
  shared-app:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  hugepages:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     HugePages
    SizeLimit:
  kube-api-access-gxgj8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  2m33s  default-scheduler  0/1 nodes are available: 1 Insufficient hugepages-1Gi, 1 Insufficient intel.com/intel_sriov_vfio. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
  Warning  FailedScheduling  2m32s  default-scheduler  0/1 nodes are available: 1 Insufficient hugepages-1Gi, 1 Insufficient intel.com/intel_sriov_vfio. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
```
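The FailedScheduling warnings say the single node advertises neither the 1Gi hugepages nor the `intel.com/intel_sriov_vfio` extended resource that the DPDK-mode UPF requests, so the pod can never schedule. A small sketch of reading that message offline, plus the standard kubectl query (an assumption about the reader's environment, not taken from this thread) to check what the node actually offers:

```shell
# The scheduler event names the extended resources the node lacks.
# Parsing the quoted message (offline, reproducible without a cluster):
msg='0/1 nodes are available: 1 Insufficient hugepages-1Gi, 1 Insufficient intel.com/intel_sriov_vfio.'
echo "$msg" | grep -o 'Insufficient [^,]*'
# prints: Insufficient hugepages-1Gi
#         Insufficient intel.com/intel_sriov_vfio.

# On the live cluster, compare against what the node actually advertises:
# kubectl get node -o jsonpath='{.items[0].status.allocatable}'
```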

Sivanesh1992 commented 2 weeks ago

I am not able to get the upf log; I am getting the output below. Please give me any other command and I will share the output.

```
$ kubectl logs upf-0 --namespace=sdcore-5g
Defaulted container "bessd" out of: bessd, routectl, web, pfcp-agent, arping, bess-init (init)
```
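The "Defaulted container" line just means no container was named, so `kubectl logs` picked `bessd`; the `-c` flag selects each container explicitly. Note that since upf-0 is still Pending (never scheduled), there may be no logs at all until scheduling succeeds. A sketch, using the pod and namespace from this thread:

```shell
# Fetch logs per container; bess-init is an init container but uses the same flag
kubectl logs upf-0 -n sdcore-5g -c bessd
kubectl logs upf-0 -n sdcore-5g -c bess-init
kubectl logs upf-0 -n sdcore-5g -c pfcp-agent
```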

gab-arrobo commented 2 weeks ago

Based on the warning message, it seems that you are trying to deploy the UPF in DPDK mode, but it looks like the SR-IOV resources are not created/available. Are you trying to deploy the UPF in DPDK mode or af_packet mode? If you want to deploy it in af_packet mode (the easiest way to deploy the UPF), you need to adjust your values.yaml file accordingly. You can see details in this file: https://github.com/opennetworkinglab/aether-5gc/blob/master/roles/core/templates/sdcore-5g-values.yaml#L295
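For orientation, an af_packet override in values.yaml looks roughly like the fragment below. This is a hypothetical sketch: the key names (`mode`, `hugepage.enabled`, `sriov.enabled`) are quoted from memory of the Aether values file linked above, not from this repository, and must be verified against the chart version in use.

```yaml
# Hypothetical sketch of omec-user-plane overrides for af_packet mode;
# verify every key against the linked sdcore-5g-values.yaml.
omec-user-plane:
  config:
    upf:
      mode: af_packet   # instead of dpdk; no SR-IOV/hugepage resources needed
      hugepage:
        enabled: false
      sriov:
        enabled: false
```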

gab-arrobo commented 1 week ago

@Sivanesh1992, is this still an issue? or can it be closed?

Sivanesh1992 commented 1 week ago

I am still facing the same issue.


gab-arrobo commented 1 week ago

Which issue is still there? nrf? upf? both?

Sivanesh1992 commented 1 week ago

Both nrf and upf are having issues; I am getting the same errors.

Sivanesh1992 commented 1 week ago

Below is the commit I am using:

```
sdcore-helm-charts$ git log
commit 4d4f545809fa02f89507d6dbdac218d46ba4fb60 (HEAD -> main, origin/main, origin/HEAD)
Author: gab-arrobo <gabriel.arrobo@intel.com>
Date:   Tue Jun 4 09:26:23 2024 -0700

    Update versions for Docker images and Helm Charts (#32)

commit eb5b92ecd140b5bfcfb11b7b2a6349766d75f471
Author: gab-arrobo <gabriel.arrobo@intel.com>
Date:   Fri May 17 09:22:48 2024 -0700

    Improved solution to pass PCI address to BESS when UPF is deployed in DPDK mode (#31)

    * Improved solution to pass PCI address to BESS when UPF is deployed in DPDK mode

    * Create a new version for the charts/changes to be published
```

Let me try with the latest code and let you know.

gab-arrobo commented 1 week ago

Use the latest charts and share the describe output and log for the nrf (after understanding what is happening with the nrf, we can move on to check the upf).

Sivanesh1992 commented 1 week ago

I cloned the latest code:

```
sdcore-helm-charts$ git log
commit 63be36614fdffcd593b28e606e6ff36a341cb0f0 (HEAD -> main, origin/main, origin/HEAD)
Author: sureshmarikkannu <115574144+sureshmarikkannu@users.noreply.github.com>
Date:   Mon Jun 24 21:33:11 2024 +0530

    Added support to configure DPDK/SRIOV mode for UPF (#38)

    * Added support to configure DPDK/SRIOV mode for UPF
```

Now amf, nrf, upf, and smf are getting errors:

```
$ kubectl get pods -n sdcore-5g
NAME                                  READY  STATUS                 RESTARTS     AGE
amf-57ffdf68d7-dpsgh                  0/1    ImagePullBackOff       0            5m37s
ausf-77d44557b7-kl22q                 1/1    Running                0            5m37s
init-net-hbzql                        1/1    Running                0            5m37s
kube-sriov-device-plugin-amd64-4qtx6  0/1    Init:ImagePullBackOff  0            5m37s
nrf-7954f6b76b-bcsst                  1/1    Running                5 (96s ago)  5m37s
nssf-577cb97596-hn8hd                 1/1    Running                0            5m37s
pcf-784fcff776-rbhrb                  1/1    Running                0            5m37s
simapp-77dd644df5-p7pg5               1/1    Running                0            5m37s
smf-7d9bdd45b8-xrfcq                  0/1    ErrImagePull           0            5m37s
udm-bf79b4865-gc2kx                   1/1    Running                0            5m37s
udr-86c84b99d5-88wxs                  1/1    Running                0            5m37s
upf-0                                 0/5    Pending                0            5m37s
webui-7c87966f44-52kxh                1/1    Running                0            5m37s
```

Sivanesh1992 commented 1 week ago

nrf describe log:

```
$ kubectl describe pods nrf-7954f6b76b-tkv64 -n sdcore-5g
Name:             nrf-7954f6b76b-tkv64
Namespace:        sdcore-5g
Priority:         0
Service Account:  nrf
Node:             minikube/192.168.49.2
Start Time:       Tue, 25 Jun 2024 11:10:54 +0530
Labels:           app=nrf
                  pod-template-hash=7954f6b76b
                  release=sdcore-5g
Annotations:      k8s.v1.cni.cncf.io/network-status:
                    [{ "name": "bridge", "interface": "eth0", "ips": [ "10.244.0.240" ], "mac": "de:5e:eb:bb:01:e4", "default": true, "dns": {}, "gateway": [ "10.244.0.1" ] }]
Status:           Running
IP:               10.244.0.240
IPs:
  IP:  10.244.0.240
Controlled By:  ReplicaSet/nrf-7954f6b76b
Containers:
  nrf:
    Container ID:   docker://a24fabf7214223db5390fbcf94c9ae2a162a29aef0e9e422304da03d6621c534
    Image:          omecproject/5gc-nrf:rel-1.4.1
    Image ID:       docker-pullable://omecproject/5gc-nrf@sha256:306f49b987981f0102acd8368919b493e375c1d7bd07dfe0d5e0aaf0373d43eb
    Port:           <none>
    Host Port:      <none>
    Command:        /free5gc/script/nrf-run.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    134
      Started:      Tue, 25 Jun 2024 11:36:37 +0530
      Finished:     Tue, 25 Jun 2024 11:37:07 +0530
    Ready:          False
    Restart Count:  9
    Environment:
      GRPC_GO_LOG_VERBOSITY_LEVEL:  99
      GRPC_GO_LOG_SEVERITY_LEVEL:   info
      GRPC_TRACE:                   all
      GRPC_VERBOSITY:               debug
      POD_IP:                        (v1:status.podIP)
      MANAGED_BY_CONFIG_POD:        true
    Mounts:
      /free5gc/config from nf-config (rw)
      /free5gc/script/nrf-run.sh from run-script (rw,path="nrf-run.sh")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gfmbg (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  run-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nrf
    Optional:  false
  nf-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nrf
    Optional:  false
  kube-api-access-gfmbg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age                  From               Message
  ----     ------          ----                 ----               -------
  Normal   Scheduled       28m                  default-scheduler  Successfully assigned sdcore-5g/nrf-7954f6b76b-tkv64 to minikube
  Normal   AddedInterface  28m                  multus             Add eth0 [10.244.0.240/16] from bridge
  Normal   Pulled          24m (x5 over 28m)    kubelet            Container image "omecproject/5gc-nrf:rel-1.4.1" already present on machine
  Normal   Created         24m (x5 over 28m)    kubelet            Created container nrf
  Normal   Started         24m (x5 over 28m)    kubelet            Started container nrf
  Warning  BackOff         3m1s (x96 over 27m)  kubelet            Back-off restarting failed container nrf in pod nrf-7954f6b76b-tkv64_sdcore-5g(4acb2c69-f8af-4915-801f-fbe2db669a6b)
```
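Exit code 134 is SIGABRT, so the actual abort message should be in the container's own output just before it died. The standard way to retrieve it for a CrashLoopBackOff pod is a sketch like this (pod name as in this thread; `--previous` shows the last terminated run):

```shell
# Logs of the container instance currently restarting
kubectl logs nrf-7954f6b76b-tkv64 -n sdcore-5g

# Logs of the previous (crashed) instance, usually where the abort reason is
kubectl logs nrf-7954f6b76b-tkv64 -n sdcore-5g --previous
```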

Sivanesh1992 commented 1 week ago

Please let me know if any other details are required.

Sivanesh1992 commented 1 week ago

I am getting the below amf and smf logs:

```
sivanesh@sivanesh:sdcore-helm-charts$ kubectl logs amf-57ffdf68d7-xz5lh --namespace=sdcore-5g
Error from server (BadRequest): container "amf" in pod "amf-57ffdf68d7-xz5lh" is waiting to start: trying and failing to pull image

sivanesh@sivanesh:sdcore-helm-charts$ kubectl logs smf-7d9bdd45b8-np5z6 --namespace=sdcore-5g
Defaulted container "smf" out of: smf, wait-smf-module (init)
Error from server (BadRequest): container "smf" in pod "smf-7d9bdd45b8-np5z6" is waiting to start: trying and failing to pull image
```
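For ImagePullBackOff/ErrImagePull, the pod events carry the exact image reference that failed and the registry error, which `kubectl logs` cannot show. A sketch using the pod names from this thread:

```shell
# The Events section names the image and the pull error (not found, auth, etc.)
kubectl describe pod amf-57ffdf68d7-xz5lh -n sdcore-5g | grep -iA2 'fail'

# Print the image references the pod spec asks for, to compare against the registry
kubectl get pod smf-7d9bdd45b8-np5z6 -n sdcore-5g \
  -o jsonpath='{.spec.containers[*].image}{"\n"}'
```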