Closed Puneet1726 closed 1 week ago
How are you deploying the SD-Core? Are you directly using the Helm Charts (e.g., helm install ...)? Are you overriding the values.yaml file(s)? Can you please take a look at issue #34 and see if this is related?
I had a look at issue #34 before raising this one. This seems to be a different issue. I am deploying using Helm with custom values and the pod is getting stuck in the "Pod Initializing" stage. Only the UPF has an issue; the other components are running fine.
I am running on an Ubuntu VM and have enabled host network on the pod.
I can send the values file if required.
Yes, please share the values file you are using to override.
Can you please share the output of kubectl -n <namespace> describe pod upf-0?
Attached are the pod details: poddetails.txt
I am going to try to reproduce the issue later today/tonight using the values file you provided
BTW, have you tried to deploy the UPF using the default amount of resources (as shown below)? That is, does the machine where you are trying to deploy the UPF have at least 10 cores? Also, why do you need to assign 10 cores if only 1 worker is configured in the upf.jsonc file/setting?
- cpu: 10
- memory: 10Gi
+ cpu: 2
+ memory: 2Gi
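As a quick sanity check for this kind of sizing mismatch, the requested cpu value can be compared against what the node actually offers. A minimal sketch (the node_cpus helper is my own; it assumes kubectl access to the cluster and falls back to the local core count otherwise):

```shell
# Sketch: report how many CPUs the cluster node (or this machine) can offer,
# to compare against the chart's cpu request.
node_cpus() {
    if command -v kubectl >/dev/null 2>&1; then
        # allocatable CPU per node, space-separated
        kubectl get nodes -o jsonpath='{.items[*].status.allocatable.cpu}' 2>/dev/null && return
    fi
    # no working kubectl here: fall back to the local core count
    getconf _NPROCESSORS_ONLN
}
echo "allocatable/local CPUs: $(node_cpus)"
```

If the request (10 cores here) exceeds the allocatable value, the pod stays Pending rather than Initializing, so this only rules one failure mode out.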
Hi @Puneet1726,
I just tried deploying the SD-Core using your values.txt as a reference and everything seems to be working fine. As you can see below, all pods get deployed as expected.
Here are a few comments for your reference:
- I used your values.txt (which I renamed to sdcore-override.yaml just for reference) and modified the values in the bess-upf/values.yaml as shown in the diff below.
- I disabled nodeSelectors because it is not relevant in this test.
- You ran helm dep up before installing, correct? I installed with: user@ubuntu22:~/sdcore-helm-charts$ helm install omec ./sdcore-helm-charts/ -n omec
- When you enable nodeSelectors, make sure the "tag" is properly set on the specific K8s node where you are expecting to deploy the UPF.
diff --git a/sdcore-override.yaml b/sdcore-override.yaml
index 235be86..e779fee 100644
--- a/sdcore-override.yaml
+++ b/sdcore-override.yaml
@@ -1,8 +1,8 @@
images:
- repository: #default docker hub
+ repository: "" #default docker hub
tags:
- bess: omec/upf-epc-bess:rel-1.4.1
- pfcpiface: omec/upf-epc-pfcpiface:rel-1.4.1
+ bess: omecproject/upf-epc-bess:rel-1.4.1
+ pfcpiface: omecproject/upf-epc-pfcpiface:rel-1.4.1
tools: busybox:stable
pullPolicy: IfNotPresent
# Secrets must be manually created in the namespace.
@@ -10,7 +10,7 @@ images:
# - name: aether.registry
nodeSelectors:
- enabled: true
+ enabled: false
upf:
label: kubernetes.io/hostname
value: k8s-upf
@@ -67,30 +67,30 @@ config:
# Dynamic IP allocation is not supported yet
# Custom routes inside UPF
routes:
- - to: 10.100.2.75/32
- via: 10.100.0.1
+ - to: 10.154.48.197
+ via: 169.254.1.1
enb:
- subnet: 10.100.0.0/16
+ subnet: 192.168.251.0/24
access:
ipam: static
cniPlugin: macvlan
# Provide sriov resource name when sriov is enabled
#resourceName: "intel.com/intel_sriov_vfio"
- gateway: 10.100.0.1
- ip: 10.100.2.55/24
+ gateway: 192.168.252.1
+ ip: 192.168.252.3/24
#mac:
#vlan:
- iface: enp1s0
+ iface: ens3
core:
ipam: static
cniPlugin: macvlan
# Provide sriov resource name when sriov is enabled
#resourceName: "intel.com/intel_sriov_vfio"
- gateway: 10.100.0.1
- ip: 10.100.3.55/24
+ gateway: 192.168.250.1
+ ip: 192.168.250.3/24
#mac:
#vlan:
- iface: enp1s0
+ iface: ens3
cfgFiles:
upf.jsonc:
mode: af_packet
I was just playing around with CPU cores. It was 2Gi only.
I tried with helm dep up; it is not working. I am facing the issue only with the UPF; the remaining components are up and running.
I just wanted to know whether any kernel module needs to be loaded for it to work. Also, are there any specific IP ranges we need to use? I have Weave as the network plugin for my Kubernetes cluster.
Can you help me with the basic network configuration required on the node?
I am trying to deploy all the components on a one-node cluster using minikube.
The 5G control plane components were deployed successfully, but BESS was failing with the error below.
helm install -n omec -f values.yaml upf .
Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: [resource mapping not found for name: "access-net" namespace: "" from "": no matches for kind "NetworkAttachmentDefinition" in version "k8s.cni.cncf.io/v1" ensure CRDs are installed first, resource mapping not found for name: "core-net" namespace: "" from "": no matches for kind "NetworkAttachmentDefinition" in version "k8s.cni.cncf.io/v1" ensure CRDs are installed first]
I have deployed the Multus CNI plugin and helm install was fine after that.
https://github.com/k8snetworkplumbingwg/multus-cni
But the pod is failing while initializing.
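For reference, the "no matches for kind NetworkAttachmentDefinition" failure can be checked up front, since that CRD is installed by Multus. A minimal sketch (the function name is mine; it assumes kubectl access where a cluster is reachable):

```shell
# Sketch: verify the CRD required by the chart's access-net/core-net
# NetworkAttachmentDefinition objects before running `helm install`.
check_nad_crd() {
    crd="network-attachment-definitions.k8s.cni.cncf.io"
    if ! command -v kubectl >/dev/null 2>&1; then
        echo "kubectl not available; run this on the cluster node"
    elif kubectl get crd "$crd" >/dev/null 2>&1; then
        echo "CRD present: $crd"
    else
        echo "CRD missing: install Multus before helm install"
    fi
}
check_nad_crd
```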
QoS Class: Guaranteed
Node-Selectors:
Normal Scheduled 22s default-scheduler Successfully assigned omec/upf-0 to minikube
Normal AddedInterface 20s multus Add eth0 [10.244.0.133/16] from bridge
Warning FailedCreatePodSandBox 19s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "0218e8cc68598c5ed4db6f92bb14a6ede19fd7c1e6887792c031bcfcd8718982" network for pod "upf-0": networkPlugin cni failed to set up pod "upf-0_omec" network: plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"0218e8cc68598c5ed4db6f92bb14a6ede19fd7c1e6887792c031bcfcd8718982" Netns:"/proc/97390/ns/net" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=omec;K8S_POD_NAME=upf-0;K8S_POD_INFRA_CONTAINER_ID=0218e8cc68598c5ed4db6f92bb14a6ede19fd7c1e6887792c031bcfcd8718982" Path:"" ERRORED: error configuring pod [omec/upf-0] networking: [omec/upf-0/:access-net]: error adding container to network "access-net": Link not found ': StdinData: {"capabilities":{"portMappings":true},"clusterNetwork":"/host/etc/cni/net.d/1-k8s.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","runtimeConfig":{"portMappings":[]},"type":"multus-shim"}
Normal AddedInterface 18s multus Add eth0 [10.244.0.134/16] from bridge
My pod IP address range is 10.244.0.0 and my values are below:
enb:
  subnet: 10.244.0.0/16
access:
  ipam: static
  cniPlugin: macvlan
  # Provide sriov resource name when sriov is enabled
  #resourceName: "intel.com/intel_sriov_vfio"
  gateway: 10.244.0.1
  ip: 10.244.0.100
  #mac:
  #vlan:
  iface: enp1s0
core:
  ipam: static
  cniPlugin: macvlan
  # Provide sriov resource name when sriov is enabled
  #resourceName: "intel.com/intel_sriov_vfio"
  gateway: 10.244.0.1
  ip: 10.244.0.134
Am I missing any configuration? Sorry, I might be asking some basic questions as I am new to this setup.
Can you please share the output of kubectl -n <your-namespace> logs upf-0 -c bess-init? Also, I understand that interface enp1s0 exists in your system, correct?
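The "Link not found" error from macvlan in the events above usually means exactly this: the master interface named under iface: in the values file does not exist on the node. A minimal sketch for verifying it, reading /sys/class/net directly so no extra tools are needed (the helper name is mine; enp1s0 is the value from this thread, adjust as needed):

```shell
# Sketch: check that the macvlan master interface from values.yaml exists
# on this node; macvlan fails with "Link not found" otherwise.
check_iface() {
    if [ -d "/sys/class/net/$1" ]; then
        echo "interface $1 exists"
    else
        echo "interface $1 not found; fix 'iface:' in values.yaml"
    fi
}
check_iface "${IFACE:-enp1s0}"
```

Note this must be run on the node where the UPF pod is scheduled (inside the minikube VM here), not on the host driving kubectl.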
Can you use Aether-OnRamp to deploy the Kubernetes cluster (make aether-k8s-install)? That is, follow the instructions from this link all the way to the "deploy kubernetes" section. After that, try to deploy the SD-Core using helm (as you have been trying to do: helm install ...).
Looks like we do not need to install Multus CNI. But if I do helm install, it will fail for the missing CRD. I am able to run the init container now, but the pod still shows Initializing.
I checked the bessd container log from the node.
Log:
2024-08-26T10:39:46.539919461Z stdout F + bessd -m 0 -f --allow= --grpc_url=0.0.0.0:10514
2024-08-26T10:39:46.676834969Z stdout F I0826 10:39:46.676641 17 main.cc:64] Launching BESS daemon in process mode...
2024-08-26T10:39:46.676850261Z stdout F I0826 10:39:46.676705 17 main.cc:77] bessd v1.0.0-dirty
2024-08-26T10:39:46.679592579Z stdout F I0826 10:39:46.679409 17 bessd.cc:458] Loading plugin (attempt 1): /usr/bin/modules/sequential_update.so
2024-08-26T10:39:46.683026537Z stdout F I0826 10:39:46.682772 17 dpdk.cc:187] Initializing DPDK EAL with options: ["bessd", "--main-lcore", "127", "--lcore", "127@0-15", "--no-shconf", "--legacy-mem", "--no-huge", "-m", "512"]
2024-08-26T10:39:46.693036999Z stdout F EAL: Detected CPU lcores: 16
2024-08-26T10:39:46.693054559Z stdout F EAL: Detected NUMA nodes: 1
2024-08-26T10:39:46.693057689Z stdout F EAL: Detected static linkage of DPDK
2024-08-26T10:39:46.696361322Z stdout F EAL: Selected IOVA mode 'VA'
2024-08-26T10:39:46.696380183Z stdout F EAL: VFIO support initialized
2024-08-26T10:39:46.802555434Z stdout F EAL: Probe PCI driver: net_virtio (1af4:1041) device: 0000:01:00.0 (socket -1)
2024-08-26T10:39:46.802572266Z stdout F eth_virtio_pci_init(): Failed to init PCI device
2024-08-26T10:39:46.802575375Z stdout F EAL: Requested device 0000:01:00.0 cannot be used
2024-08-26T10:39:46.802577346Z stdout F TELEMETRY: No legacy callbacks, legacy socket not created
2024-08-26T10:39:46.802579975Z stdout F Segment 0-0: IOVA:0x100633000, len:4096, virt:0x100633000, socket_id:0
... Assuming a single-node system...
2024-08-26T10:39:47.309223742Z stdout F W0826 10:39:47.309146 17 packet_pool.cc:49] Hugepage is disabled! Creating PlainPacketPool for 262144 packets on node 0
2024-08-26T10:39:47.309227832Z stdout F I0826 10:39:47.309160 17 packet_pool.cc:74] PacketPool0 requests for 262144 packets
2024-08-26T10:39:47.625036882Z stdout F I0826 10:39:47.624863 17 packet_pool.cc:161] PacketPool0 has been created with 262144 packets
2024-08-26T10:39:47.625310462Z stdout F I0826 10:39:47.625192 17 pmd.cc:74] 0 DPDK PMD ports have been recognized:
2024-08-26T10:39:47.625315654Z stdout F I0826 10:39:47.625232 17 vport.cc:320] vport: BESS kernel module is not loaded. Loading...
2024-08-26T10:39:47.626518205Z stdout F sh: 1: insmod: not found
2024-08-26T10:39:47.626596558Z stdout F W0826 10:39:47.626513 17 vport.cc:332] Cannot load kernel module /usr/bin/kmod/bess.ko
2024-08-26T10:39:47.627237405Z stdout F I0826 10:39:47.627128 17 bessctl.cc:1928] Server listening on 0.0.0.0:10514
I have attached the latest values file: values-1.txt
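Regarding the log above: the "Hugepage is disabled!" and bess.ko warnings are worth distinguishing from hard failures, since the log ends with bessd listening on 0.0.0.0:10514. To confirm the node's hugepage state independently, a minimal sketch (the helper name is mine; it reads /proc/meminfo on the node):

```shell
# Sketch: report the node's hugepage configuration, matching the
# "Hugepage is disabled!" warning in the bessd log above.
hugepages_total() {
    awk '/^HugePages_Total:/ {print $2}' /proc/meminfo
}
total=$(hugepages_total)
echo "HugePages_Total on this node: ${total:-unknown}"
```

A value of 0 (or no line at all) means no hugepages are reserved, which is consistent with bessd falling back to the PlainPacketPool seen in the log.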
Based on the UPF's log and describe output, it looks like you are making changes to bess and/or the PFCP agent (upf) and are building "local" images, correct? Or are you just using a local "mirror" of the images from DockerHub?
It is the same image from Docker Hub; I have uploaded it to our registry. I am only modifying the values file and using the same tag as Docker Hub, so I am not sure what the issue is.
Do you see any issue in the log or is it just a warning?
If I do helm install, it fails saying the CRD is not present, but there is already a network attachment as part of the bess-upf templates… I am doing a workaround to bypass the CRD issue.
Can you please provide a git diff between the first values file and the second/last values file you provided? I want to see the difference. As I mentioned before, I deployed the UPF using the initial/first values file you provided by properly adjusting certain parameters to match with my system. Besides that, I do not see any problem. BTW, trying to help in this way will be challenging for me because I do not have the details/specifics of what exactly you are doing. I think it is better to have a live debugging session. Can you join the Slack channel (use this link: https://aetherproject.org/contact-us/)?
Submitted the form
Can you please provide your email address?
Email: naikpuneet@gmail.com
Hi @gab-arrobo , Please send the slack details
I was told that an email invite to join Slack was sent to you yesterday. Did you not receive it?
@gab-arrobo: As discussed, I have set up a new one-node cluster and deployed the UPF using Helm. The UPF is installed now. Thanks for all the support.
BTW, I see you are not using the latest Helm Charts. I strongly recommend using the latest version, as it includes several improvements such as new Docker images and NRF caching enabled in some NFs.
Also, should I close this issue? Or feel free to close it yourself.
@gab-arrobo: I will pull the latest Helm chart and merge it into our repo. I will close this once we perform a sanity test on the cluster.
@Puneet1726, any update on this?
We can close this issue.
Hi Team,
I am deploying the UPF to my dev cluster, but the bessd container is not starting properly and the pod stays in PodInitializing status forever.
I can see the log via crictl logs on the node.
But the pod is not reporting the correct status. Do I need to configure anything else?
Thanks in advance