omec-project / upf

4G/5G Mobile Core User Plane

Bess On AWS EKS #569

Status: Closed (infinitydon closed 2 years ago)

infinitydon commented 2 years ago

Hi,

I am trying to install the BESS UPF on AWS EKS, but the bessd container logs show some errors:

[ec2-user@ip-172-31-18-255 omec-user-plane]$ kubectl -n bess-upf logs upf-0 -c bessd | more
+ bessd -f -grpc-url=0.0.0.0:10514
I0311 13:12:14.615412    16 main.cc:62] Launching BESS daemon in process mode...
I0311 13:12:14.615460    16 main.cc:75] bessd unknown
I0311 13:12:14.616827    16 bessd.cc:456] Loading plugin (attempt 1): /bin/modules/sequential_update.so
I0311 13:12:14.618448    16 memory.cc:128] /sys/devices/system/node/possible not available. Assuming a single-node system...
I0311 13:12:14.618467    16 dpdk.cc:167] Initializing DPDK EAL with options: ["bessd", "--master-lcore", "127", "--lcore", "127@0,18", "--no-shconf", "--legacy-mem", "--socket-mem", "1024", "--huge-unlink"]
EAL: Detected 36 lcore(s)
EAL: Detected 1 NUMA nodes
Option --master-lcore is deprecated use main-lcore
EAL: Detected static linkage of DPDK
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   using IOMMU type 8 (No-IOMMU)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:06.0 (socket 0)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:07.0 (socket 0)
EAL: Probe PCI driver: net_ena (1d0f:ec20) device: 0000:00:08.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Segment 0-0: IOVA:0x16c0000000, len:1073741824, virt:0x140000000, socket_id:0, hugepage_sz:1073741824, nchannel:0, nrank:0 fd:9
I0311 13:12:16.472007    16 packet_pool.cc:49] Creating DpdkPacketPool for 262144 packets on node 0
I0311 13:12:16.472023    16 packet_pool.cc:70] PacketPool0 requests for 262144 packets
I0311 13:12:16.532382    16 packet_pool.cc:157] PacketPool0 has been created with 262144 packets
I0311 13:12:16.532553    16 pmd.cc:68] 3 DPDK PMD ports have been recognized:
I0311 13:12:16.532567    16 pmd.cc:92] DPDK port_id 0 (net_ena)   RXQ 32 TXQ 32  02:95:ae:ac:2c:1e  00000000:00:06.00 1d0f:ec20   numa_node 0
I0311 13:12:16.532573    16 pmd.cc:92] DPDK port_id 1 (net_ena)   RXQ 32 TXQ 32  02:43:a1:7d:29:fa  00000000:00:07.00 1d0f:ec20   numa_node 0
I0311 13:12:16.532579    16 pmd.cc:92] DPDK port_id 2 (net_ena)   RXQ 32 TXQ 32  02:c9:b6:9a:59:0a  00000000:00:08.00 1d0f:ec20   numa_node 0
I0311 13:12:16.532617    16 vport.cc:318] vport: BESS kernel module is not loaded. Loading...
sh: 1: insmod: not found
W0311 13:12:16.533284    16 vport.cc:330] Cannot load kernel module /bin/kmod/bess.ko
I0311 13:12:16.533310    16 bessctl.cc:1931] Server listening on 0.0.0.0:10514
I0311 13:12:17.323052    60 bessctl.cc:487] *** All workers have been paused ***
I0311 13:12:17.691282    78 worker.cc:319] Worker 0(0x7f87561fa400) is running on core 0 (socket 0)
I0311 13:12:17.699735    77 pmd.cc:392] Initializing Port:0 with memory from socket 0
I0311 13:12:17.807314   102 pmd.cc:200] port id: 3matches vdev: net_af_packet0,iface=access-vdev
W0311 13:12:17.807351   102 pmd.cc:389] Invalid socket, falling back...
I0311 13:12:17.807375   102 pmd.cc:392] Initializing Port:3 with memory from socket 0
I0311 13:12:17.853832   140 pmd.cc:392] Initializing Port:2 with memory from socket 0
I0311 13:12:17.963251   164 pmd.cc:200] port id: 4matches vdev: net_af_packet1,iface=core-vdev
W0311 13:12:17.963286   164 pmd.cc:389] Invalid socket, falling back...
I0311 13:12:17.963306   164 pmd.cc:392] Initializing Port:4 with memory from socket 0
I0311 13:12:18.299638   369 bessctl.cc:691] Checking scheduling constraints
E0311 13:12:18.299729   369 module.cc:224] Mismatch in number of workers for module accessMerge min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299739   369 module.cc:224] Mismatch in number of workers for module accessSrcEther min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299752   369 module.cc:224] Mismatch in number of workers for module access_measure min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299758   369 module.cc:224] Mismatch in number of workers for module coreMerge min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299770   369 module.cc:224] Mismatch in number of workers for module coreSrcEther min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299777   369 module.cc:224] Mismatch in number of workers for module core_measure min required 1 max allowed 64 attached workers 0
W0311 13:12:18.300586   370 metadata.cc:77] Metadata attr timestamp/8 of module access_measure has no upstream module that sets the value!
W0311 13:12:18.300602   370 metadata.cc:77] Metadata attr timestamp/8 of module core_measure has no upstream module that sets the value!
I0311 13:12:18.300768   370 bessctl.cc:516] *** Resuming ***
I0311 13:12:23.603725   424 bessctl.cc:487] *** All workers have been paused ***

Particularly:

I0311 13:12:18.299638   369 bessctl.cc:691] Checking scheduling constraints
E0311 13:12:18.299729   369 module.cc:224] Mismatch in number of workers for module accessMerge min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299739   369 module.cc:224] Mismatch in number of workers for module accessSrcEther min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299752   369 module.cc:224] Mismatch in number of workers for module access_measure min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299758   369 module.cc:224] Mismatch in number of workers for module coreMerge min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299770   369 module.cc:224] Mismatch in number of workers for module coreSrcEther min required 1 max allowed 64 attached workers 0
E0311 13:12:18.299777   369 module.cc:224] Mismatch in number of workers for module core_measure min required 1 max allowed 64 attached workers 0
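
For anyone triaging a similar log, the mismatch errors can be pulled out with a short script. This is only a sketch: the regular expression is derived from the message format in the log above, and the function name `find_worker_mismatches` is made up for illustration, not part of BESS or the UPF tooling.

```python
import re

# Message format copied from the bessd log lines quoted in this issue.
MISMATCH = re.compile(
    r"Mismatch in number of workers for module (?P<module>\S+) "
    r"min required (?P<min>\d+) max allowed (?P<max>\d+) "
    r"attached workers (?P<attached>\d+)"
)


def find_worker_mismatches(log_text):
    """Return one dict per mismatch message, in the order they appear."""
    results = []
    for m in MISMATCH.finditer(log_text):
        results.append({
            "module": m.group("module"),
            "min": int(m.group("min")),
            "max": int(m.group("max")),
            "attached": int(m.group("attached")),
        })
    return results


# Two lines taken from the log above.
sample = (
    "E0311 13:12:18.299729   369 module.cc:224] Mismatch in number of workers "
    "for module accessMerge min required 1 max allowed 64 attached workers 0\n"
    "E0311 13:12:18.299777   369 module.cc:224] Mismatch in number of workers "
    "for module core_measure min required 1 max allowed 64 attached workers 0\n"
)

for entry in find_worker_mismatches(sample):
    print(entry["module"], "attached workers:", entry["attached"])
```

Because the script uses `finditer` over the whole text, it also works when several messages end up on one line, for example after a copy and paste.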

This is the upf.json config:

{
    "": "Vdev or sim support. Enable `\"mode\": \"af_xdp\"` to enable AF_XDP mode, or `\"mode\": \"af_packet\"` to enable AF_PACKET mode, or `\"mode\": \"sim\"` to generate synthetic traffic from BESS's Source module",
    "": "mode: af_xdp",
    "": "mode: af_packet",
    "": "mode: sim",

    "table_sizes": {
        "": "Example sizes based on sim mode and 50K sessions. Customize as per your control plane",
        "": "50K per unique tuple, we send 4 unique PDR patterns",
        "pdrLookup": 50000,
        "": "4 PDRs per session",
        "flowMeasure": 200000,
        "": "there are 2 QERs and 2 entries per QER",
        "appQERLookup": 200000,
        "": "there is 1 session QER and 2 entries per session QER",
        "sessionQERLookup": 100000,
        "": "there are 3 FARs",
        "farLookup": 150000
    },

    "": "Set the log level to one of \"panic\", \"fatal\", \"error\", \"warning\", \"info\", \"debug\", \"trace\"",
    "log_level": "info",

    "": "Use the sim block to enable simulation using either Source module or via il_trafficgen",
    "sim": {
        "": "At this point we can simulate either N3/N6 or N3/N9 traffic, so choose n6 or n9 below",
        "core": "n6",
        "max_sessions": 50000,
        "start_ue_ip": "16.0.0.1",
        "start_enb_ip": "11.1.1.129",
        "start_aupf_ip": "13.1.1.199",
        "n6_app_ip": "6.6.6.6",
        "n9_app_ip": "9.9.9.9",
        "start_n3_teid": "0x30000000",
        "start_n9_teid": "0x90000000",
        "pkt_size": 128,
        "total_flows": 5000
    },

    "": "max IP frag table entries (for IPv4 reassembly). Update the line below to `\"max_ip_defrag_flows\": 1000` to enable",
    "": "max_ip_defrag_flows: 1000",

    "": "Update the line below to `\"ip_frag_with_eth_mtu\": 1518` to enable",
    "": "ip_frag_with_eth_mtu: 1518",

    "": "Enable hardware offload of checksum. Might disable vector PMD",
    "hwcksum": false,

    "": "Enable PDU Session Container extension",
    "gtppsc": false,

    "": "Enable Intel Dynamic Device Personalization (DDP)",
    "ddp": false,

    "": "Telemetrics-See this link for details: https://github.com/NetSys/bess/blob/master/bessctl/module_tests/timestamp.py",
    "measure_upf": true,

    "": "Whether to enable flow measurement feature",
    "measure_flow": false,

    "": "Gateway interfaces",
    "access": {
        "ifname": "ens803f2"
    },

    "": "UE IP Natting. Update the line below to `\"ip_masquerade\": \"<ip> [or <ip>]\"` to enable",
    "core": {
        "ifname": "ens803f3",
        "": "ip_masquerade: 18.0.0.1 or 18.0.0.2 or 18.0.0.3"
    },

    "": "Number of worker threads. Default: 1",
    "workers": 1,

    "": "Parameters for handling outgoing requests",
    "max_req_retries": 5,
    "resp_timeout": "2s",

    "": "Whether to enable Network Token Functions",
    "enable_ntf": false,

    "": "Whether to enable End Marker Support",
    "": "enable_end_marker: false",

    "": "Whether to enable Notify BESS feature",
    "": "enable_notify_bess: false",

    "": "Whether to enable P4Runtime feature",
    "enable_p4rt": false,
    "" : "conn_timeout: 1000",
    "" : "read_timeout: 25",
    "" : "notify_sockaddr: /tmp/notifycp",
    "" : "endmarker_sockaddr: /tmp/pfcpport",

    "": "Whether to enable UPF HeartBeatTimer feature",
    "enable_hbTimer": false,
    "": "heart_beat_interval: 5s",

    "qci_qos_config": [
        {
            "": "Default values for QERs with QCI/QFI not listed below",
            "qci": 0,
            "cbs": 50000,
            "ebs": 50000,
            "pbs": 50000,
            "burst_duration_ms": 10,
            "priority": 7
        },
        {
            "qci": 9,
            "cbs": 2048,
            "ebs": 2048,
            "pbs": 2048,
            "priority": 6
        },
        {
            "qci": 8,
            "cbs": 2048,
            "ebs": 2048,
            "pbs": 2048,
            "priority": 5
        }
    ],

    "": "Optional slice-wide meter rate limits",
    "slice_rate_limit_config": {
        "": "uplink policer",
        "n6_bps": 500000000,
        "n6_burst_bytes": 625000,
        "": "downlink policer",
        "n3_bps": 500000000,
        "n3_burst_bytes": 625000
    },

    "": "Control plane controller settings",
    "cpiface": {
        "peers": ["148.162.12.214"],
        "dnn": "internet",
        "http_port": "8080",
        "enable_ue_ip_alloc": false,
        "ue_ip_pool": "10.250.0.0/16",
        "" : "use_fqdn: true",
        "" : "hostname: upf-0"
    },

    "": "p4rtc interface settings",
    "p4rtciface": {
        "access_ip": "172.17.0.1/32",
        "p4rtc_server": "onos",
        "p4rtc_port": "51001",
        "ue_ip_pool": "10.250.0.0/24"
    }
}
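
Since the config keeps its comments and disabled options under repeated `""` keys, it can be hard to see which settings are actually active. The sketch below separates the two using Python's standard `json` module; `load_upf_config` is a hypothetical helper name, and the sample is a shortened copy of the config above rather than the full file.

```python
import json


def load_upf_config(text):
    """Parse upf.json-style text and return (settings, notes).

    settings: nested dict with the "" comment keys removed.
    notes: every value stored under a "" key, collected as each object is parsed.
    """
    notes = []

    def hook(pairs):
        # Called by json.loads for every JSON object, with its key/value pairs.
        obj = {}
        for key, value in pairs:
            if key == "":
                notes.append(value)
            else:
                obj[key] = value
        return obj

    settings = json.loads(text, object_pairs_hook=hook)
    return settings, notes


# Shortened copy of the config posted above.
sample = """
{
    "": "mode: af_xdp",
    "": "mode: af_packet",
    "": "mode: sim",
    "access": {"ifname": "ens803f2"},
    "core": {
        "ifname": "ens803f3",
        "": "ip_masquerade: 18.0.0.1 or 18.0.0.2 or 18.0.0.3"
    },
    "workers": 1
}
"""

settings, notes = load_upf_config(sample)
print("active mode key present:", "mode" in settings)
print("access ifname:", settings["access"]["ifname"])
print("core ifname:", settings["core"]["ifname"])
print("workers:", settings["workers"])
```

Run against the config in this issue, a loader like this would report no active `mode` key, because all three `mode` lines sit under `""` keys. What the UPF does when `mode` is absent is not stated in this thread.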

POD status:

[ec2-user@ip-172-31-18-255 omec-user-plane]$ kubectl -n bess-upf get po
NAME    READY   STATUS    RESTARTS   AGE
upf-0   5/5     Running   0          11h
[ec2-user@ip-172-31-18-255 omec-user-plane]$

[ec2-user@ip-172-31-18-255 omec-user-plane]$ kubectl -n bess-upf get sts -o wide
NAME   READY   AGE   CONTAINERS                             IMAGES
upf    1/1     11h   bessd,routectl,web,pfcp-agent,arping   registry.aetherproject.org/proxy/omecproject/upf-epc-bess:master-ada6849,registry.aetherproject.org/proxy/omecproject/upf-epc-bess:master-ada6849,registry.aetherproject.org/proxy/omecproject/upf-epc-bess:master-ada6849,registry.aetherproject.org/proxy/omecproject/upf-epc-pfcpiface:master-ada6849,registry.aetherproject.org/tools/busybox:stable

infinitydon commented 2 years ago

Could it be that the UPF is not detecting the right lcore to initialize?
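
To compare what bessd passed to DPDK with where the worker actually ran, the relevant values can be extracted from the log. This is a rough sketch that only reads the two log formats shown earlier in this issue; the helper names are hypothetical, and the script does not interpret the lcore mapping itself.

```python
import json
import re


def eal_options(log_text):
    """Return the EAL option list printed by bessd, or [] if the line is missing."""
    m = re.search(r"Initializing DPDK EAL with options: (\[.*?\])", log_text)
    # The list is printed in a JSON-compatible form in the log above.
    return json.loads(m.group(1)) if m else []


def option_value(options, flag):
    """Return the token that follows `flag` in the option list, or None."""
    for i, token in enumerate(options[:-1]):
        if token == flag:
            return options[i + 1]
    return None


def worker_cores(log_text):
    """Return (worker_id, core) pairs from 'Worker N(...) is running on core M' lines."""
    pattern = r"Worker (\d+)\(0x[0-9a-f]+\) is running on core (\d+)"
    return [(int(w), int(c)) for w, c in re.findall(pattern, log_text)]


# Two lines taken from the bessd log in this issue.
sample = (
    'I0311 13:12:14.618467    16 dpdk.cc:167] Initializing DPDK EAL with options: '
    '["bessd", "--master-lcore", "127", "--lcore", "127@0,18", "--no-shconf", '
    '"--legacy-mem", "--socket-mem", "1024", "--huge-unlink"]\n'
    "I0311 13:12:17.691282    78 worker.cc:319] Worker 0(0x7f87561fa400) "
    "is running on core 0 (socket 0)\n"
)

opts = eal_options(sample)
print("--master-lcore:", option_value(opts, "--master-lcore"))
print("--lcore:", option_value(opts, "--lcore"))
print("workers:", worker_cores(sample))
```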

github-actions[bot] commented 2 years ago

This issue has been stale for 30 days and will be closed in 5 days. Comment to keep it open.