edgecomllc / eupf

5G User Plane Function (UPF) based on eBPF
Apache License 2.0
99 stars 18 forks source link

bind: cannot assign requested address #560

Open svinson1121 opened 2 months ago

svinson1121 commented 2 months ago

Hello,

I'm running into a strange issue when using XDPAttachMode:native with the PFCP and N3 on the same network interface.

oot@upf3p:~# eupf --config /etc/eupf/config.yaml 2024/07/10 18:31:05 Startup config: map[api_address::8080 echo_interval:10 far_map_size:1024 feature_ftup:true feature_ueip:true gtp_peer:[] heartbeat_interval:5 heartbeat_retries:10 heartbeat_timeout:10 interface_name:[enp4s0f0 enp4s0f1] logging_level:debug metrics_address::9090 n3_address:10.90.250.28 pdr_map_size:1024 pfcp_address:10.90.250.28:8805 pfcp_node_id:10.90.250.28 qer_map_size:1024 resize_ebpf_maps:false teid_pool:65536 ueip_pool:10.45.0.0/24 xdp_attach_mode:native] 2024/07/10 18:31:05 Apply eUPF config: {InterfaceName:[enp4s0f0 enp4s0f1] XDPAttachMode:native ApiAddress::8080 PfcpAddress:10.90.250.28:8805 PfcpNodeId:10.90.250.28 MetricsAddress::9090 N3Address:10.90.250.28 GtpPeer:[] EchoInterval:10 QerMapSize:1024 FarMapSize:1024 PdrMapSize:1024 EbpfMapResize:false HeartbeatRetries:10 HeartbeatInterval:5 HeartbeatTimeout:10 LoggingLevel:debug UEIPPool:10.45.0.0/24 FTEIDPool:65536 FeatureUEIP:true FeatureFTUP:true} 2024/07/10 18:31:05 INF Attached XDP program to iface "enp4s0f0" (index 6) 2024/07/10 18:31:06 INF Attached XDP program to iface "enp4s0f1" (index 7) 2024/07/10 18:31:06 INF Initialize resources: UEIP pool (CIDR: "10.45.0.0/24"), TEID pool (size: 65536) 2024/07/10 18:31:06 WRN Can't listen UDP address: listen udp 10.90.250.28:8805: bind: cannot assign requested address 2024/07/10 18:31:06 FTL Could not create PFCP connection: listen udp 10.90.250.28:8805: bind: cannot assign requested address

if I keep restarting eUPF it will eventually start and pass data.

root@upf3p:~# eupf --config /etc/eupf/config.yaml 2024/07/10 18:31:18 Startup config: map[api_address::8080 echo_interval:10 far_map_size:1024 feature_ftup:true feature_ueip:true gtp_peer:[] heartbeat_interval:5 heartbeat_retries:10 heartbeat_timeout:10 interface_name:[enp4s0f0 enp4s0f1] logging_level:debug metrics_address::9090 n3_address:10.90.250.28 pdr_map_size:1024 pfcp_address:10.90.250.28:8805 pfcp_node_id:10.90.250.28 qer_map_size:1024 resize_ebpf_maps:false teid_pool:65536 ueip_pool:10.45.0.0/24 xdp_attach_mode:native] 2024/07/10 18:31:18 Apply eUPF config: {InterfaceName:[enp4s0f0 enp4s0f1] XDPAttachMode:native ApiAddress::8080 PfcpAddress:10.90.250.28:8805 PfcpNodeId:10.90.250.28 MetricsAddress::9090 N3Address:10.90.250.28 GtpPeer:[] EchoInterval:10 QerMapSize:1024 FarMapSize:1024 PdrMapSize:1024 EbpfMapResize:false HeartbeatRetries:10 HeartbeatInterval:5 HeartbeatTimeout:10 LoggingLevel:debug UEIPPool:10.45.0.0/24 FTEIDPool:65536 FeatureUEIP:true FeatureFTUP:true} 2024/07/10 18:31:19 INF Attached XDP program to iface "enp4s0f0" (index 6) 2024/07/10 18:31:19 INF Attached XDP program to iface "enp4s0f1" (index 7) 2024/07/10 18:31:19 INF Initialize resources: UEIP pool (CIDR: "10.45.0.0/24"), TEID pool (size: 65536) 2024/07/10 18:31:19 INF Starting PFCP connection: 10.90.250.28:8805 with Node ID: 10.90.250.28 and N3 address: 10.90.250.28 [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.

[GIN-debug] GET /api/v1/health --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).InitRoutes.func1 (4 handlers) [GIN-debug] GET /api/v1/xdp_stats --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).displayXdpStatistics-fm (4 handlers) [GIN-debug] GET /api/v1/packet_stats --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).displayPacketStats-fm (4 handlers) [GIN-debug] GET /api/v1/config --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).displayConfig-fm (4 handlers) [GIN-debug] POST /api/v1/config --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).editConfig-fm (4 handlers) [GIN-debug] GET /api/v1/uplink_pdr_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).getUplinkPdrValue-fm (4 handlers) [GIN-debug] PUT /api/v1/uplink_pdr_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).setUplinkPdrValue-fm (4 handlers) [GIN-debug] GET /api/v1/qer_map --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).listQerMapContent-fm (4 handlers) [GIN-debug] GET /api/v1/qer_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).getQerValue-fm (4 handlers) [GIN-debug] PUT /api/v1/qer_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).setQerValue-fm (4 handlers) [GIN-debug] GET /api/v1/far_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).getFarValue-fm (4 handlers) [GIN-debug] PUT /api/v1/far_map/:id --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).setFarValue-fm (4 handlers) [GIN-debug] GET /api/v1/pfcp_associations --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).listPfcpAssociations-fm (4 handlers) [GIN-debug] GET /api/v1/pfcp_associations/full --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).listPfcpAssociationsFull-fm (4 handlers) [GIN-debug] GET /api/v1/pfcp_sessions --> github.com/edgecomllc/eupf/cmd/api/rest.(ApiHandler).listPfcpSessionsFiltered-fm (4 handlers) [GIN-debug] GET /swagger/any --> github.com/swaggo/gin-swagger.CustomWrapHandler.func1 (4 handlers) [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.

[GIN-debug] GET /metrics --> github.com/edgecomllc/eupf/cmd/api/rest.(*ApiHandler).InitMetricsRoute.func1.1 (4 handlers) 2024/07/10 18:31:19 INF running on :9090 2024/07/10 18:31:19 INF running on :8080

I don't have this issue when using XDPAttachMode:generic or if I move the PFCP address to a none XDP interface

Network Cards are "Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)"

Thank you,

pirog-spb commented 2 months ago

Hi @svinson1121

Looks strange. May be this is due to fast restart of application when you have previous instance of eUPF holding the socket?

Try to check who is bound to socket after such an error: lsof -i UDP:8805

--BR Alex

svinson1121 commented 2 months ago

Hi @svinson1121

Looks strange. May be this is due to fast restart of application when you have previous instance of eUPF holding the socket?

Try to check who is bound to socket after such an error: lsof -i UDP:8805

--BR Alex

@pirog-spb

Hi Alex, Thank you for the reply, I gave that a go but it's not showing the socket in use. I can wait a long time between trying to restart eUPF and I get the same error.

This is running on a bare metal server OS: Ubuntu 22.04.4 LTS Linux upf3p 5.15.0-113-generic #123-Ubuntu SMP I am wondering if this might be a bug/issue with the ixgbe driver?

2024/07/10 21:05:57 Startup config: map[api_address::8080 echo_interval:10 far_map_size:1024 feature_ftup:true feature_ueip:true gtp_peer:[] heartbeat_interval:5 heartbeat_retries:10 heartbeat_timeout:10 interface_name:[enp4s0f0 enp4s0f1] logging_level:debug metrics_address::9090 n3_address:10.90.250.28 pdr_map_size:1024 pfcp_address:10.90.250.28:8805 pfcp_node_id:10.90.250.28 qer_map_size:1024 resize_ebpf_maps:false teid_pool:65536 ueip_pool:10.45.0.0/24 xdp_attach_mode:native] 2024/07/10 21:05:57 Apply eUPF config: {InterfaceName:[enp4s0f0 enp4s0f1] XDPAttachMode:native ApiAddress::8080 PfcpAddress:10.90.250.28:8805 PfcpNodeId:10.90.250.28 MetricsAddress::9090 N3Address:10.90.250.28 GtpPeer:[] EchoInterval:10 QerMapSize:1024 FarMapSize:1024 PdrMapSize:1024 EbpfMapResize:false HeartbeatRetries:10 HeartbeatInterval:5 HeartbeatTimeout:10 LoggingLevel:debug UEIPPool:10.45.0.0/24 FTEIDPool:65536 FeatureUEIP:true FeatureFTUP:true} 2024/07/10 21:05:58 INF Attached XDP program to iface "enp4s0f0" (index 6) 2024/07/10 21:05:58 INF Attached XDP program to iface "enp4s0f1" (index 7) 2024/07/10 21:05:58 INF Initialize resources: UEIP pool (CIDR: "10.45.0.0/24"), TEID pool (size: 65536) 2024/07/10 21:05:58 WRN Can't listen UDP address: listen udp 10.90.250.28:8805: bind: cannot assign requested address 2024/07/10 21:05:58 FTL Could not create PFCP connection: listen udp 10.90.250.28:8805: bind: cannot assign requested address root@upf3p:~# lsof -i UDP:8805 root@upf3p:~# lsof -i UDP:8805 root@upf3p:~# eupf --config /etc/eupf/config.yaml 2024/07/10 21:06:16 Startup config: map[api_address::8080 echo_interval:10 far_map_size:1024 feature_ftup:true feature_ueip:true gtp_peer:[] heartbeat_interval:5 heartbeat_retries:10 heartbeat_timeout:10 interface_name:[enp4s0f0 enp4s0f1] logging_level:debug metrics_address::9090 n3_address:10.90.250.28 pdr_map_size:1024 pfcp_address:10.90.250.28:8805 pfcp_node_id:10.90.250.28 qer_map_size:1024 resize_ebpf_maps:false teid_pool:65536 ueip_pool:10.45.0.0/24 xdp_attach_mode:native] 2024/07/10 21:06:16 Apply eUPF config: {InterfaceName:[enp4s0f0 enp4s0f1] XDPAttachMode:native ApiAddress::8080 PfcpAddress:10.90.250.28:8805 PfcpNodeId:10.90.250.28 MetricsAddress::9090 N3Address:10.90.250.28 GtpPeer:[] EchoInterval:10 QerMapSize:1024 FarMapSize:1024 PdrMapSize:1024 EbpfMapResize:false HeartbeatRetries:10 HeartbeatInterval:5 HeartbeatTimeout:10 LoggingLevel:debug UEIPPool:10.45.0.0/24 FTEIDPool:65536 FeatureUEIP:true FeatureFTUP:true} 2024/07/10 21:06:17 INF Attached XDP program to iface "enp4s0f0" (index 6) 2024/07/10 21:06:17 INF Attached XDP program to iface "enp4s0f1" (index 7) 2024/07/10 21:06:17 INF Initialize resources: UEIP pool (CIDR: "10.45.0.0/24"), TEID pool (size: 65536) 2024/07/10 21:06:17 WRN Can't listen UDP address: listen udp 10.90.250.28:8805: bind: cannot assign requested address 2024/07/10 21:06:17 FTL Could not create PFCP connection: listen udp 10.90.250.28:8805: bind: cannot assign requested address root@upf3p:~# lsof -i UDP:8805 root@upf3p:~# lsof -i UDP:8805

svinson1121 commented 2 months ago

I tested things on a different server and network card (but the same model network card) and had the same result. I plan on testing with a different model NIC that uses a different driver. But i also noticed in the documentation it says only generic mode is supported at this time, is this correct?

pirog-spb commented 2 months ago

Actually, we have tested eUPF in native mode with Mellanox ConnectX-5 NIC. And everything was pretty good. There were no such problem that you reported.

See here https://github.com/edgecomllc/eupf/blob/main/docs%2Fload-testing.md

--BR Alex

pirog-spb commented 2 months ago

But i also noticed in the documentation it says only generic mode is supported at this time, is this correct?

Could you please point out where you noticed that?

svinson1121 commented 2 months ago

But i also noticed in the documentation it says only generic mode is supported at this time, is this correct?

Could you please point out where you noticed that?

Hey Alex, IN the README.md, section: See startup parameters you might want to change

"UPF_XDP_ATTACH_MODE=generic XDP attach mode. Generic-only at the moment"

pirog-spb commented 2 months ago

But i also noticed in the documentation it says only generic mode is supported at this time, is this correct?

Could you please point out where you noticed that?

Hey Alex, IN the README.md, section: See startup parameters you might want to change

"UPF_XDP_ATTACH_MODE=generic XDP attach mode. Generic-only at the moment"

Aha, thanks. Will fix this notice. Native mode has been supported for a while.

--BR Alex

svinson1121 commented 2 months ago

I tried installing the latest ixgbe drivers from Intel, but there was no change. I then removed the NIC (used for my N3/N4 interface) from my Netplan configuration and set the IP via CLI. Now, I have no issue starting EUPF in native mode.

it looks to me like when the NIC is controlled by netplan it resets multiple times when eUPF is started and that extra delay is causing eUPF bind to fail because the NIC is not ready when eUPF tries to bind to it.

my N3/N4 is on NIC enp129s0f1 and i removed only that one NIC from netplan

NIC controlled by netplan:

2024-07-15T03:20:37.219669+00:00 eupg01 kernel: ixgbe 0000:81:00.1: removed PHC on enp129s0f1 2024-07-15T03:20:37.289664+00:00 eupg01 kernel: ixgbe 0000:81:00.1: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 32 2024-07-15T03:20:37.512670+00:00 eupg01 kernel: ixgbe 0000:81:00.1: registered PHC device on enp129s0f1 2024-07-15T03:20:37.630489+00:00 eupg01 systemd-networkd[1039]: enp129s0f1: Lost carrier 2024-07-15T03:20:37.631660+00:00 eupg01 kernel: ixgbe 0000:04:00.0: removed PHC on enp4s0f0 2024-07-15T03:20:37.681666+00:00 eupg01 kernel: ixgbe 0000:81:00.1 enp129s0f1: detected SFP+: 6 2024-07-15T03:20:37.816658+00:00 eupg01 kernel: ixgbe 0000:04:00.0: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 32 2024-07-15T03:20:38.054670+00:00 eupg01 kernel: pps pps0: new PPS source ptp4 2024-07-15T03:20:38.054689+00:00 eupg01 kernel: ixgbe 0000:04:00.0: registered PHC device on enp4s0f0 2024-07-15T03:20:38.101663+00:00 eupg01 kernel: ixgbe 0000:04:00.0: removed PHC on enp4s0f0 2024-07-15T03:20:38.294657+00:00 eupg01 kernel: ixgbe 0000:04:00.0: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0 2024-07-15T03:20:38.466680+00:00 eupg01 kernel: pps pps0: new PPS source ptp4 2024-07-15T03:20:38.466699+00:00 eupg01 kernel: ixgbe 0000:04:00.0: registered PHC device on enp4s0f0 2024-07-15T03:20:38.507667+00:00 eupg01 kernel: ixgbe 0000:81:00.1: removed PHC on enp129s0f1 2024-07-15T03:20:38.653664+00:00 eupg01 kernel: ixgbe 0000:81:00.1: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0 2024-07-15T03:20:38.811671+00:00 eupg01 kernel: ixgbe 0000:81:00.1: registered PHC device on enp129s0f1 2024-07-15T03:20:38.922522+00:00 eupg01 systemd-networkd[1039]: enp4s0f0: Lost carrier 2024-07-15T03:20:39.160669+00:00 eupg01 kernel: ixgbe 0000:81:00.1 enp129s0f1: detected SFP+: 6 2024-07-15T03:20:40.060661+00:00 eupg01 kernel: ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 1 Gbps, Flow Control: RX/TX 2024-07-15T03:20:40.060750+00:00 eupg01 systemd-networkd[1039]: enp129s0f1: Gained carrier 2024-07-15T03:20:41.166660+00:00 eupg01 kernel: ixgbe 0000:04:00.0 enp4s0f0: NIC Link is Up 1 Gbps, Flow Control: None 2024-07-15T03:20:41.166706+00:00 eupg01 systemd-networkd[1039]: enp4s0f0: Gained carrier

NIC removed from netplan:

2024-07-15T03:21:23.062682+00:00 eupg01 kernel: ixgbe 0000:81:00.1: removed PHC on enp129s0f1 2024-07-15T03:21:23.133671+00:00 eupg01 kernel: ixgbe 0000:81:00.1: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 32 2024-07-15T03:21:23.356657+00:00 eupg01 kernel: ixgbe 0000:81:00.1: registered PHC device on enp129s0f1 2024-07-15T03:21:23.470673+00:00 eupg01 kernel: ixgbe 0000:04:00.0: removed PHC on enp4s0f0 2024-07-15T03:21:23.470634+00:00 eupg01 systemd-networkd[1039]: enp129s0f1: Lost carrier 2024-07-15T03:21:23.511662+00:00 eupg01 kernel: ixgbe 0000:81:00.1 enp129s0f1: detected SFP+: 6 2024-07-15T03:21:23.656658+00:00 eupg01 kernel: ixgbe 0000:04:00.0: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 32 2024-07-15T03:21:23.894659+00:00 eupg01 kernel: pps pps0: new PPS source ptp4 2024-07-15T03:21:23.894665+00:00 eupg01 kernel: ixgbe 0000:04:00.0: registered PHC device on enp4s0f0 2024-07-15T03:21:24.412591+00:00 eupg01 systemd-networkd[1039]: enp129s0f1: Gained carrier 2024-07-15T03:21:24.412673+00:00 eupg01 kernel: ixgbe 0000:81:00.1 enp129s0f1: NIC Link is Up 1 Gbps, Flow Control: RX/TX 2024-07-15T03:21:24.471599+00:00 eupg01 systemd-networkd[1039]: enp4s0f0: Lost carrier 2024-07-15T03:21:27.057513+00:00 eupg01 systemd-networkd[1039]: enp4s0f0: Gained carrier 2024-07-15T03:21:27.057665+00:00 eupg01 kernel: ixgbe 0000:04:00.0 enp4s0f0: NIC Link is Up 1 Gbps, Flow Control: None