Gradiant / 5g-charts

Helm charts for 5G Technologies
Apache License 2.0

Getting the "SCTP could not connect: Operation timed out" Error #114

Closed sam-sre closed 1 year ago

sam-sre commented 1 year ago

Hi

I'm following your tutorial for Open5GS and UERANSIM here with all default values. (Should I edit any values?)

I have a 2-node K8s deployment with enough resources.

I'm using an NFS provisioner for the MongoDB PV/PVCs. The PVC is Bound and consumed by MongoDB.

For some reason, the SCTP connection is not established.

All Pods are Running and look healthy.

kubectl get pods
NAME                                              READY   STATUS    RESTARTS      AGE
nfs-subdir-external-provisioner-ff8fdc59b-rvzd9   1/1     Running   0             37m
open5gs-amf-74867f4d79-rgmkm                      1/1     Running   0             31m
open5gs-ausf-6f7b99444d-jp89j                     1/1     Running   0             31m
open5gs-bsf-78cbddcf6d-kstb2                      1/1     Running   0             31m
open5gs-mongodb-76d8dfbbdb-fzzqk                  1/1     Running   0             31m
open5gs-nrf-5f58b84585-wlbs6                      1/1     Running   0             31m
open5gs-nssf-8676d9bc7b-9qj28                     1/1     Running   0             31m
open5gs-pcf-655ff4967f-pvc5g                      1/1     Running   4 (27m ago)   31m
open5gs-populate-578dd7d4d8-sc9zg                 1/1     Running   0             31m
open5gs-smf-7f864d7d99-g4vhc                      1/1     Running   0             31m
open5gs-udm-7cf88b7c58-zmdj5                      1/1     Running   0             31m
open5gs-udr-6c96f5d447-9jkfk                      1/1     Running   5 (28m ago)   31m
open5gs-upf-fcc6fcd5c-tzd8x                       1/1     Running   0             31m
open5gs-webui-567c65bc7b-n566b                    1/1     Running   0             31m
ueransim-gnb-7d8867667f-4d42f                     1/1     Running   0             25m
ueransim-gnb-ues-6487c85db9-5fhcv                 1/1     Running   0             25m

Here are the logs from different CNFs

UE-RANSIM gNB

kubectl logs deployment/ueransim-gnb -f
N2_BIND_IP: 10.0.0.20
N3_BIND_IP: 10.0.0.20
RADIO_BIND_IP: 10.0.0.20
AMF_IP: 10.102.143.93
Launching gnb: nr-gnb -c gnb.yaml
UERANSIM v3.2.6
[2023-01-21 21:32:40.760] [sctp] [info] Trying to establish SCTP connection... (10.102.143.93:38412)
[2023-01-21 21:32:41.236] [rrc] [debug] UE[1] new signal detected
[2023-01-21 21:32:41.240] [rrc] [debug] UE[2] new signal detected
[2023-01-21 21:36:03.518] [sctp] [error] Connecting to 10.102.143.93:38412 failed. SCTP could not connect: Operation timed out
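
When reading a longer gNB log, it can help to pull out just the address the SCTP connection timed out against. The following is a minimal sketch, not part of the chart or of UERANSIM; the function name `sctp_failed_targets` is hypothetical, and the pattern assumes the log format shown above.

```shell
# Print each "IP:port" the gNB failed to reach over SCTP, one per line, de-duplicated.
# Reads UERANSIM gNB log text on stdin; the pattern matches the
# "[sctp] [error] Connecting to <ip>:<port> failed" lines shown in this issue.
sctp_failed_targets() {
  sed -n 's/.*\[sctp\] \[error\] Connecting to \([0-9.]*:[0-9]*\) failed.*/\1/p' | sort -u
}
```

For example, `kubectl logs deployment/ueransim-gnb | sctp_failed_targets` would print `10.102.143.93:38412` for the log above. The AMF log further down reports `ngap_server() [10.0.1.206]:38412`, so 10.102.143.93 is presumably a Service ClusterIP in front of the AMF pod; `kubectl get svc -o wide` should show which Service owns that address and whether its port is declared with protocol SCTP.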

SMF

kubectl logs deployment/open5gs-smf -f
Open5GS daemon v2.4.11

01/21 21:29:13.000: [app] INFO: Configuration: '/opt/open5gs/etc/open5gs/smf.yaml' (../lib/app/ogs-init.c:126)
01/21 21:29:13.040: [metrics] INFO: Prometheus mhd_server() [0.0.0.0]:9090 (../lib/metrics/prometheus/context.c:320)
01/21 21:29:13.040: [smf] WARNING: No diameter configuration (../src/smf/fd-path.c:30)
01/21 21:29:13.040: [gtp] INFO: gtp_server() [10.0.1.62]:2123 (../lib/gtp/path.c:30)
01/21 21:29:13.040: [gtp] INFO: gtp_server() [10.0.1.62]:2152 (../lib/gtp/path.c:30)
01/21 21:29:13.040: [pfcp] INFO: pfcp_server() [10.0.1.62]:8805 (../lib/pfcp/path.c:30)
01/21 21:29:13.040: [pfcp] INFO: ogs_pfcp_connect() [10.110.42.201]:8805 (../lib/pfcp/path.c:61)
01/21 21:29:13.041: [sbi] INFO: NF Service [nsmf-pdusession] (../lib/sbi/context.c:1400)
01/21 21:29:13.041: [sbi] INFO: nghttp2_server() [10.0.1.62]:7777 (../lib/sbi/nghttp2-server.c:150)
01/21 21:29:13.042: [app] INFO: SMF initialize...done (../src/smf/app.c:31)
01/21 21:29:20.557: [pfcp] WARNING: [1] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:29:24.044: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:29:24.044: [smf] WARNING: client_cb() failed [-3] (../src/smf/sbi-path.c:55)
01/21 21:29:24.044: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:29:24.045: [sbi] WARNING: [a647a002-99d2-41ed-a156-4373f5fe4ca2] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:29:24.046: [sbi] INFO: [a647a002-99d2-41ed-a156-4373f5fe4ca2] NF registered [Heartbeat:10s] (../lib/sbi/nf-sm.c:222)
01/21 21:29:31.436: [sbi] INFO: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:29:31.436: [sbi] INFO: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:31.553: [pfcp] WARNING: [2] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:29:35.046: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:29:42.558: [pfcp] WARNING: [3] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:29:46.053: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:29:53.557: [pfcp] WARNING: [4] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:29:57.056: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:04.563: [pfcp] WARNING: [5] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:30:08.059: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:15.579: [pfcp] WARNING: [6] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:30:19.063: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:26.577: [pfcp] WARNING: [7] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:30:30.068: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:37.580: [pfcp] WARNING: [8] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:30:41.069: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:48.577: [pfcp] WARNING: [9] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:30:52.071: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:30:59.578: [pfcp] WARNING: [10] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:31:03.097: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:31:10.647: [pfcp] WARNING: [11] LOCAL  No Reponse. Give up! for step 1 type 5 peer [10.110.42.201]:8805 (../lib/pfcp/xact.c:613)
01/21 21:31:12.837: [pfcp] INFO: ogs_pfcp_connect() [10.0.1.114]:8805 (../lib/pfcp/path.c:61)
01/21 21:31:12.838: [smf] INFO: PFCP associated [10.0.1.114]:8805 (../src/smf/pfcp-sm.c:174)
01/21 21:31:14.108: [smf] WARNING: Retry to association with peer [10.110.42.201]:8805 failed (../src/smf/pfcp-sm.c:107)
01/21 21:31:14.110: [smf] INFO: PFCP associated [10.110.42.201]:8805 (../src/smf/pfcp-sm.c:174)
01/21 21:31:15.171: [sbi] INFO: [ef12ed46-99d2-41ed-97eb-1595bb41cf90] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:31:15.172: [sbi] INFO: [ef12ed46-99d2-41ed-97eb-1595bb41cf90] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:31:23.840: [pfcp] ERROR: invalid step[0] type[2] (../lib/pfcp/xact.c:432)
01/21 21:31:23.840: [pfcp] ERROR: ogs_pfcp_xact_update_rx() failed (../lib/pfcp/xact.c:708)
01/21 21:31:26.343: [pfcp] ERROR: invalid step[0] type[2] (../lib/pfcp/xact.c:432)
01/21 21:31:26.343: [pfcp] ERROR: ogs_pfcp_xact_update_rx() failed (../lib/pfcp/xact.c:708)
01/21 21:31:28.876: [pfcp] ERROR: invalid step[0] type[2] (../lib/pfcp/xact.c:432)
01/21 21:31:28.876: [pfcp] ERROR: ogs_pfcp_xact_update_rx() failed (../lib/pfcp/xact.c:708)
01/21 21:31:31.376: [pfcp] WARNING: [13] LOCAL  No Reponse. Give up! for step 1 type 1 peer [10.0.1.114]:8805 (../lib/pfcp/xact.c:613)
01/21 21:31:31.376: [smf] WARNING: No Heartbeat from UPF [10.0.1.114]:8805 (../src/smf/pfcp-sm.c:316)
01/21 21:31:31.376: [smf] INFO: PFCP de-associated [10.0.1.114]:8805 (../src/smf/pfcp-sm.c:181)

AMF

kubectl logs deployment/open5gs-amf -f
Open5GS daemon v2.4.11

01/21 21:27:42.971: [app] INFO: Configuration: '/opt/open5gs/etc/open5gs/amf.yaml' (../lib/app/ogs-init.c:126)
01/21 21:27:42.983: [metrics] INFO: Prometheus mhd_server() [0.0.0.0]:9090 (../lib/metrics/prometheus/context.c:320)
01/21 21:27:42.984: [sbi] INFO: NF Service [namf-comm] (../lib/sbi/context.c:1400)
01/21 21:27:42.984: [sbi] INFO: nghttp2_server() [10.0.1.206]:7777 (../lib/sbi/nghttp2-server.c:150)
01/21 21:27:43.002: [amf] INFO: ngap_server() [10.0.1.206]:38412 (../src/amf/ngap-sctp.c:61)
01/21 21:27:43.002: [sctp] INFO: AMF initialize...done (../src/amf/app.c:33)
01/21 21:27:53.985: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:27:53.985: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:27:53.985: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:28:04.997: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:28:04.997: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:28:04.997: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:28:16.001: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:28:16.001: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:28:16.001: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:28:27.002: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:28:27.002: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:28:27.002: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:28:38.002: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:28:38.002: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:28:38.002: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:28:49.003: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:28:49.003: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:28:49.003: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:29:00.010: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:29:00.010: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:29:00.010: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:29:11.012: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:29:11.012: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:29:11.012: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:29:22.062: [sbi] ERROR: Connection timer expired (../lib/sbi/client.c:495)
01/21 21:29:22.062: [amf] WARNING: client_cb() failed [-3] (../src/amf/sbi-path.c:56)
01/21 21:29:22.063: [sbi] WARNING: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] Retry to registration with NRF (../lib/sbi/nf-sm.c:182)
01/21 21:29:22.065: [sbi] INFO: [709b5c00-99d2-41ed-a70a-c14fd2ad81ed] NF registered [Heartbeat:10s] (../lib/sbi/nf-sm.c:222)
01/21 21:29:22.367: [sbi] INFO: [abd79bc6-99d2-41ed-bb81-b73c88955538] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:29:22.367: [sbi] INFO: [abd79bc6-99d2-41ed-bb81-b73c88955538] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:24.046: [sbi] INFO: [a647a002-99d2-41ed-a156-4373f5fe4ca2] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:29:24.046: [sbi] INFO: [a647a002-99d2-41ed-a156-4373f5fe4ca2] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:31.326: [sbi] INFO: [aa9e70fe-99d2-41ed-a9a7-c532f02dcd23] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:29:31.326: [sbi] INFO: [aa9e70fe-99d2-41ed-a9a7-c532f02dcd23] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:31.434: [sbi] INFO: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:29:31.434: [sbi] INFO: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:31.435: [sbi] WARNING: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF has already been added (../lib/sbi/nnrf-handler.c:632)
01/21 21:29:31.435: [sbi] INFO: [6fdb9438-99d2-41ed-9b16-431c81f1ba2f] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)
01/21 21:29:31.435: [sbi] WARNING: NF EndPoint updated [10.0.0.117:80] (../lib/sbi/context.c:1572)
01/21 21:29:31.435: [sbi] WARNING: NF EndPoint updated [10.0.0.117:7777] (../lib/sbi/context.c:1481)
01/21 21:31:15.170: [sbi] INFO: [ef12ed46-99d2-41ed-97eb-1595bb41cf90] (NRF-notify) NF registered (../lib/sbi/nnrf-handler.c:628)
01/21 21:31:15.170: [sbi] INFO: [ef12ed46-99d2-41ed-97eb-1595bb41cf90] (NRF-notify) NF Profile updated (../lib/sbi/nnrf-handler.c:638)

Also, I can't see the uesimtun0 and uesimtun1 interfaces

kubectl exec deployment/ueransim-gnb-ues -ti -- bash

bash-5.1# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
33: eth0@if34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 3a:6e:b0:fb:88:93 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.121/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::386e:b0ff:fefb:8893/64 scope link 
       valid_lft forever preferred_lft forever

Logs from the ueransim-gnb-ues-6487c85db9-5fhcv Pod

kubectl logs ueransim-gnb-ues-6487c85db9-5fhcv
GNB_IP: 10.0.0.20
Launching ue: nr-ue -c ue.yaml
UERANSIM v3.2.6
[2023-01-21 21:32:40.898] [999700000000002|nas] [info] UE switches to state [MM-DEREGISTERED/PLMN-SEARCH]
[2023-01-21 21:32:40.899] [999700000000001|nas] [info] UE switches to state [MM-DEREGISTERED/PLMN-SEARCH]
[2023-01-21 21:32:40.900] [999700000000001|rrc] [debug] New signal detected for cell[1], total [1] cells in coverage
[2023-01-21 21:32:40.902] [999700000000002|rrc] [debug] New signal detected for cell[1], total [1] cells in coverage
[2023-01-21 21:32:40.905] [999700000000001|nas] [info] Selected plmn[999/70]
[2023-01-21 21:32:40.905] [999700000000001|rrc] [warning] Suitable cell selection failed in [1] cells. [0] out of PLMN, [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:32:40.906] [999700000000001|rrc] [warning] Acceptable cell selection failed in [1] cells. [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:32:40.906] [999700000000001|rrc] [error] Cell selection failure, no suitable or acceptable cell found
[2023-01-21 21:32:40.906] [999700000000002|nas] [info] Selected plmn[999/70]
[2023-01-21 21:32:40.907] [999700000000002|rrc] [warning] Suitable cell selection failed in [1] cells. [0] out of PLMN, [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:32:40.907] [999700000000002|rrc] [warning] Acceptable cell selection failed in [1] cells. [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:32:40.907] [999700000000002|rrc] [error] Cell selection failure, no suitable or acceptable cell found
[2023-01-21 21:32:46.409] [999700000000001|nas] [info] UE switches to state [MM-DEREGISTERED/NO-CELL-AVAILABLE]
[2023-01-21 21:32:46.409] [999700000000002|nas] [info] UE switches to state [MM-DEREGISTERED/NO-CELL-AVAILABLE]
[2023-01-21 21:33:10.941] [999700000000001|rrc] [warning] Suitable cell selection failed in [1] cells. [0] out of PLMN, [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:33:10.941] [999700000000001|rrc] [warning] Acceptable cell selection failed in [1] cells. [0] no SI, [0] reserved, [1] barred, ftai [0]
[2023-01-21 21:33:10.941] [999700000000001|rrc] [error] Cell selection failure, no suitable or acceptable cell found
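
The UE log above is repetitive, so a summary of the last NAS state per UE makes it easier to see that neither UE ever leaves the deregistered state. A plausible reading, which is an assumption and is not confirmed anywhere in this thread, is that the gNB keeps its cell barred (the `[1] barred` counter) until NG Setup with the AMF succeeds, which would link these failures to the SCTP timeout and to the missing uesimtun interfaces. The sketch below only summarises the log; the function name `ue_last_states` is hypothetical.

```shell
# Print "<IMSI> <last NAS state>" for every UE found in a UERANSIM UE log on stdin.
# Relies on the "[<imsi>|nas] [info] UE switches to state [<state>]" lines shown above.
ue_last_states() {
  sed -n 's/.*\[\([0-9]*\)|nas\] \[info\] UE switches to state \[\([^]]*\)\].*/\1 \2/p' \
    | awk '{ state[$1 ""] = $2 } END { for (imsi in state) print imsi, state[imsi] }' \
    | sort
}
```

Piping `kubectl logs ueransim-gnb-ues-6487c85db9-5fhcv` into `ue_last_states` should list both IMSIs in `MM-DEREGISTERED/NO-CELL-AVAILABLE` for the log shown here.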

mmarquez999 commented 1 year ago

Hello Sam!

I just tried to replicate the same deployment by following the tutorial you mentioned. It is working perfectly for me: I tried it in our Kubernetes cluster as well as in a local kind cluster.

Can you please give us further details about your Kubernetes environment so we can try to help you?

sam-sre commented 1 year ago

Hi @mmarquez999

Our lab testing environment consists of 2 Vagrant boxes (1 control plane, 1 worker), both generic/ubuntu2004, deployed inside a large Ubuntu VM. Ansible automates the whole deployment via kubeadm + Helm + Cilium CNI (SCTP enabled).

Kernel version (both Vagrant boxes)

Linux kube1 5.4.0-135-generic #152-Ubuntu SMP Wed Nov 23 20:19:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-09T16:23:44Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.5", GitCommit:"804d6167111f6858541cef440ccc53887fbbc96a", GitTreeState:"clean", BuildDate:"2022-12-08T10:08:09Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

cilium version

cilium-cli: v0.12.12 compiled with go1.19.4 on linux/amd64
cilium image (default): v1.12.5
cilium image (stable): v1.12.5
cilium image (running): v1.12.5

avrodriguezgrad commented 1 year ago

Hi @sam-sre

The last time I looked into Cilium, I found that it did not support SCTP but, thanks to your issue, I can now see that it is a beta feature (https://docs.cilium.io/en/latest/configuration/sctp/#sctp). I don't know whether its restrictions affect the connectivity of the deployment. You can find more information about it here: https://github.com/cilium/cilium/issues/20490

If you can change the CNI and test the deployment, we can find out whether this is Cilium's fault or a different problem.

sam-sre commented 1 year ago

Hi @avrodriguezgrad

Correct, Cilium didn't support SCTP before. I used Cilium version 1.13.0-rc3 for my setup and enabled SCTP. Can you give it a try with Cilium? I'll try it out with Calico and post the results here.

sam-sre commented 1 year ago

Confirming it worked with Calico, although Cilium would be preferable so that we can investigate its security and observability options.

Looking forward to your results @avrodriguezgrad ^^

avrodriguezgrad commented 1 year ago

Hi @sam-sre

I was able to test with Cilium and got it working. I followed this tutorial to deploy a kind cluster with Cilium at the version you mentioned (https://www.bookstack.cn/read/cilium-1.12-en/5be6a00e6ed03350.md), deployed the charts following our tutorial, and you can see my deployment in the following screenshot.

[image: screenshot of the deployment]

FYI: [image]

I'm not sure whether there is anything else I can help you with.

sam-sre commented 1 year ago

Hi @avrodriguezgrad

Interesting! Did you have to pass the --set sctp.enabled=true variable to Helm, or did you follow exactly the steps mentioned in the link you posted?

avrodriguezgrad commented 1 year ago

Hi @sam-sre

Yes, I followed exactly the steps in the link I mentioned, and I also passed the SCTP variable to Helm.
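
Since the working and non-working setups differ on whether `sctp.enabled=true` actually reached Cilium, one way to confirm it is to look at the rendered agent configuration. This is a hedged sketch: it assumes the Helm value ends up in the `cilium-config` ConfigMap under the key `enable-sctp`, which should be verified against the Cilium version in use. The function name `cilium_sctp_enabled` is hypothetical.

```shell
# Succeeds (exit 0) if the ConfigMap YAML on stdin contains `enable-sctp: "true"`.
# ASSUMPTION: the Helm value sctp.enabled=true is rendered into the cilium-config
# ConfigMap under the key `enable-sctp`; check this against your Cilium version.
cilium_sctp_enabled() {
  grep -Eq '^[[:space:]]*enable-sctp:[[:space:]]*"?true"?[[:space:]]*$'
}

# Hypothetical usage against a live cluster (the kube-system namespace is an assumption):
#   kubectl -n kube-system get configmap cilium-config -o yaml | cilium_sctp_enabled \
#     && echo "SCTP enabled" || echo "SCTP not enabled"
```

If the key is missing, something along the lines of `helm upgrade cilium cilium/cilium --namespace kube-system --reuse-values --set sctp.enabled=true` would set it; the release name `cilium`, the chart reference `cilium/cilium` and the namespace are assumptions about the installation, and the Cilium agent pods probably need a restart before the change takes effect.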

sam-sre commented 1 year ago

Hi @avrodriguezgrad

I deployed a kind environment and followed the exact same steps. It didn't work. I think it is related to the cgroup versions. The commands below gave the same value, which they shouldn't:

sudo ls -al /proc/$(docker inspect -f '{{.State.Pid}}' kind-control-plane)/ns/cgroup
sudo ls -al /proc/self/ns/cgroup

I'll investigate more, but it is definitely not on the application side, so thanks very much for the help.
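
The comparison described in the last comment can be scripted so that the result does not have to be read off two `ls -al` listings. This is only a sketch of that check; the helper names `cgroup_ns_id` and `compare_cgroup_ns` are hypothetical, and `kind-control-plane` is the container name used in the commands above.

```shell
# Print the cgroup namespace identifier (the symlink target, "cgroup:[<inode>]")
# for a PID, or for "self". Reading another process's entry usually needs root.
cgroup_ns_id() {
  readlink "/proc/$1/ns/cgroup"
}

# Report whether two namespace identifiers are equal.
compare_cgroup_ns() {
  if [ "$1" = "$2" ]; then
    echo "same cgroup namespace"
  else
    echo "different cgroup namespaces"
  fi
}

# Hypothetical usage on the Docker host running the kind node:
#   host_ns=$(cgroup_ns_id self)
#   node_pid=$(docker inspect -f '{{.State.Pid}}' kind-control-plane)
#   node_ns=$(sudo readlink "/proc/${node_pid}/ns/cgroup")
#   compare_cgroup_ns "$host_ns" "$node_ns"
```

According to the comment above, the expected result for a working setup is "different cgroup namespaces"; "same cgroup namespace" reproduces the situation reported here.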