srsran / srsRAN_Project

Open source O-RAN 5G CU/DU solution from Software Radio Systems (SRS) https://docs.srsran.com/projects/project
https://www.srsran.com
GNU Affero General Public License v3.0

Docker: gNB fails to connect to Open5GS core #739

Closed dominikheinz closed 4 months ago

dominikheinz commented 4 months ago

Issue Description

I am having trouble setting up a 5G SA network with Docker. It appears that the gNB and Open5GS containers can't communicate. A couple of minutes after starting the containers, I get a timeout:

gnb_1             | 
gnb_1             | --== srsRAN gNB (commit ) ==--
gnb_1             | 
gnb_1             | 
gnb_1             | The PRACH detector will not meet the performance requirements with the configuration {Format B4, ZCZ 0, SCS 30kHz, Rx ports 1}.
gnb_1             | Lower PHY in quad executor mode.
gnb_1             | N2: Failed to connect to AMF on 10.53.1.2:38412. error="Connection timed out" timeout=340922ms
gnb_1             | srsRAN ERROR: CU-CP failed to connect to AMF
srsran_gnb exited with code 1

Setup Details

Expected Behavior

I'd expect that the gNB would start in the container and connect to the open5gs core container.

Actual Behaviour

When starting the containers, they all start fine, but I don't see any output from the gnb container. The logs for the gnb container are empty. I'd expect that the gnb container would load my custom config and connect to core.

Steps to reproduce the problem

  1. Run docker-compose -f docker-compose-custom.yml up with the docker-compose file attached below.
  2. Observe how the gnb container is running idle, but there is no output and seemingly no connection to 5gcore

Additional Information

docker-compose-custom.yml:

services:
  5gc:
    container_name: open5gs_5gc
    build:
      context: open5gs
      target: open5gs
      args:
        OS_VERSION: "22.04"
        OPEN5GS_VERSION: "v2.7.0"
    env_file:
      - ${OPEN_5GS_ENV_FILE:-open5gs/open5gs.env}
    privileged: true
    ports:
      - "9999:9999/tcp"
      # To expose o5gc outside the container
      - "38412:38412/sctp"
      - "2152:2152/udp"
    command: 5gc -c open5gs-5gc.yml
    healthcheck:
      test: [ "CMD-SHELL", "nc -z 127.0.0.20 7777" ]
      interval: 3s
      timeout: 1s
      retries: 60
    networks:
      ran:
        ipv4_address: ${OPEN5GS_IP:-10.53.1.2}

  gnb:
    container_name: srsran_gnb
    image: srsran/gnb
    build:
      context: ..
      dockerfile: docker/Dockerfile
      args:
        OS_VERSION: "24.04"
    privileged: true
    cap_add:
      - SYS_NICE
      - CAP_SYS_PTRACE
    volumes:
      - /dev/bus/usb/:/dev/bus/usb/
      - /usr/share/uhd/images:/usr/share/uhd/images
      - gnb-storage:/tmp
      - ./gnb_custom.yml:/gnb_config.yml:ro
      # - ../configs/gnb_rf_b200_tdd_n78_20mhz.yml:/gnb_config.yml:ro
    networks:
      ran:
        ipv4_address: ${GNB_IP:-10.53.1.3}
      metrics:
        ipv4_address: 172.19.1.3
    depends_on:
      5gc:
        condition: service_healthy
    # command: gnb -c /gnb_config.yml amf --addr ${OPEN5GS_IP:-10.53.1.2} --bind_addr ${GNB_IP:-10.53.1.3}
    command: gnb -c /gnb_config.yml log --all_level debug amf --addr ${OPEN5GS_IP:-10.53.1.2} --bind_addr ${GNB_IP:-10.53.1.3}

  metrics-server:
    container_name: metrics_server
    image: srsran/metrics_server
    build:
      context: metrics_server
    environment:
      - PORT=${METRICS_SERVER_PORT}
      - BUCKET=${DOCKER_INFLUXDB_INIT_BUCKET}
      - TESTBED=default
      - URL=http://${DOCKER_INFLUXDB_INIT_HOST}:${DOCKER_INFLUXDB_INIT_PORT}
      - ORG=${DOCKER_INFLUXDB_INIT_ORG}
      - TOKEN=${DOCKER_INFLUXDB_INIT_ADMIN_TOKEN}
    ports:
      - 55555:${METRICS_SERVER_PORT}/udp
    networks:
      metrics:
        ipv4_address: 172.19.1.4

  influxdb:
    container_name: influxdb
    image: influxdb:${DOCKER_INFLUXDB_VERSION}
    volumes:
      - influxdb-storage:/var/lib/influxdb2:rw
    env_file:
      - .env
    restart: on-failure:10
    networks:
      metrics:
        ipv4_address: 172.19.1.5

  grafana:
    container_name: grafana
    image: srsran/grafana
    build:
      context: grafana
    volumes:
      - grafana-storage:/var/lib/grafana:rw
    env_file:
      - .env
    depends_on:
      - influxdb
      - metrics-server
    ports:
      - 3300:${GRAFANA_PORT}
    networks:
      metrics:
        ipv4_address: 172.19.1.6

volumes:
  gnb-storage:
  grafana-storage:
  influxdb-storage:
  gnb-config:

networks:
  ran:
    ipam:
      driver: default
      config:
        - subnet: 10.53.1.0/24
  metrics:
    ipam:
      driver: default
      config:
        - subnet: 172.19.1.0/24

The gnb_custom.yml (My custom gnb config):

# amf:
  # addr: 127.0.0.5                                               # The address or hostname of the AMF.
  # bind_addr: 127.0.0.1                                          # A local IP that the gNB binds to for traffic from the AMF.

amf:
  addr: 127.0.1.100                                               
  bind_addr: 127.0.0.1

ru_sdr:
  device_driver: uhd                                            # The RF driver name.
  device_args: type=b200,num_recv_frames=64,num_send_frames=64  # Optionally pass arguments to the selected RF driver.
  sync: internal                                                # Specify the sync source used by the RF. NOTE: Set to internal if NOT using an external 10 MHz reference clock. 
  srate: 23.04                                                  # RF sample rate might need to be adjusted according to selected bandwidth.
  otw_format: sc12
  tx_gain: 80                                                   # Transmit gain of the RF might need to be adjusted to the given situation.
  rx_gain: 40                                                   # Receive gain of the RF might need to be adjusted to the given situation.

cell_cfg:
  dl_arfcn: 627340                                              # ARFCN of the downlink carrier (center frequency).
  band: 78                                                      # The NR band.
  channel_bandwidth_MHz: 20                                     # Bandwidth in MHz. Number of PRBs will be automatically derived.
  common_scs: 30                                                # Subcarrier spacing in kHz used for data.
  plmn: "99970"                                                 # PLMN broadcasted by the gNB.
  # plmn: "00101"
  tac: 7                                                        # Tracking area code (needs to match the core configuration).
  pci: 1                                                        # Physical cell ID.

log:
  filename: /tmp/gnb.log                                         
  all_level: warning

pcap:
  mac_enable: false                                        
  mac_filename: /tmp/gnb_mac.pcap                          
  ngap_enable: false                                       
  ngap_filename: /tmp/gnb_ngap.pcap  

I assume it is some sort of network misconfiguration?
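
One way to test that suspicion is to check whether the addresses in gnb_custom.yml fall inside the subnet that the compose file assigns to the ran network (10.53.1.0/24). The following is a minimal Python sketch using only the standard library; the helper name classify_addr is made up for illustration and is not part of srsRAN or Docker.

```python
import ipaddress

# Subnet of the "ran" network, taken from docker-compose-custom.yml.
RAN_SUBNET = ipaddress.ip_network("10.53.1.0/24")


def classify_addr(addr: str) -> str:
    """Return 'loopback', 'ran', or 'other' for an IP address string."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        # 127.0.0.0/8 only ever refers to the container (or host) itself.
        return "loopback"
    if ip in RAN_SUBNET:
        return "ran"
    return "other"


if __name__ == "__main__":
    for name, addr in [
        ("amf.addr (gnb_custom.yml)", "127.0.1.100"),
        ("amf.bind_addr (gnb_custom.yml)", "127.0.0.1"),
        ("OPEN5GS_IP default", "10.53.1.2"),
        ("GNB_IP default", "10.53.1.3"),
    ]:
        print(f"{name}: {addr} -> {classify_addr(addr)}")
```

A loopback address inside one container cannot reach a service in another container, so the two amf values from the file are flagged as "loopback". Note, however, that the compose command line also passes amf --addr and --bind_addr, and the log line showing 10.53.1.2:38412 suggests those values are the ones actually in use.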

pgawlowicz commented 4 months ago

I think it should be:

amf:
  addr: 10.53.1.2                                               
  bind_addr: 10.53.1.3

dominikheinz commented 4 months ago

@pgawlowicz Hmm, I don't see any difference. Same behaviour. Do I need to rebuild the containers? If so, how do I do that? Trying to rebuild the gnb container with docker-compose -f docker-compose-custom.yml build doesn't work:

─❯ docker-compose -f docker-compose-custom.yml build  
influxdb uses an image, skipping
Building 5gc
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

Sending build context to Docker daemon  31.23kB
Step 1/21 : ARG OS_VERSION=22.04
Step 2/21 : FROM ubuntu:$OS_VERSION AS base
 ---> 8a3cdc4d1ad3
Step 3/21 : ENV PYTHONBUFFERED=1
 ---> Using cache
 ---> c3bb6bc601a7
Step 4/21 : ENV DEBIAN_FRONTEND=noninteractive
 ---> Using cache
 ---> 178b01807f0d
Step 5/21 : RUN DEBIAN_FRONTEND=noninteractive apt-get update     && apt install -y software-properties-common     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 331d33bfecfd
Step 6/21 : RUN DEBIAN_FRONTEND=noninteractive apt-get update     && apt-get install -y     python3-pip     python3-setuptools     python3-wheel     ninja-build     build-essential     flex     bison     git     libsctp-dev     libgnutls28-dev     libgcrypt-dev     libssl-dev     libidn11-dev     libmongoc-dev     libbson-dev     libyaml-dev     libnghttp2-dev     libmicrohttpd-dev     libcurl4-gnutls-dev     libnghttp2-dev     libtins-dev     meson     curl     gettext     gdb     iproute2     iptables     iputils-ping     netcat-openbsd     iperf     iperf3     libtalloc-dev     cmake     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 967bb48e1f91
Step 7/21 : ARG MONGO_MAJOR_VERSION=6
 ---> Using cache
 ---> f21dc54f8201
Step 8/21 : RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y --no-install-recommends wget gnupg     && wget -qO - https://www.mongodb.org/static/pgp/server-${MONGO_MAJOR_VERSION}.0.asc | apt-key add     && . /etc/os-release     && echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu $UBUNTU_CODENAME/mongodb-org/${MONGO_MAJOR_VERSION}.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-${MONGO_MAJOR_VERSION}.0.list     && DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y --no-install-recommends mongodb-org     && apt-get autoremove && apt-get clean
 ---> Using cache
 ---> 531769f592cd
Step 9/21 : ARG OPEN5GS_VERSION=v2.7.0
 ---> Using cache
 ---> 04d57295e460
Step 10/21 : RUN echo $OPEN5GS_VERSION > ./open5gsversion
 ---> Using cache
 ---> 4ab8fd417d4e
Step 11/21 : ARG NUM_CORES=""
 ---> Using cache
 ---> 78cd7d636bc6
Step 12/21 : RUN if [ -z "$NUM_CORES" ]; then NUM_CORES=$(nproc); fi &&     git clone --depth 1 --branch $(cat ./open5gsversion) https://github.com/open5gs/open5gs open5gs    && cd open5gs     && meson build --prefix=`pwd`/install     && ninja -j ${NUM_CORES} -C build     && cd build     && ninja install
 ---> Using cache
 ---> ac62d7723853
Step 13/21 : ARG NODE_MAJOR=20
 ---> Using cache
 ---> 120a3f2ca048
Step 14/21 : RUN curl -fsSL https://deb.nodesource.com/setup_${NODE_MAJOR}.x | bash -     && apt-get install -y nodejs     && cd open5gs/webui     && npm ci --no-optional
 ---> Using cache
 ---> dfe966567f6d
Step 15/21 : RUN python3 -m pip install pymongo click pyroute2 ipaddress python-iptables
 ---> Using cache
 ---> 1b8d1b13a921
Step 16/21 : FROM base AS open5gs
 ---> 1b8d1b13a921
Step 17/21 : WORKDIR /open5gs
 ---> Using cache
 ---> 3da36cfaa772
Step 18/21 : COPY open5gs-5gc.yml open5gs-5gc.yml.in
 ---> Using cache
 ---> 2002e5e264d1
Step 19/21 : COPY open5gs_entrypoint.sh add_users.py setup_tun.py subscriber_db.cs[v] ./
 ---> Using cache
 ---> 68e6113744e8
Step 20/21 : ENV PATH="${PATH}:/open5gs/build/tests/app/"
 ---> Using cache
 ---> dd9b4bad0217
Step 21/21 : ENTRYPOINT [ "./open5gs_entrypoint.sh" ]
 ---> Using cache
 ---> c5e6575029cd
Successfully built c5e6575029cd
Successfully tagged srsran_5gc:latest
Building gnb
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
            Install the buildx component to build images with BuildKit:
            https://docs.docker.com/go/buildx/

unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /home/fiveglab/ADWISOR5G/software/docker: no such file or directory
ERROR: Service 'gnb' failed to build : Build failed

Aside from rebuilding, what else could be the cause of this issue?
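
The lstat error above is consistent with how Compose resolves build paths: a relative context is taken from the directory that holds the compose file, and a relative dockerfile is taken from that context. The following Python sketch mimics that resolution. The helper name resolve_dockerfile and all paths in it are hypothetical, since the real location of docker-compose-custom.yml is not shown in this thread.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_dockerfile(compose_file: str, context: str, dockerfile: str) -> str:
    """Return the Dockerfile path implied by a compose 'build' section.

    'context' is resolved against the compose file's directory and
    'dockerfile' against the context. Symlinks are not followed here.
    """
    compose_dir = PurePosixPath(compose_file).parent
    context_dir = posixpath.normpath(str(compose_dir / context))
    return str(PurePosixPath(context_dir) / dockerfile)


if __name__ == "__main__":
    # Hypothetical layout 1: compose file inside the repository's docker/ directory.
    print(resolve_dockerfile(
        "/home/user/srsRAN_Project/docker/docker-compose.yml", "..", "docker/Dockerfile"))
    # Hypothetical layout 2: compose file copied to an unrelated directory.
    print(resolve_dockerfile(
        "/home/user/work/compose/docker-compose-custom.yml", "..", "docker/Dockerfile"))
```

In the second layout the result points at /home/user/work/docker/Dockerfile, which would not exist unless the repository happens to sit there. A custom compose file that keeps context: .. therefore needs to live in the same directory as the original one, or have its context adjusted.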

pgawlowicz commented 4 months ago

simply remove containers and images and rebuild

dominikheinz commented 4 months ago

@pgawlowicz Thanks for the reply. There are multiple problems. Yes, I can remove the containers and images and re-run docker-compose up --build. Just now, this gave me this error:

[ 97%] Linking CXX executable srscu
during GIMPLE pass: cfg
In file included from /src/include/srsran/support/srsran_assert.h:27,
                 from /src/include/srsran/adt/optional.h:25,
                 from /src/include/srsran/mac/bsr_format.h:29,
                 from /src/include/srsran/du/du_cell_config.h:25,
                 from /src/include/srsran/du_high/du_high_configuration.h:4,
                 from /src/include/srsran/du/du_high_wrapper_config.h:25,
                 from /src/include/srsran/du/du_high_wrapper_factory.h:26,
                 from /src/lib/du/du_high_wrapper_factory.cpp:23:
/src/external/fmt/include/fmt/format.h: In member function 'constexpr void fmt::v7::detail::numeric_specs_checker<ErrorHandler>::check_sign() [with ErrorHandler = fmt::v7::detail::dynamic_specs_handler<fmt::v7::basic_format_parse_context<char> >]':
/src/external/fmt/include/fmt/format.h:2151:22: internal compiler error: Segmentation fault
 2151 |   FMT_CONSTEXPR void check_sign() {
      |                      ^~~~~~~~~~
0x1c3925a internal_error(char const*, ...)
    ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <file:///usr/share/doc/gcc-13/README.Bugs> for instructions.
make[2]: *** [lib/du/CMakeFiles/srsran_du_high_wrapper.dir/build.make:76: lib/du/CMakeFiles/srsran_du_high_wrapper.dir/du_high_wrapper_factory.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:4681: lib/du/CMakeFiles/srsran_du_high_wrapper.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 97%] Built target srscu
make: *** [Makefile:146: all] Error 2
The command '/bin/sh -c if [ -z "$NUM_CORES" ]; then NUM_CORES=$(nproc); fi &&     LIB_UPPER=$(echo $LIB | tr '[:lower:]' '[:upper:]') &&     export ${LIB_UPPER}_DIR="/opt/${LIB}/${LIB_VERSION}" &&     /src/docker/scripts/builder.sh     -m -j${NUM_CORES}     -DBUILD_TESTS=False     -DENABLE_${LIB_UPPER}=On     -DCMAKE_CXX_FLAGS="-march=${ARCH}"     ${EXTRA_CMAKE_ARGS} /src' returned a non-zero code: 2
ERROR: Service 'gnb' failed to build : Build failed

Secondly, I still have the problem that even the default docker setup fails to load the configuration files. You can replicate this just by cloning the repo and running docker-compose up. You'll notice that gnb fails to load the config:

srsran_gnb        | /gnb_config.yml was not readable (missing?)
srsran_gnb        | Run with --help for more information.
srsran_gnb        | 
srsran_gnb        | --== srsRAN gNB (commit ) ==--
srsran_gnb        | 
srsran_gnb exited with code 103

Can you provide me some clarifications for the following:

N2WU commented 4 months ago

Cleanly remove all stopped containers, unused images, and the build cache with: docker system prune -a -f

I believe your configuration file is referenced incorrectly. You need to be more explicit. For instance, if you put it in ~/srsRAN_Project/configs/gnb_config.yml, you'd need to reference it from docker/Dockerfile as ../configs/gnb_config.yml.
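
To check this before starting the stack, one can resolve a bind-mount source the way Compose does for relative paths (against the directory of the compose file) and confirm that it is a regular file. The following is a Python sketch with a hypothetical helper, check_bind_source; the paths in the demo are examples only. When the source of a short-syntax bind mount is missing, Docker creates an empty directory in its place, which may be what lies behind a "/gnb_config.yml was not readable" message.

```python
from __future__ import annotations

from pathlib import Path


def check_bind_source(compose_file: str, volume_spec: str) -> tuple[Path, bool]:
    """Resolve the host side of a 'src:dst[:mode]' entry; report if it is a file."""
    source = volume_spec.split(":", 1)[0]
    if not source.startswith((".", "/")):
        # Entries such as 'gnb-storage:/tmp' are named volumes, not host paths.
        raise ValueError(f"{source!r} looks like a named volume, not a host path")
    host_path = Path(source)
    if not host_path.is_absolute():
        # Relative sources are taken from the compose file's directory.
        host_path = Path(compose_file).resolve().parent / host_path
    host_path = host_path.resolve()
    return host_path, host_path.is_file()


if __name__ == "__main__":
    # Hypothetical example: run from the repository root.
    path, ok = check_bind_source(
        "docker/docker-compose-custom.yml", "./gnb_custom.yml:/gnb_config.yml:ro")
    print(f"{path}: {'ok' if ok else 'missing or not a regular file'}")
```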

Those seg fault errors are weird, I'd just use git pull to redownload srsRAN.

dominikheinz commented 4 months ago

@N2WU Yea, seems like removing all containers and rebuilding did the trick. And you are correct, it appears that it was an incorrect relative path problem.

The segfaults still happen every now and then when I rebuild, not entirely sure why. Usually, if it segfaults during build, just starting the rebuild again works... no idea why, but it does.
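
Since a plain re-run usually succeeds, the retry can be automated. The sketch below is a generic Python wrapper around subprocess.run; the helper name run_with_retries and the default of three attempts are arbitrary choices for illustration. Intermittent internal compiler errors are often a sign of memory pressure during highly parallel builds. The build command quoted earlier falls back to $(nproc) jobs when NUM_CORES is empty, so lowering NUM_CORES (if the Dockerfile exposes it as a build argument) may be worth trying as well. That cause is an assumption and is not confirmed anywhere in this thread.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay_s=0.0):
    """Run cmd until it exits with 0 or attempts are used up.

    Returns the exit code of the last run.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(f"attempt {attempt}/{attempts} failed with exit code {returncode}",
              file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay_s)
    return returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example (script name is hypothetical):
    #   python retry_build.py docker-compose -f docker-compose-custom.yml build gnb
    sys.exit(run_with_retries(sys.argv[1:]))
```

Retrying only hides the symptom; if the failures become frequent, checking available memory on the build host is probably the better next step.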