srsran / srsRAN_Project

Open source O-RAN 5G CU/DU solution from Software Radio Systems (SRS) https://docs.srsran.com/projects/project
https://www.srsran.com
GNU Affero General Public License v3.0

Unable to detect SIBs using Foxconn RPQN 7801E RU #192

Closed lhms1234 closed 9 months ago

lhms1234 commented 1 year ago

Issue Description

I'm testing srsRAN's split 7.2 feature with a Foxconn RPQN 7801E RU. Following the tutorial, I was able to bring up the setup and successfully synchronize the RU and the CU/DU. However, I couldn't detect any SIB message using an Amarisoft UE. I suspect a configuration issue and would appreciate any help on this point.

Setup Details

Expected Behavior

The Amarisoft UE can detect and connect to the network and run some traffic tests (for now, just a simple ping).

Actual Behaviour

The Amarisoft UE wasn't able to detect the network when the RU was synchronized and the srsRAN CU/DU was running.

ptp4l output:

ptp4l[709810.897]: selected /dev/ptp1 as PTP clock
ptp4l[709810.941]: port 1 (enp1s0f1): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[709810.941]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[709810.941]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[709811.071]: port 1 (enp1s0f1): new foreign master 000580.fffe.082b89-13
ptp4l[709811.337]: selected best master clock 000580.fffe.082b89
ptp4l[709811.337]: port 1 (enp1s0f1): LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[709812.488]: port 1 (enp1s0f1): minimum delay request interval 2^-4
ptp4l[709812.813]: port 1 (enp1s0f1): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[709813.586]: rms 205104 max 410213 freq -27804 +/-  18 delay   308 +/-   1
ptp4l[709814.711]: rms   11 max   20 freq -27791 +/-   6 delay   309 +/-   1
ptp4l[709815.836]: rms    5 max    9 freq -27793 +/-   6 delay   309 +/-   0
ptp4l[709816.961]: rms    4 max    8 freq -27792 +/-   6 delay   309 +/-   0
ptp4l[709818.086]: rms    7 max   24 freq -27795 +/-  12 delay   309 +/-   1
ptp4l[709819.211]: rms    5 max    8 freq -27794 +/-   8 delay   308 +/-   1
ptp4l[709820.336]: rms    5 max   15 freq -27797 +/-   9 delay   308 +/-   1
ptp4l[709821.461]: rms    3 max    8 freq -27793 +/-   6 delay   308 +/-   0
ptp4l[709822.586]: rms    4 max   11 freq -27793 +/-   7 delay   308 +/-   1

phc2sys output:

phc2sys[709813.024]: CLOCK_REALTIME phc offset   -513281 s0 freq   -3295 delay    519
phc2sys[709813.149]: CLOCK_REALTIME phc offset   -513295 s1 freq   -3406 delay    517
phc2sys[709813.275]: CLOCK_REALTIME phc offset       -17 s2 freq   -3423 delay    516
phc2sys[709813.400]: CLOCK_REALTIME phc offset       -24 s2 freq   -3435 delay    514
phc2sys[709813.525]: CLOCK_REALTIME phc offset       -35 s2 freq   -3454 delay    523
phc2sys[709813.650]: CLOCK_REALTIME phc offset       -50 s2 freq   -3479 delay    525
phc2sys[709813.776]: CLOCK_REALTIME phc offset       -42 s2 freq   -3486 delay    514
phc2sys[709813.901]: CLOCK_REALTIME phc offset       -45 s2 freq   -3502 delay    507
phc2sys[709814.026]: CLOCK_REALTIME phc offset       -39 s2 freq   -3509 delay    506
phc2sys[709814.151]: CLOCK_REALTIME phc offset       -34 s2 freq   -3516 delay    500
phc2sys[709814.277]: CLOCK_REALTIME phc offset       -36 s2 freq   -3528 delay    510
phc2sys[709814.402]: CLOCK_REALTIME phc offset       -30 s2 freq   -3533 delay    504
phc2sys[709814.527]: CLOCK_REALTIME phc offset       -24 s2 freq   -3536 delay    504
phc2sys[709814.652]: CLOCK_REALTIME phc offset       -15 s2 freq   -3534 delay    507
phc2sys[709814.778]: CLOCK_REALTIME phc offset       -13 s2 freq   -3537 delay    509
phc2sys[709814.903]: CLOCK_REALTIME phc offset       -12 s2 freq   -3540 delay    506
phc2sys[709815.028]: CLOCK_REALTIME phc offset        -7 s2 freq   -3538 delay    505
phc2sys[709815.153]: CLOCK_REALTIME phc offset         1 s2 freq   -3532 delay    501
phc2sys[709815.279]: CLOCK_REALTIME phc offset         4 s2 freq   -3529 delay    505
phc2sys[709815.404]: CLOCK_REALTIME phc offset         8 s2 freq   -3524 delay    505
phc2sys[709815.529]: CLOCK_REALTIME phc offset        13 s2 freq   -3516 delay    510
phc2sys[709815.654]: CLOCK_REALTIME phc offset        14 s2 freq   -3511 delay    502
phc2sys[709815.779]: CLOCK_REALTIME phc offset        16 s2 freq   -3505 delay    498
phc2sys[709815.905]: CLOCK_REALTIME phc offset        18 s2 freq   -3498 delay    513
phc2sys[709816.030]: CLOCK_REALTIME phc offset        18 s2 freq   -3493 delay    514
phc2sys[709816.155]: CLOCK_REALTIME phc offset        18 s2 freq   -3488 delay    511
phc2sys[709816.280]: CLOCK_REALTIME phc offset        16 s2 freq   -3484 delay    515
phc2sys[709816.405]: CLOCK_REALTIME phc offset        16 s2 freq   -3479 delay    514
phc2sys[709816.530]: CLOCK_REALTIME phc offset         6 s2 freq   -3485 delay    519
phc2sys[709816.655]: CLOCK_REALTIME phc offset         6 s2 freq   -3483 delay    514
phc2sys[709816.780]: CLOCK_REALTIME phc offset         3 s2 freq   -3484 delay    511
phc2sys[709816.905]: CLOCK_REALTIME phc offset         3 s2 freq   -3483 delay    514
phc2sys[709817.031]: CLOCK_REALTIME phc offset         7 s2 freq   -3478 delay    517
phc2sys[709817.156]: CLOCK_REALTIME phc offset        -2 s2 freq   -3485 delay    517
phc2sys[709817.281]: CLOCK_REALTIME phc offset        -5 s2 freq   -3489 delay    517
phc2sys[709817.406]: CLOCK_REALTIME phc offset       -11 s2 freq   -3496 delay    513
phc2sys[709817.531]: CLOCK_REALTIME phc offset       -13 s2 freq   -3502 delay    509
phc2sys[709817.656]: CLOCK_REALTIME phc offset        -9 s2 freq   -3501 delay    514
phc2sys[709817.781]: CLOCK_REALTIME phc offset        -9 s2 freq   -3504 delay    516
phc2sys[709817.906]: CLOCK_REALTIME phc offset        -1 s2 freq   -3499 delay    514
phc2sys[709818.031]: CLOCK_REALTIME phc offset        -5 s2 freq   -3503 delay    522
phc2sys[709818.156]: CLOCK_REALTIME phc offset        -8 s2 freq   -3508 delay    515
phc2sys[709818.282]: CLOCK_REALTIME phc offset        -4 s2 freq   -3506 delay    514
phc2sys[709818.407]: CLOCK_REALTIME phc offset        -7 s2 freq   -3510 delay    517
phc2sys[709818.532]: CLOCK_REALTIME phc offset        -8 s2 freq   -3513 delay    520
phc2sys[709818.657]: CLOCK_REALTIME phc offset        -2 s2 freq   -3510 delay    505
phc2sys[709818.782]: CLOCK_REALTIME phc offset         1 s2 freq   -3507 delay    518
phc2sys[709818.907]: CLOCK_REALTIME phc offset         5 s2 freq   -3503 delay    513
phc2sys[709819.032]: CLOCK_REALTIME phc offset        -3 s2 freq   -3510 delay    517
phc2sys[709819.157]: CLOCK_REALTIME phc offset        -1 s2 freq   -3508 delay    515
phc2sys[709819.282]: CLOCK_REALTIME phc offset         3 s2 freq   -3505 delay    513
phc2sys[709819.408]: CLOCK_REALTIME phc offset         3 s2 freq   -3504 delay    515
phc2sys[709819.533]: CLOCK_REALTIME phc offset         5 s2 freq   -3501 delay    512
phc2sys[709819.658]: CLOCK_REALTIME phc offset         1 s2 freq   -3503 delay    520
phc2sys[709819.783]: CLOCK_REALTIME phc offset         3 s2 freq   -3501 delay    519
phc2sys[709819.908]: CLOCK_REALTIME phc offset        -1 s2 freq   -3504 delay    519
phc2sys[709820.033]: CLOCK_REALTIME phc offset        -3 s2 freq   -3507 delay    520
phc2sys[709820.158]: CLOCK_REALTIME phc offset        -4 s2 freq   -3508 delay    527
phc2sys[709820.283]: CLOCK_REALTIME phc offset        -6 s2 freq   -3512 delay    517
phc2sys[709820.408]: CLOCK_REALTIME phc offset        -2 s2 freq   -3509 delay    515
phc2sys[709820.534]: CLOCK_REALTIME phc offset         1 s2 freq   -3507 delay    520
phc2sys[709820.659]: CLOCK_REALTIME phc offset        -1 s2 freq   -3509 delay    520
phc2sys[709820.784]: CLOCK_REALTIME phc offset        -4 s2 freq   -3512 delay    519
phc2sys[709820.909]: CLOCK_REALTIME phc offset        -3 s2 freq   -3512 delay    520
phc2sys[709821.034]: CLOCK_REALTIME phc offset         1 s2 freq   -3509 delay    516
phc2sys[709821.159]: CLOCK_REALTIME phc offset         3 s2 freq   -3507 delay    519
phc2sys[709821.284]: CLOCK_REALTIME phc offset         5 s2 freq   -3504 delay    512
phc2sys[709821.409]: CLOCK_REALTIME phc offset         6 s2 freq   -3501 delay    522
phc2sys[709821.534]: CLOCK_REALTIME phc offset         3 s2 freq   -3503 delay    519
phc2sys[709821.660]: CLOCK_REALTIME phc offset         3 s2 freq   -3502 delay    520
phc2sys[709821.785]: CLOCK_REALTIME phc offset         6 s2 freq   -3498 delay    515
phc2sys[709821.910]: CLOCK_REALTIME phc offset         6 s2 freq   -3496 delay    515
phc2sys[709822.035]: CLOCK_REALTIME phc offset        -3 s2 freq   -3503 delay    527
phc2sys[709822.160]: CLOCK_REALTIME phc offset         1 s2 freq   -3500 delay    513
phc2sys[709822.285]: CLOCK_REALTIME phc offset        -1 s2 freq   -3502 delay    521
^Cphc2sys[709822.403]: CLOCK_REALTIME phc offset         3 s2 freq   -3498 delay    515
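
A quick way to confirm from a log like the one above that the system clock has converged is to look only at samples taken once the servo reports state s2 (locked). The following is a minimal Python sketch; the function names and the 100 ns threshold are illustrative assumptions, not a linuxptp or srsRAN requirement.

```python
import re

# Matches phc2sys lines such as:
# phc2sys[709813.275]: CLOCK_REALTIME phc offset       -17 s2 freq   -3423 delay    516
LINE_RE = re.compile(r"phc offset\s+(-?\d+)\s+(s\d)\s+freq\s+([+-]?\d+)\s+delay\s+(\d+)")

def locked_offsets(log_text):
    """Return the offsets (ns) reported while the servo is in state s2 (locked)."""
    offsets = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m and m.group(2) == "s2":
            offsets.append(int(m.group(1)))
    return offsets

def is_converged(offsets, threshold_ns=100):
    """Hypothetical check: every locked sample stays within +/- threshold_ns."""
    return bool(offsets) and max(abs(o) for o in offsets) <= threshold_ns

# A few lines copied from the phc2sys output above.
sample = """\
phc2sys[709813.024]: CLOCK_REALTIME phc offset   -513281 s0 freq   -3295 delay    519
phc2sys[709813.149]: CLOCK_REALTIME phc offset   -513295 s1 freq   -3406 delay    517
phc2sys[709813.275]: CLOCK_REALTIME phc offset       -17 s2 freq   -3423 delay    516
phc2sys[709813.650]: CLOCK_REALTIME phc offset       -50 s2 freq   -3479 delay    525
"""
offsets = locked_offsets(sample)
print(offsets, is_converged(offsets))
```

The s0 and s1 samples are skipped on purpose: they are taken before the servo locks, so the large initial offset does not count against convergence.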

srsRAN output:

Could not set the affinity for the ru_rx_0 worker

--== srsRAN gNB (commit 1afd7240f) ==--

Connecting to AMF on 10.20.1.20:38412
Initializing Open Fronthaul Interface sector=0: ul_comp=[BFP,9], dl_comp=[BFP,9], prach_comp=[BFP,9] prach_cp_enabled=true, downlink_broadcast=false.
Cell pci=12, bw=100 MHz, dl_arfcn=649980 (n78), dl_freq=3749.7 MHz, dl_ssb_arfcn=647232, ul_freq=3749.7 MHz

==== gNodeB started ===
Type <t> to view trace
^CStopping ..
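
The dl_freq printed above follows from dl_arfcn through the NR-ARFCN mapping in 3GPP TS 38.104 (Table 5.4.2.1-1). A small Python sketch of that conversion, useful for checking that the gNB and the RU are configured for the same carrier (the function name is illustrative):

```python
def nr_arfcn_to_mhz(arfcn):
    """Convert an NR-ARFCN to its RF reference frequency in MHz."""
    # (first ARFCN, last ARFCN, start frequency in kHz, raster step in kHz)
    ranges = [
        (0, 599999, 0, 5),
        (600000, 2016666, 3_000_000, 15),
        (2016667, 3279165, 24_250_080, 60),
    ]
    for first, last, f_start_khz, step_khz in ranges:
        if first <= arfcn <= last:
            return (f_start_khz + step_khz * (arfcn - first)) / 1000.0
    raise ValueError(f"NR-ARFCN out of range: {arfcn}")

# dl_arfcn from the gNB output above; expected to match dl_freq=3749.7 MHz.
print(nr_arfcn_to_mhz(649980))
```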

srsRAN log (snippet repeated several times):

2023-08-29T16:35:09.569514 [OFH     ] [W] Detected late downlink request in slot=277.0_0, current ota_slot=276.18_13, processing time takes symbols=17
2023-08-29T16:35:09.569515 [OFH     ] [W] Dropping downlink resource grid at slot=277.0 and sector=0 as it arrived late

Steps to reproduce the problem

  1. Prepare the setup as described in the tutorial, but with a Foxconn RPQN 7801E RU instead of a Benetel RAN550;
  2. Run ptp4l (sudo ./ptp4l -2 -i enp1s0f1 -f ./default.cfg -m);
  3. Run phc2sys (sudo ./phc2sys -s enp1s0f1 -w -m -R 8 -f ./default.cfg);
  4. Run srsRAN (sudo gnb -c gnb_ru_rpqn4800e_tdd_n78_20mhz.yml);
  5. Run Amarisoft UE.

Additional Information

Here are the configuration files used: RRHconfig_xran.xml (RU configuration), default.cfg (linuxptp configuration), and gnb_ru_rpqn4800e_tdd_n78_20mhz.yml (srsRAN configuration).

andrepuschmann commented 1 year ago

Hey @lhms1234 - could you verify the RU is actually emitting a signal? If so, does the spectrum look fine?

lhms1234 commented 1 year ago

Hi @andrepuschmann. Sure. I checked yesterday, and it isn't emitting any signal.

andrepuschmann commented 1 year ago

Looking at the config again, I suggest you set only one UL antenna and 4 DL antennas, i.e.

  nof_antennas_dl: 4
  nof_antennas_ul: 1

Also, maybe try playing with the iq_scaling param and set it to 0.6. Those are just guesses; I have to assume that your cabling, etc. is correct and that the config values match between the RU and the DU. Please double-check this as well. Can you provide a PCAP of the OFH traffic? Has this HW setup been tested before with another DU and confirmed to be working fine?

lhms1234 commented 1 year ago

About the antenna configuration: I configured it this way because of two fields in the RRHconfig_xran.xml file (the RU's configuration file), "RRH_EN_EAXC_ID" and "RRH_TRX_EN_BIT_MASK". They are related as follows:

<!-- RRH_EN_EAXC_ID: Enable using eAxC ID field defined in O-RAN spec.                            -->
<!--                 When 0 is set, RU port ID=0,1,2,3 are used for PDSCH/PUSCH if RRH_TRX_EN_BIT_MASK = 0x0F -->
<!--                 When 0 is set, RU port ID=4,5,6,7 are used for PRACH       if RRH_TRX_EN_BIT_MASK = 0x0F -->
<!--                 When 0 is set, RU port ID=0,1     are used for PDSCH/PUSCH if RRH_TRX_EN_BIT_MASK = 0x03 -->
<!--                 When 0 is set, RU port ID=2,3     are used for PRACH       if RRH_TRX_EN_BIT_MASK = 0x03 -->
RRH_EN_EAXC_ID = 0
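
The rule in the XML comments above can be sketched in Python to derive the port lists that the srsRAN config has to match. This generalises the two documented cases (0x0F and 0x03) to "N enabled chains: data on ports 0..N-1, PRACH on ports N..2N-1"; that generalisation and the function name are assumptions, not something the RU documentation quoted here states.

```python
def ru_port_ids(trx_en_bit_mask):
    """Return (data_ports, prach_ports) for RRH_EN_EAXC_ID = 0.

    Assumption: the port numbering depends only on how many TRX chains
    are enabled in RRH_TRX_EN_BIT_MASK, as in the two documented cases.
    """
    n = bin(trx_en_bit_mask).count("1")  # number of enabled TRX chains
    data_ports = list(range(n))          # PDSCH/PUSCH ports
    prach_ports = list(range(n, 2 * n))  # PRACH ports
    return data_ports, prach_ports

print(ru_port_ids(0x0F))  # ([0, 1, 2, 3], [4, 5, 6, 7])
print(ru_port_ids(0x03))  # ([0, 1], [2, 3])
```

On the srsRAN side these lists correspond to dl_port_id / ul_port_id and prach_port_id in the ru_ofh cell entry.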

Thanks for the tip, here is the pcap. Only the Core and the PTP Grandmaster were successfully tested before with another RAN stack. The machine that hosts the srsRAN CU/DU hasn't been tested this way before (with the PTP Grandmaster and the RU). I changed iq_scaling many times, but nothing changed. I noticed in Wireshark that all the U-Plane packets are malformed.

lhms1234 commented 1 year ago

Hi @andrepuschmann. After some tests, we discovered that the network card in the machine described above apparently doesn't support PTP. But the problem occurs even before that. When testing srsRAN on another setup (with another RU of the same model) that was fully tested and works well with another RAN stack, the same error occurred: no signal was emitted. When the other stack runs without PTP, at least the SIBs are detected. Could you please provide the RU configuration file (RRHconfig_xran.xml) used in your integration tests with the rpqn4800e RU?

ismagom commented 1 year ago

Hi @lhms1234, I think you should use different window parameters and iq_scaling. Can you try this config?

ru_ofh:
  ru_bandwidth_MHz: 100
  t1a_max_cp_dl: 470
  t1a_min_cp_dl: 258
  t1a_max_cp_ul: 429
  t1a_min_cp_ul: 285
  t1a_max_up: 196
  t1a_min_up: 50
  is_prach_cp_enabled: true
  is_dl_broadcast_enabled: false
  ignore_ecpri_payload_size: true
  compr_method_ul: bfp
  compr_bitwidth_ul: 9
  compr_method_dl: bfp
  compr_bitwidth_dl: 9
  compr_method_prach: bfp
  compr_bitwidth_prach: 9
  enable_ul_static_compr_hdr: false
  enable_dl_static_compr_hdr: false
  iq_scaling: 1.6
  enable_dl_parallelization: true
  cells:
    - network_interface: eno8603np3
      ru_mac_addr: 6c:ad:ad:00:08:c4
      du_mac_addr: b4:45:06:ec:7f:b2
      vlan_tag: 2
      prach_port_id: [4, 5, 6, 7]
      dl_port_id: [0, 1, 2, 3]
      ul_port_id: [0, 1, 2, 3]

cell_cfg:
  dl_arfcn: 640000                                              # ARFCN of the downlink carrier (center frequency).
  band: 78                                                      # The NR band.
  channel_bandwidth_MHz: 100                                     # Bandwidth in MHz. Number of PRBs will be automatically derived.
  common_scs: 30                                                # Subcarrier spacing in kHz used for data.
  plmn: "00101"                                                 # PLMN broadcasted by the gNB.
  tac: 7                                                        # Tracking area code (needs to match the core configuration).
  pci: 1
  nof_antennas_dl: 4
  nof_antennas_ul: 1
  prach:
    prach_config_index: 159
    prach_root_sequence_index: 1
    zero_correlation_zone: 0
    prach_frequency_start: 12
  pdsch:
    mcs_table: qam256
  tdd_ul_dl_cfg:
    dl_ul_tx_period: 10
    nof_dl_slots: 7
    nof_dl_symbols: 6
    nof_ul_slots: 2
    nof_ul_symbols: 0

expert_phy:
  nof_ul_threads: 4
  nof_pdsch_threads: 4
  nof_dl_threads: 4
  max_proc_delay: 4
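
As a sanity check on the tdd_ul_dl_cfg values above, the pattern can be verified to fit its period. A minimal Python sketch, assuming dl_ul_tx_period is counted in slots and that the partial DL and UL symbols share a single special slot; the helper names are illustrative.

```python
SYMBOLS_PER_SLOT = 14  # normal cyclic prefix

def check_tdd_pattern(period_slots, dl_slots, dl_symbols, ul_slots, ul_symbols):
    """Return (special_slots, guard_symbols) or raise if the pattern does not fit."""
    partial = dl_symbols + ul_symbols
    if partial > SYMBOLS_PER_SLOT:
        raise ValueError("DL + UL symbols exceed one special slot")
    special_slots = period_slots - dl_slots - ul_slots
    if special_slots < 0 or (partial > 0 and special_slots < 1):
        raise ValueError("full slots do not leave room for the special slot")
    guard_symbols = special_slots * SYMBOLS_PER_SLOT - partial
    return special_slots, guard_symbols

def period_ms(period_slots, scs_khz):
    """Pattern duration in ms: one slot lasts 15 / scs_khz ms."""
    return period_slots * 15 / scs_khz

# Values from the config above: 7 DL slots + 6 DL symbols, 2 UL slots, 30 kHz SCS.
print(check_tdd_pattern(10, 7, 6, 2, 0))  # (1, 8)
print(period_ms(10, 30))                  # 5.0
```

With these values the period holds 7 DL slots, one special slot with 6 DL symbols and 8 unused (guard) symbols, and 2 UL slots, repeating every 5 ms.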

lhms1234 commented 1 year ago

Hi @ismagom,

Sorry about the delay in replying. I was testing your configuration and found many synchronization problems on my setup. Now I'm able to connect to the network and generate traffic with a COTS UE (a Samsung Galaxy Tab S8 Plus tablet), but I've noticed three problems:

ismagom commented 1 year ago

Thanks for the feedback @lhms1234, could you attach logs please?

lhms1234 commented 1 year ago

Sure, here you are:

ismagom commented 1 year ago

Can you try the following:

1) Add these options to the config file:

expert_phy:
  nof_ul_threads: 3
  nof_pdsch_threads: 4
  nof_dl_threads: 2
  max_proc_delay: 5

2) Generate DL UDP traffic from the machine running the core (instead of TCP)

and send again gnb.log and gnb_terminal.log please. Thanks

lhms1234 commented 1 year ago

ismagom commented 1 year ago

You still have many RT faults. It's not easy to tune a system to do 100 MHz 4x4 MIMO. Maybe you can start with lower BW and number of antennas?

lhms1234 commented 11 months ago

Hi @ismagom. Sorry for the late response. After many tests using an 80 MHz BW with MIMO 2x1, I've reached a throughput of ~200 Mbps DL and 25 Mbps UL. In my tests with MIMO 4x4, the throughput was considerably lower (both DL and UL), while MIMO 2x1 was more stable. In my tests with 100 MHz BW (including with MIMO 2x1), the DL throughput was close to 25 Mbps and the UL throughput was 1-5 Mbps. I'm attaching the logs and the configuration file of a test with MIMO 2x1 and 80 MHz BW here. I would really appreciate it if you could guide me in optimizing the configuration (on the srsRAN side) to obtain better performance using MIMO 4x4 or, at least, a 100 MHz BW.

ismagom commented 11 months ago

Hi @lhms1234, can you check whether this branch works better? For this to work you need to comment out the entire expert_phy section. I would also recommend commenting out the min_ue_mcs parameters for pdsch and pusch.

ismagom commented 10 months ago

Hi @lhms1234, can you give us an update on the status of this issue, please?

lhms1234 commented 10 months ago

Hi @ismagom. Of course. Unfortunately, I've made no more progress on my tests because of some setup-sharing issues. Once the NIC that we bought arrives, I'll prepare a new setup and resume the tests.

andrepuschmann commented 10 months ago

Hey @lhms1234 - could you perhaps share the FW version running on your RPQN-7801E? We are currently trying an RPQN-7801I and are experiencing issues with unusually high BLER as well. With DPD enabled in the FW it doesn't work at all at bandwidths larger than 20 MHz.

lhms1234 commented 10 months ago

Hi @andrepuschmann. Sure, here you are:

root@arria10:~/test# cat /home/root/test/version.txt
b_branch: master
b_commit: ffa6dd30bdfd581dab5b3ac31e1933533d22bf50
s_commit: aa0fbb0efca175b6d281398d9bd0f03423bfb693
tag: v3.1.12q.551
build_time: 202304071650

andrepuschmann commented 10 months ago

Hey @lhms1234 - your tests with 80 MHz 2x1 actually look better than our tests. But there is definitely an RF issue as well, since the 20% BLER isn't normal. At least with our RPQN-4800 we see very little BLER, unless we are far away of course or have other impairments. Have you changed anything in the RU config? Is it still the same as the one in the initial post? Specifically, do you still have RRH_RF_GENERAL_CTRL = 0x3, 0x1, 0x0, 0x0 in your config?

When you test with lower bandwidth, do you change the RRH_MAX_PRB = 273 setting? Or do you just leave it as is and start with e.g. 80 MHz in the srsRAN config?

lhms1234 commented 10 months ago

Hi @andrepuschmann. I made some changes, but not to RRH_RF_GENERAL_CTRL. RRH_MAX_PRB is one of the parameters I've changed, so that it matches the value configured in the srsRAN config file. I'm attaching the current configuration of the RU and srsRAN for MIMO 2x1 with 80 MHz BW.
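
For reference, the PRB count that goes with a given channel bandwidth at 30 kHz SCS can be looked up from 3GPP TS 38.101-1 (Table 5.3.2-1). The Python sketch below covers only a subset of bandwidths. Whether the RU expects exactly this value in RRH_MAX_PRB is an assumption to verify against the Foxconn documentation; the value 273 quoted above is consistent with 100 MHz.

```python
# Maximum transmission bandwidth N_RB for FR1 at 30 kHz SCS
# (subset of 3GPP TS 38.101-1, Table 5.3.2-1). Keys are channel bandwidths in MHz.
N_RB_SCS30 = {10: 24, 20: 51, 40: 106, 50: 133, 60: 162, 80: 217, 100: 273}

def max_prb(bandwidth_mhz):
    """Return the number of PRBs for a channel bandwidth at 30 kHz SCS."""
    try:
        return N_RB_SCS30[bandwidth_mhz]
    except KeyError:
        raise ValueError(f"bandwidth not in this table subset: {bandwidth_mhz} MHz")

print(max_prb(100))  # 273
print(max_prb(80))   # 217
```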

ismagom commented 10 months ago

Hi @lhms1234,

Can you send gnb.log and the console trace again in the MIMO 2x1 configuration?

Also, make sure you check out the latest code. From your config above it looks like you are using a version older than 23.10.

I would also suggest not using min_ue_mcs or max_ue_mcs.

Thanks

lhms1234 commented 10 months ago

Hi @ismagom.

Maybe I made a mistake; it's been some time since my last test, as I mentioned previously. The NIC we bought arrived yesterday, so I'm going to resume testing with the newest version of srsRAN as soon as possible.

ismagom commented 9 months ago

Is there any update on this issue?

lhms1234 commented 9 months ago

Hello, @ismagom. Yes, I have some good news. Using a newer repository version on the new setup made it possible to achieve better throughputs. I've noticed the following mean and peak values:

The commit I compiled is "55c984b55736d0dd2d2ee328f1ae8d9de97e3e19", with these configurations of RU and gNB. The setup (considering the srsRAN version) proved to be much more stable and performant with MIMO 4x4, SCS 30 kHz and 100 MHz BW on the n78 band (using the Foxconn RPQN 7801E).

ismagom commented 9 months ago

Thanks for the report. There is certainly room to improve the performance; if you are interested in proceeding, we can certainly help. Please create a new Discussion post if that's the case.

Thanks for all the feedback