srsran / srsRAN_4G

Open source SDR 4G software suite from Software Radio Systems (SRS) https://docs.srsran.com/projects/4g
https://www.srsran.com
GNU Affero General Public License v3.0

srsUE can't maintain a connection to cell for more than a few seconds #966

Open smunaut opened 2 years ago

smunaut commented 2 years ago

Issue Description

I run srsUE 22.04.01 (ce8a3cae171f08c9bce83ae3611e56f2d168d073) and I can't get the connection to stay up for more than a few seconds.

Setup Details

The code runs on Ubuntu 20.04 on a Ryzen 3700X. The process is started with nice -20 priority and the CPU frequency governor is set to 'performance'. The radio is a USRP B205mini running the latest 3.15 LTS UHD. I also tried the latest UHD 4.x and a B210, both with the same results, as well as different USB3 cables, different USB ports, and running the B210 on external power.

The remote cell is an RBS6402 sitting right next to the USRP, with an Open5GS stack. The whole network side is running on other machines.
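For anyone reproducing this setup, the governor setting can be double-checked with a small script. This is only a sketch: it assumes the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`), and `read_governors` is a hypothetical helper name, not part of srsRAN.

```python
import glob
import os


def read_governors(base="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU exposing a cpufreq governor.

    `base` is a parameter only so the function can be pointed at another
    directory; the default is the usual Linux sysfs location (assumption).
    """
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", "scaling_governor")
    governors = {}
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # e.g. "cpu0"
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors


# List every core that is not running the 'performance' governor.
not_performance = {cpu: gov for cpu, gov in read_governors().items()
                   if gov != "performance"}
print("Cores not set to 'performance':", not_performance)
```

An empty result means either all cores use 'performance' or the system exposes no cpufreq interface at all, so it is worth printing `read_governors()` itself when in doubt.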

The options I changed compared to the example configs :

I also have a ping continuously running that should keep the channel active.

Expected Behavior

Ideally it should connect to the cell and keep the connection active indefinitely.

Actual Behaviour

See logs below.

It always finds the cell and initially connects very quickly and reliably, but then fails almost as quickly and never recovers.

Steps to reproduce the problem

See above

Additional Information

The log files can be hundreds of megabytes, so I can search for something specific if you point me to what I should be looking for.

Here's a typical output:

Active RF plugins: libsrsran_rf_uhd.so libsrsran_rf_zmq.so
Inactive RF plugins: 
Reading configuration file ue-local.conf...

Built in Release mode using commit ce8a3cae1 on branch master.

Opening 1 channels in RF device=default with args=num_recv_frames=64,num_send_frames=64
Supported RF device list: UHD zmq file
Trying to open RF device 'UHD'
[INFO] [UHD] linux; GNU C++ version 9.4.0; Boost_107100; UHD_3.15.0.0-74-ge35f66e8
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=1, args: num_recv_frames=64,num_send_frames=64,type=b200,master_clock_rate=23.04e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [B200] Detected Device: B205mini
[INFO] [B200] Operating over USB 3.
[INFO] [B200] Initialize CODEC control...
[INFO] [B200] Initialize Radio control...
[INFO] [B200] Performing register loopback test... 
[INFO] [B200] Register loopback test passed
[INFO] [B200] Asking for clock rate 23.040000 MHz... 
[INFO] [B200] Actually got clock rate 23.040000 MHz.
RF device 'UHD' successfully opened
Waiting PHY to initialize ... done!
Attaching UE...
.
Found Cell:  Mode=FDD, PCI=480, PRB=50, Ports=2, CP=Normal, CFO=3.1 KHz
Found PLMN:  Id=90170, TAC=2
Random Access Transmission: seq=32, tti=5934, ra-rnti=0x5
RRC Connected
Random Access Complete.     c-rnti=0x8e9, ta=1
Network attach successful. IP: 10.45.0.52
 nTp) 19/8/2022 12:15:22 TZ:128
[INFO] [UHD RF] Tx while waiting for EOB, timed out... 13.7712 >= 0. Starting new burst...
RF status: O=1, U=1, L=78
[INFO] [UHD RF] Tx while waiting for EOB, timed out... 17.8792 >= 15.7803. Starting new burst...
RF status: O=1, U=1, L=78
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
RF status: O=1, U=1, L=71
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding
RF status: O=1, U=2, L=83
Scheduling request failed: releasing RRC connection...
Random Access Transmission: seq=21, tti=3834, ra-rnti=0x5
Random Access Transmission: seq=20, tti=3854, ra-rnti=0x5
Random Access Transmission: seq=7, tti=3874, ra-rnti=0x5
Random Access Transmission: seq=10, tti=3894, ra-rnti=0x5
Random Access Transmission: seq=6, tti=3914, ra-rnti=0x5
Random Access Transmission: seq=29, tti=3934, ra-rnti=0x5
Random Access Transmission: seq=28, tti=3954, ra-rnti=0x5
Random Access Transmission: seq=9, tti=3974, ra-rnti=0x5
Random Access Transmission: seq=34, tti=3994, ra-rnti=0x5
Random Access Transmission: seq=10, tti=4014, ra-rnti=0x5
Warning: Detected Radio-Link Failure
.
Found Cell:  Mode=FDD, PCI=480, PRB=50, Ports=2, CP=Normal, CFO=0.1 KHz
Timer T311 expired: Going to RRC IDLE
RRC IDLE
RF status: O=2, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=2, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0
RF status: O=1, U=0, L=0

andrepuschmann commented 2 years ago

Hey @smunaut,

Thanks for the report. Do you mind sharing the full UE logs as well, and maybe also the PCAP if you have one? You definitely have lates there, so this might explain the disconnect. In theory it shouldn't do any harm, but also give it a try without any logs (or with the log level set to warning only) and without PCAP.
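To put a number on the lates, the `RF status` lines in the console output can be tallied. The sketch below is illustrative only: `summarize_rf_status` is a hypothetical helper, and it assumes each `RF status` line reports overflows (O), underflows (U) and lates (L) counted since the previous status line rather than a running total.

```python
import re

# Matches console lines such as "RF status: O=1, U=2, L=83".
RF_STATUS = re.compile(r"RF status: O=(\d+), U=(\d+), L=(\d+)")


def summarize_rf_status(lines):
    """Sum the O/U/L counters over all 'RF status' lines found in `lines`."""
    totals = {"O": 0, "U": 0, "L": 0}
    for line in lines:
        match = RF_STATUS.search(line)
        if match:
            for key, value in zip("OUL", match.groups()):
                totals[key] += int(value)
    return totals


# The four status lines printed while the UE was RRC connected in the report.
sample = [
    "RF status: O=1, U=1, L=78",
    "RF status: O=1, U=1, L=78",
    "RF status: O=1, U=1, L=71",
    "RF status: O=1, U=2, L=83",
]
print(summarize_rf_status(sample))
```

Under that assumption the connected phase of the reported run adds up to 310 lates, while the later idle-phase lines show overflows only.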

/home/tnt/data/lte/srsLTE/lib/src/phy/mimo/precoding.c:1221: Error predecoding CCD: Invalid combination of ports 2 and rx antennax 1
/home/tnt/data/lte/srsLTE/lib/src/phy/phch/pdsch.c:880: Error predecoding

This is probably because the eNB tries to use a higher transmission mode (TM) that requires more Rx antennas. Note that the B205mini only has one, while the LTE standard actually requires two. Have a look; maybe you can select the TM in the cell's configuration?

smunaut commented 1 year ago

Hi @andrepuschmann, I have just re-run a bunch of tests with release_22_10.

With the B205mini, it now seems to work fairly reliably. The connection works the vast majority of the time, and I was able to run iperf and ping for more than 10 minutes. So I'm pretty happy with the performance there.

With the B210 however, I can't get those results.

The config is exactly the same as with the B205mini (I just unplug one and plug in the other). I also tried several USB ports (on different controllers), several USB cables, different device_args, and the B210 on external power or on USB power. Nothing works.

The majority of the time it connects fine, but then after a couple of seconds there is an RF status: O=0, U=1, L=0 message and then nothing. It takes a reload of the B210 to get it to work again at all.

The B205mini and B210 are similar devices, but there are definitely some differences (the B205mini is actually closer to a B200 ...), including some buffer sizes. This could explain why the B210 sometimes has an over/under-run, but I'd still expect it to recover. I'm not sure whether the B205mini would recover, since it just doesn't happen with it.

Does it work for you with a B210? With what UHD version?

Here is the log of a typical failed run with the B210:

ue-issue.zip

andrepuschmann commented 1 year ago

Mhh, good question. Yeah, I guess the underflow could be the culprit. I also noticed this in the log:

2022-12-05T15:51:10.070588 [PHY_LIB] [E] [ 6106]  Error receiving samples
2022-12-05T15:51:10.070589 [PHY    ] [E] [ 6105] SYNC:  Receiving from radio.

That really looks like the B210 is gone all of a sudden.

I noticed that the CFO of that B210 is quite high (Found Cell: Mode=FDD, PCI=480, PRB=50, Ports=2, CP=Normal, CFO=6.9 KHz). The PHY will measure and correct up to half the subcarrier spacing, so 7.5 kHz. However, maybe try shifting it manually and see if that helps. Also compare with the B200 you have. Sometimes different batches are sooo far off from each other.

Regarding UHD, we also use 3.15 LTS in the CI. But even there it's not 100% stable.
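To make the CFO margin concrete, here is a small worked calculation. It is a sketch, not srsRAN code: it uses the standard 15 kHz LTE subcarrier spacing, takes the 7.5 kHz correction range quoted above at face value, and `cfo_margin_hz` is a hypothetical helper. The B205mini values are the ones printed in the first log of this thread.

```python
SUBCARRIER_SPACING_HZ = 15_000             # LTE subcarrier spacing
CFO_LIMIT_HZ = SUBCARRIER_SPACING_HZ / 2   # 7.5 kHz, the range quoted above


def cfo_margin_hz(cfo_hz):
    """Remaining headroom (in Hz) before the CFO leaves the correctable range."""
    return CFO_LIMIT_HZ - abs(cfo_hz)


# CFO values reported by the "Found Cell" lines in this thread, in Hz.
reported = [
    ("B210", 6900),
    ("B205mini, first sync", 3100),
    ("B205mini, re-sync", 100),
]
for label, cfo_hz in reported:
    margin = cfo_margin_hz(cfo_hz)
    used = 100 * abs(cfo_hz) / CFO_LIMIT_HZ
    print(f"{label}: CFO={cfo_hz} Hz, margin={margin:.0f} Hz, range used={used:.0f}%")
```

With these numbers the B210 sits within about 600 Hz of the edge of the correctable range, which is why a manual frequency shift is worth trying.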