Open andrepuschmann opened 3 years ago
Here is another segfault with a stack trace, from what I believe is the same issue:
--- Software Radio Systems LTE eNodeB ---
Reading configuration file enb.conf...
Built in Release mode using commit 0967cda04 on branch dev.
Opening 2 channels in RF device=uhd with args=type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0
Available RF device list: UHD soapy zmq
[INFO] [UHD] linux; GNU C++ version 9.3.0; Boost_107100; UHD_4.1.0.2-1-gceac1bdd
[INFO] [LOGGING] Fastpath logging disabled at runtime.
Opening USRP channels=2, args: type=n3xx,tx_subdev_spec=A:0 B:0,rx_subdev_spec=A:0 B:0,master_clock_rate=122.88e6
[INFO] [UHD RF] RF UHD Generic instance constructed
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=192.168.20.2,type=n3xx,product=n310,serial=317F537,fpga=HG,claimed=False,addr=192.168.20.2,master_clock_rate=122.88e6
[WARNING] [MPM.RPCServer] A timeout event occured!
[INFO] [MPM.PeriphManager] init() called with device args `fpga=HG,master_clock_rate=122.88e6,mgmt_addr=192.168.20.2,product=n310,clock_source=internal,time_source=internal'.
[WARNING] [RFNOC::GRAPH] One or more blocks timed out during flush!
[INFO] [UHD RF] Setting tx_subdev_spec to 'A:0 B:0'
[INFO] [UHD RF] Setting rx_subdev_spec to 'A:0 B:0'
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
Stack trace (most recent call last):
#19 Object "", at 0xffffffffffffffff, in
#18 Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079edc00bd, in _start
#17 Source "../csu/libc-start.c", line 308, in __libc_start_main [0x7f9c743730b2]
#16 Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079edbd888, in main
#15 Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079eddc917, in srsenb::enb::init(srsenb::all_args_t const&)
#14 Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079f208fd7, in srsran::radio::init(srsran::rf_args_t const&, srsran::phy_interface_radio*)
#13 Object "/home/anpu/src/srsLTE/build_release/srsenb/src/srsenb", at 0x56079f200391, in srsran::radio::open_dev(unsigned int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#12 Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b1b1fb, in rf_uhd_open_multi
#11 Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b19cc9, in uhd_init(rf_uhd_handler_t*, char*, unsigned int)
#10 Object "/home/anpu/src/srsLTE/build_release/lib/src/phy/rf/libsrsran_rf.so.21.10.0", at 0x7f9c74b27952, in rf_uhd_generic::get_rx_stream(unsigned long&)
#9 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c739fdc57, in multi_usrp_rfnoc::get_rx_stream(uhd::stream_args_t const&)
#8 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73907475, in rfnoc_graph_impl::connect(uhd::rfnoc::block_id_t const&, unsigned long, std::shared_ptr<uhd::rx_streamer>, unsigned long, unsigned long)
#7 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738d0234, in graph_stream_manager_impl::create_device_to_host_data_stream(std::pair<unsigned short, unsigned short>, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, unsigned long, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#6 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738ce2f5, in link_stream_manager_impl::create_device_to_host_data_stream(std::pair<unsigned short, unsigned short>, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#5 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73d33b3e, in uhd::mpmd::mpmd_mboard_impl::mpmd_mb_iface::make_rx_data_transport(uhd::rfnoc::mgmt::mgmt_portal&, std::pair<std::pair<unsigned short, unsigned short>, std::pair<unsigned short, unsigned short> > const&, std::pair<unsigned short, unsigned short> const&, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::device_addr_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
#4 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738b8019, in uhd::rfnoc::chdr_rx_data_xport::configure_sep(std::shared_ptr<uhd::transport::io_service>, std::shared_ptr<uhd::transport::recv_link_if>, std::shared_ptr<uhd::transport::send_link_if>, uhd::rfnoc::chdr::chdr_packet_factory const&, uhd::rfnoc::mgmt::mgmt_portal&, std::pair<unsigned short, unsigned short> const&, uhd::rfnoc::sw_buff_t, uhd::rfnoc::sw_buff_t, uhd::rfnoc::stream_buff_params_t const&, uhd::rfnoc::stream_buff_params_t const&, uhd::rfnoc::stream_buff_params_t const&, bool, std::function<void ()>)
#3 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c7391c0a4, in uhd::rfnoc::mgmt::mgmt_portal_impl::config_local_rx_stream_commit(uhd::rfnoc::chdr_ctrl_xport&, unsigned short const&, double, bool)
#2 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c739164c6, in uhd::rfnoc::mgmt::mgmt_portal_impl::_get_ostrm_status(uhd::rfnoc::chdr_ctrl_xport&, std::vector<std::pair<uhd::rfnoc::mgmt::node_id_t, int>, std::allocator<std::pair<uhd::rfnoc::mgmt::node_id_t, int> > > const&)
#1 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c73910f54, in uhd::rfnoc::mgmt::mgmt_portal_impl::_send_recv_mgmt_transaction(uhd::rfnoc::chdr_ctrl_xport&, uhd::rfnoc::chdr::mgmt_payload const&, double) [clone .constprop.0]
#0 Object "/opt/uhd-4.1-release/lib/libuhd.so.4.1.0", at 0x7f9c738ae695, in uhd::rfnoc::chdr::mgmt_payload::deserialize(unsigned long const*, unsigned long, std::function<unsigned long (unsigned long)> const&)
Segmentation fault (Address not mapped to object [0x5607ba1e9000])
Segmentation fault
I am seeing a similar crash while testing 4.2. It occurs in deserialize, and the message length looks very large.
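To illustrate why an implausibly large length field can end in "Address not mapped to object", here is a minimal sketch of a bounds check on a length-prefixed message. It is written in Python for brevity and is not UHD code: the packet layout (one 64-bit word count followed by that many 64-bit words), the function name deserialize_words and the MalformedPacket exception are all assumptions made for the example, not the real CHDR management payload format.

```python
import struct
from typing import List

WORD_BYTES = 8  # hypothetical layout: everything is 64-bit little-endian words


class MalformedPacket(ValueError):
    """Raised when the declared length does not fit the received buffer."""


def deserialize_words(buf: bytes) -> List[int]:
    """Parse a hypothetical packet: <word count><word 0>...<word N-1>.

    The declared count is validated against the bytes actually received
    before any payload word is read.
    """
    if len(buf) < WORD_BYTES:
        raise MalformedPacket("buffer too short to hold the header word")

    (count,) = struct.unpack_from("<Q", buf, 0)
    available = (len(buf) - WORD_BYTES) // WORD_BYTES

    # Without this check, a parser working on raw memory would walk
    # 'count' words past the header and eventually hit an unmapped page.
    if count > available:
        raise MalformedPacket(
            f"header declares {count} words but only {available} are present"
        )

    return list(struct.unpack_from(f"<{count}Q", buf, WORD_BYTES))
```

A parser that trusts the count and reads from raw memory has no such guard, so a corrupted or stale header can make it read far beyond the receive buffer. Whether that is what happens in the trace above is not confirmed by this thread.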
Issue Description
The issue only appears when using the N310 in a two-channel configuration. It happens only occasionally but is annoying nonetheless, since it's causing many tests to fail because the eNB/gNB doesn't start up in the first place.
We are using the N310 to test an NSA configuration that uses 2x channels at 15.35Msps. I've compiled UHD 4.1 in debug mode and got the following backtrace. Unfortunately, not all symbols are there and line numbers aren't shown.
Setup Details
Expected Behavior
No UHD crash when starting eNB.
Actual Behavior
UHD segfaults occasionally.
Steps to reproduce the problem
I've not been able to reproduce the issue with the UHD examples, but the srsRAN appnote for running COTS UEs here contains all config steps. The UHD device args for the N310 are shown at the end of the document.
Note that you don't need a COTS UE or even a core network. Just starting the eNB with this config crashes UHD every so often.
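Because the crash is intermittent, a small launcher loop can help put a number on how often startup fails. The following is a minimal sketch, assuming a POSIX system and assuming the process ultimately terminates via SIGSEGV (as the final "Segmentation fault" line in the log suggests). The function name count_segfaults, its parameters and the timeout handling are illustrative choices, not part of srsRAN or UHD.

```python
import signal
import subprocess
import sys
from typing import List


def count_segfaults(cmd: List[str], runs: int, startup_timeout_s: float) -> int:
    """Launch cmd 'runs' times and count launches that die from SIGSEGV.

    A process still alive after startup_timeout_s is treated as a
    successful start and is shut down before the next attempt.
    """
    crashes = 0
    for i in range(runs):
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        try:
            proc.wait(timeout=startup_timeout_s)
        except subprocess.TimeoutExpired:
            # Still running: startup survived, so stop it and move on.
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
            print(f"run {i + 1}/{runs}: started OK", file=sys.stderr)
            continue

        # On POSIX, a negative return code is the number of the signal
        # that killed the child process.
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
        print(f"run {i + 1}/{runs}: exit code {proc.returncode}", file=sys.stderr)
    return crashes
```

It could be called as, for example, count_segfaults(["srsenb", "enb.conf"], runs=50, startup_timeout_s=30.0), where the command line and the 30 second startup window are placeholders to adapt to the actual setup.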
Additional Information
Let me know if you need further details or want me to compile with different flags to maybe get more debug info.