EttusResearch / uhd

The USRP™ Hardware Driver Repository
http://uhd.ettus.com

USRP X440 with X4_1600: Overflow recording data using Replay Block #737

Open mmatthebi opened 8 months ago

mmatthebi commented 8 months ago

Issue Description

I have been experimenting with the X4_1600 image of the USRP X440 using UHD v4.6.0.0. I found a strange overflow behaviour when using the Replay Block of this image at a 2 GHz master clock rate.

The UHD example replay_capture.py works. However, with a more complex setup the overflow occurs. In particular, the overflow happens when recording data from the radio into the replay block after the replay block's input channel was previously connected to a TxStreamer. The overflow happens only at high master clock rates and only if at least a certain minimum amount of data is recorded.

Setup Details

Using USRP X440, UHD v4.6.0.0, no specific cable connections. Compiling and running the programs directly on the USRP X440 embedded system (same issue occurs when running the program from a different host).

See the test program code at the end of this issue. Compile the program directly on the USRP with

g++ replay_x440.cpp -o replay_x440 -luhd -lboost_system

The program has 4 parameters:

  1. the IP of the USRP (use localhost when running directly on the device)
  2. the master clock rate in Hz
  3. whether or not to connect and disconnect a TxStreamer before recording from the radio (use 1 to connect, anything else to not connect)
  4. the number of samples to record (divided by 1024)

Expected Behavior

I expect the program to run without errors regardless of the master clock rate and of whether I connect a TX streamer or not (given that the number of samples is reasonable).

Actual Behaviour

When running the program with a 2 GHz master clock rate, connecting and then disconnecting a TX streamer, and recording 20*1024 samples, an overflow is reported. It shows up as a single "O" on the console together with the message "ERROR_CODE_OVERFLOW". As a result, the replay block does not record the requested amount of data.

root@NE-LAB-X440-01:~# ./replay_x440 localhost 2e9 1 20
hi4.6.0.0-0-g50fa3baa
[INFO] [UHD] linux; GNU C++ version 9.2.0; Boost_107100; UHD_4.6.0.0-0-g50fa3baa
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=127.0.0.1,type=x4xx,product=x440,serial=32896F6,name=NE-LAB-X440-01,fpga=X4_1600,claimed=False,addr=localhost,master_clock_rate=2e9
[INFO] [MPM.PeriphManager] init() called with device args `fpga=X4_1600,master_clock_rate=(2000000000.0, 2000000000.0),mgmt_addr=127.0.0.1,name=NE-LAB-X440-01,product=x440,clock_source=internal,time_source=internal,initializing=True'.
Connect upload
disconnect
Connect for record
[WARNING] [0/Radio#0] Attempting to set tick rate to 0. Skipping.
recording 20480
OERROR: Error at recording: ERROR_CODE_OVERFLOW (Overflow)
root@NE-LAB-X440-01:~# 

The program runs fine for other combinations of parameters. It works when the master clock rate is lower, when fewer samples are recorded, or when no other RFNoC connection was made to the block beforehand.

./replay_x440 localhost 2e9 1 16  # runs fine --> fewer samples to record (65k bytes)
./replay_x440 localhost 2e9 0 20  # runs fine --> no Tx streamer connected
./replay_x440 localhost 2e9 0 200000  # runs fine --> no Tx streamer connected, tons of samples
./replay_x440 localhost 1e9 1 20  # runs fine  --> lower master clock rate

Steps to reproduce the problem

Compile and run the program as described above, for example with ./replay_x440 localhost 2e9 1 20.

Additional Information

Here's the test program code:

#include <chrono>
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>
#include <thread>

#include <uhd/utils/graph_utils.hpp>
#include <uhd/rfnoc/block_id.hpp>
#include <uhd/rfnoc/radio_control.hpp>
#include <uhd/rfnoc/replay_block_control.hpp>
#include <uhd/rfnoc_graph.hpp>
#include <uhd/rfnoc/mb_controller.hpp>

#include <uhd/version.hpp>

using uhd::rfnoc::block_id_t;
using uhd::rfnoc::replay_block_control;
using uhd::rfnoc::radio_control;
using uhd::rfnoc::rfnoc_graph;

using namespace std::chrono_literals;

void disconnectAll(rfnoc_graph::sptr graph) {
    graph->release();
    for (auto& edge : graph->enumerate_active_connections()) {
        if (edge.dst_blockid.find("RxStreamer") != std::string::npos) {
            graph->disconnect(edge.src_blockid, edge.src_port);
        }
        else if (edge.src_blockid.find("TxStreamer") != std::string::npos) {
            graph->disconnect(edge.dst_blockid, edge.dst_port);
        }
        else {
            graph->disconnect(edge.src_blockid, edge.src_port, edge.dst_blockid, edge.dst_port);
        }
    }

    if (true) {
        graph->disconnect("TxStreamer#0", 0);
        graph->disconnect("TxStreamer#0");
    }

    graph->commit();
}

void connectUploadAndDisconnect(rfnoc_graph::sptr graph, replay_block_control::sptr replayCtrl) {
    std::cout << "Connect upload" << std::endl;
    auto txStreamer = graph->create_tx_streamer(1, uhd::stream_args_t("fc32", "sc16"));
    graph->connect(txStreamer, 0, replayCtrl->get_block_id(), 0);
    graph->commit();
    std::this_thread::sleep_for(500ms);

    // disconnect
    std::cout << "disconnect" << std::endl;
    disconnectAll(graph);
    txStreamer.reset();
    std::this_thread::sleep_for(500ms);
    graph->commit();
}

int main(int argc, char *argv[]) {
    try {
        std::cout << "hi" << uhd::get_version_string() << std::endl;

        if (argc < 5) {
            std::cout << "Usage: <program> <ip> <mcr> <connectForUpload> <num_samples>" << std::endl;
            return 1;
        }

        std::string ip = argv[1];
        std::string mcr = argv[2];
        bool doConnectForUpload = (argv[3][0] == '1');
        uint64_t NUM_SAMPLES = 1024 * std::stoi(argv[4]);

        auto graph = rfnoc_graph::make("addr="+ip+",master_clock_rate="+mcr);

        auto replayCtrl = graph->get_block<replay_block_control>(block_id_t("0/Replay#0"));
        auto radio0 = graph->get_block<radio_control>(block_id_t("0/Radio#0"));

        std::this_thread::sleep_for(500ms);

        // Upload connection
        if (doConnectForUpload)
            connectUploadAndDisconnect(graph, replayCtrl);

        // streaming connection
        std::cout << "Connect for record" << std::endl;
        // connect forward edge
        graph->connect(replayCtrl->get_block_id(), 0, radio0->get_block_id(), 0);
        graph->connect(radio0->get_block_id(), 0, replayCtrl->get_block_id(), 0, true);
        graph->commit();

        const uint64_t MEM_SIZE = replayCtrl->get_mem_size();
        replayCtrl->record(MEM_SIZE / 2, NUM_SAMPLES * 4, 0);

        double fpgaTime = graph->get_mb_controller()->get_timekeeper(0)->get_time_now().get_real_secs();

        double txRxTime = fpgaTime + 0.5;
        std::cout << "recording " << NUM_SAMPLES << std::endl;

        // record
        uhd::stream_cmd_t rxStreamCmd(uhd::stream_cmd_t::STREAM_MODE_NUM_SAMPS_AND_DONE);
        rxStreamCmd.num_samps = NUM_SAMPLES;
        rxStreamCmd.stream_now = false;
        rxStreamCmd.time_spec = uhd::time_spec_t(txRxTime);
        radio0->issue_stream_cmd(rxStreamCmd, 0);

        uhd::rx_metadata_t asyncMd;
        double timeout = 1;
        while (replayCtrl->get_record_async_metadata(asyncMd, timeout)) {
            if (asyncMd.error_code != uhd::rx_metadata_t::ERROR_CODE_NONE)
                throw std::runtime_error("Error at recording: " + asyncMd.strerror());
            timeout = 0.02;
        }

        std::cout << "Recorded bytes: " << replayCtrl->get_record_fullness(0) << std::endl;

    } catch(std::exception& e) {
        std::cerr << "ERROR: " << e.what() << std::endl;
        return 1;
    }

    return 0;
}

mmatthe commented 5 months ago

I just checked with the new v4.7.0.0-rc1, installed with usrp_update_fs -t v4.7.0.0-rc1. The behaviour remains the same, though now slightly more data is needed before the overflow occurs. For example, it occurs with 21*1024 samples:

root@NE-LAB-X440-01:~# ./replay_x440 localhost 2e9 1 21
hi4.7.0.0-214-g327f294e
[INFO] [UHD] linux; GNU C++ version 11.4.0; Boost_107800; UHD_4.7.0.0-214-g327f294e
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=127.0.0.1,type=x4xx,product=x440,serial=32896F6,name=NE-LAB-X440-01,fpga=X4_1600,claimed=False,addr=localhost,master_clock_rate=2e9
[INFO] [MPM.PeriphManager] init() called with device args `fpga=X4_1600,master_clock_rate=(2000000000.0, 2000000000.0),mgmt_addr=127.0.0.1,name=NE-LAB-X440-01,product=x440,clock_source=internal,time_source=internal,initializing=True'.
Connect upload
disconnect
Connect for record
recording 21504
OERROR: Error at recording: ERROR_CODE_OVERFLOW (Overflow)
root@NE-LAB-X440-01:~#