EttusResearch / uhd

The USRP™ Hardware Driver Repository
http://uhd.ettus.com

USRP X440 with X4_1600: Correlated Real/Imag values at Fc=2e9, Fs=1e9 #743

Closed mmatthebi closed 1 month ago

mmatthebi commented 2 months ago

Issue Description

I have observed strange behaviour with one particular sample rate setting of the USRP X440. Running on the embedded system with the X4_1600 image loaded, the master clock rate set to 1 GHz and the center frequency set to 2 GHz, I see a strong correlation between the real and imaginary parts of the received signal. Connecting TX0 to RX0 over a cable with an attenuator resulted in constellation and eye diagrams like these:

image

image

What I transmitted was an RC-filtered waveform, and I sampled at the correct RX times. The slope of the line on the left differs whenever the device is rebooted or the master clock rate has been changed between measurements (e.g. 1 GHz -> 1.244 GHz -> 1 GHz).

When using fc = 1GHz, the constellation looks fine (ignore the unequalized phase, there is no RX equalization done): image

I have investigated this further and could reproduce similar behaviour using the rfnoc_rx_to_file.cpp example. The example does not rely on a transmitted signal; the received noise alone shows the questionable behaviour.

Setup Details

Use a USRP X440 with the Mender image for UHD v4.6.0.0 and flash the X4_1600 image onto the FPGA. Run all code on the embedded system. No specific RF connections are needed.

# wget https://raw.githubusercontent.com/EttusResearch/uhd/master/host/examples/rfnoc_rx_to_file.cpp
# g++ rfnoc_rx_to_file.cpp -o rfnoc_rx_to_file -luhd -lboost_program_options
# ./rfnoc_rx_to_file --args=addr=localhost,master_clock_rate=1e9 --nsamps 4096 --rate 1e9 --freq 2e9 --gain 0 --file fs1e9_fc2e9.dat

Creating the RFNoC graph with args: addr=localhost,master_clock_rate=1e9
[INFO] [UHD] linux; GNU C++ version 9.2.0; Boost_107100; UHD_4.6.0.0-0-g50fa3baa
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=127.0.0.1,type=x4xx,product=x440,serial=32896F6,name=NE-LAB-X440-01,fpga=X4_1600,claimed=False,addr=localhost,master_clock_rate=1e9
[WARNING] [MPM.RPCServer] A timeout event occured!
[INFO] [MPM.PeriphManager] init() called with device args `fpga=X4_1600,master_clock_rate=(1000000000.0, 1000000000.0),mgmt_addr=127.0.0.1,name=NE-LAB-X440-01,product=x440,clock_source=internal,time_source=internal,initializing=True'.
Using radio 0, channel 0
Requesting RX Freq: 2000 MHz...
Actual RX Freq: 2000 MHz...

Requesting RX Gain: 0 dB...
Actual RX Gain: 0 dB...

Waiting for "lo_locked": ++++++++++ locked.

Using streamer args:
Active connections:
* 0/Radio#0:0-->RxStreamer#0:0
Requesting RX Rate: 1000 Msps...
Setting rate on radio block!
Actual RX Rate: 1000 Msps...

Issuing stream cmd
Issuing stop stream cmd

Done!

# ./rfnoc_rx_to_file --args=addr=localhost,master_clock_rate=1e9 --nsamps 4096 --rate 1e9 --freq 1e9 --gain 0 --file fs1e9_fc1e9.dat

Creating the RFNoC graph with args: addr=localhost,master_clock_rate=1e9
[INFO] [UHD] linux; GNU C++ version 9.2.0; Boost_107100; UHD_4.6.0.0-0-g50fa3baa
[INFO] [MPMD] Initializing 1 device(s) in parallel with args: mgmt_addr=127.0.0.1,type=x4xx,product=x440,serial=32896F6,name=NE-LAB-X440-01,fpga=X4_1600,claimed=False,addr=localhost,master_clock_rate=1e9
[INFO] [MPM.PeriphManager] init() called with device args `fpga=X4_1600,master_clock_rate=(1000000000.0, 1000000000.0),mgmt_addr=127.0.0.1,name=NE-LAB-X440-01,product=x440,clock_source=internal,time_source=internal,initializing=True'.
[...]
Done!

This way, two files are created: fs1e9_fc1e9.dat and fs1e9_fc2e9.dat. Use the analysis script below for a very basic analysis of the files. It does the following:

  1. Load the samples as interleaved int16 values and split them into real and imaginary parts
  2. Divide real by imaginary, element-wise
  3. Print the number of positive and negative values in the quotient array
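import sys

import numpy as np

# Samples are stored as interleaved int16: real, imag, real, imag, ...
X = np.fromfile(sys.argv[1], dtype=np.int16)
R = X[::2]   # real parts
I = X[1::2]  # imaginary parts

# Element-wise ratio real / imag (true division yields float64).
# Zero imaginary parts give inf or nan; silence the NumPy warnings for those.
with np.errstate(divide="ignore", invalid="ignore"):
    D = R / I
D = D.reshape(-1, 4)  # four ratios per printed row

np.set_printoptions(precision=2, threshold=99999999, linewidth=100)
print(D)

# inf counts as positive, -inf as negative, nan as neither.
print("Positive ratios: ", np.sum(D > 0))
print("Negative Ratios: ", np.sum(D < 0))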

When running the script for the two generated files, I get

$ python printDat.py fs1e9_fc1e9.dat
[....]
 [-15.     2.    -0.33  -2.75]
 [ -0.77  -0.    -0.2   -1.5 ]
 [ -0.44    inf  -0.4    1.  ]
 [  2.67   0.14  -0.33   0.2 ]]
Positive ratios:  1929
Negative Ratios:  1903
$ python printDat.py fs1e9_fc2e9.dat
 [ 1.    0.43  0.33  0.25]
 [ 0.33  0.43  0.4   0.  ]
 [ 0.4   0.5   0.47  0.5 ]
 [ 0.5   0.    0.43  0.43]
 [ 0.6   0.5   0.44  0.5 ]
 [ 0.4   0.    1.    0.44]
 [ 0.43  1.    0.46  0.4 ]
 [ 0.43  0.    0.43  0.44]
 [ 0.5   0.46  0.41  0.67]
 [ 0.42  0.44  0.45  1.  ]
 [ 0.42  0.45  0.5  -0.  ]
 [  nan  0.5   0.29  0.5 ]]
Positive ratios:  3623
Negative Ratios:  7

As seen, when fc=1e9 (or any other frequency far away from 2e9), the ratios are positive and negative about equally often, indicating that the signal is just white noise. In contrast, with carrier frequency 2e9, almost all ratios in the above case are positive, which indicates a strong correlation between the real and imaginary values. Plotting the values also shows this correlation:
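The sign count above can be complemented by a single number. The following is a minimal sketch, assuming the same interleaved int16 file layout as the analysis script; the function names `iq_correlation` and `load_and_correlate` are made up for illustration. It computes the Pearson correlation coefficient between the real and imaginary parts, which should be near 0 for uncorrelated noise and near +1 or -1 when the samples fall on a line.

```python
import numpy as np


def iq_correlation(interleaved):
    """Pearson correlation between real and imaginary parts.

    `interleaved` holds samples as real, imag, real, imag, ...
    """
    samples = np.asarray(interleaved, dtype=np.float64)
    samples = samples[: samples.size // 2 * 2]  # drop a trailing odd value
    real = samples[0::2]
    imag = samples[1::2]
    # corrcoef returns the 2x2 correlation matrix; take the off-diagonal entry
    return float(np.corrcoef(real, imag)[0, 1])


def load_and_correlate(path):
    """Hypothetical helper: read an int16 .dat file and correlate I against Q."""
    return iq_correlation(np.fromfile(path, dtype=np.int16))
```

For example, `load_and_correlate("fs1e9_fc2e9.dat")` could be compared against `load_and_correlate("fs1e9_fc1e9.dat")`. Unlike the ratio, this measure does not produce inf or nan for individual zero-valued samples.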

fc=2e9: image

fc=1e9: image

Expected Behavior

I expect the behaviour to be consistent regardless of the carrier frequency.

Actual Behaviour

see above

Steps to reproduce the problem

see above

Additional Information

I attach the two generated test files to this issue. data.zip

manderseck commented 1 month ago

When running the X440 at an MCR of 1 GHz, it uses a converter rate of 4 GHz. This means that there is a Nyquist zone boundary at 2 GHz, which is exactly where you're measuring. Measurements taken exactly at the boundary cannot be trusted. I would expect similar behavior every time you hit a Nyquist zone boundary, no matter which MCR, and therefore which converter rate, you choose. If you choose a different MCR that translates to a different converter rate and measure at 2 GHz again, I'd expect the results to be fine. Can you confirm this?
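To make the arithmetic concrete: Nyquist zone boundaries sit at integer multiples of half the converter rate. The sketch below is plain arithmetic, not part of UHD; the function names are illustrative, and the only rate taken from this thread is the 4 GHz converter rate for an MCR of 1 GHz.

```python
def nyquist_zone(freq_hz, converter_rate_hz):
    """Return the 1-based Nyquist zone that contains freq_hz.

    Zone k spans [(k - 1) * fs / 2, k * fs / 2). A frequency exactly on a
    boundary is assigned to the upper zone here, which is an arbitrary choice.
    """
    half_rate = converter_rate_hz / 2.0
    return int(freq_hz // half_rate) + 1


def zone_boundaries(converter_rate_hz, max_freq_hz):
    """List the zone boundaries k * fs / 2 (k >= 1) up to max_freq_hz."""
    half_rate = converter_rate_hz / 2.0
    count = int(max_freq_hz // half_rate)
    return [k * half_rate for k in range(1, count + 1)]
```

With a 4 GHz converter rate, `zone_boundaries(4e9, 4e9)` lists boundaries at 2 GHz and 4 GHz, so a center frequency of 1 GHz lies in the middle of the first zone while 2 GHz lands exactly on the first boundary.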

mmatthebi commented 1 month ago

Thanks for your comment! Indeed, hitting the zone boundary produces the described results. I have actually simulated the behaviour using the filter coefficients provided by Xilinx and I could reproduce exactly the results I describe above. Therefore, this is not a real bug.
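For readers who want to see the effect without the actual filter chain, here is a toy model. It does not use the Xilinx coefficients mentioned above; it assumes a real-valued ADC, an ideal NCO, and a boxcar filter as a stand-in low-pass, and all names are illustrative. When the NCO runs at exactly half the converter rate, the mixing sequence reduces to (-1)^n times a constant phasor, so after any real-valued filter every output sample lies on one line through the origin with slope -tan(phase).

```python
import numpy as np


def simulate_ddc(nco_cycles_per_sample, nco_phase=0.4, num_out=4096,
                 decim=4, seed=0):
    """Toy digital downconverter fed with real white noise.

    nco_cycles_per_sample is fc / fs_converter, e.g. 0.5 for 2 GHz at 4 GHz
    and 0.25 for 1 GHz at 4 GHz. nco_phase is an arbitrary start phase.
    """
    rng = np.random.default_rng(seed)
    adc = rng.standard_normal(num_out * decim)  # real-valued ADC samples
    n = np.arange(adc.size)
    nco = np.exp(-1j * (2.0 * np.pi * nco_cycles_per_sample * n + nco_phase))
    mixed = adc * nco
    taps = np.ones(decim) / decim               # boxcar low-pass (assumption)
    filtered = np.convolve(mixed, taps, mode="valid")
    return filtered[::decim]                    # decimate to the output rate


def iq_corr(samples):
    """Pearson correlation between the real and imaginary parts."""
    return float(np.corrcoef(samples.real, samples.imag)[0, 1])
```

In this model, `iq_corr(simulate_ddc(0.5))` has a magnitude of essentially 1, while `iq_corr(simulate_ddc(0.25))` stays close to 0. Changing `nco_phase` changes the slope of the line, which might be related to the slope changing after a reboot or a clock rate change, although that link is an assumption of this sketch.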

However, it might make sense to print a warning on the console when one runs exactly at the boundary, stating that measurements are not supported there. Maybe this could even be a hard error?
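Until something like this exists in the driver, a check can be done on the user side before tuning. The following is only a sketch of such a check; it is not UHD code, `check_center_freq` and its parameters are hypothetical, and the converter rate has to be supplied by the caller.

```python
import warnings


def check_center_freq(freq_hz, converter_rate_hz, tolerance_hz=1.0,
                      strict=False):
    """Return True if freq_hz lies on a Nyquist zone boundary.

    Boundaries are multiples of converter_rate_hz / 2; note that 0 Hz also
    counts as one. With strict=True a ValueError is raised instead of a
    warning being issued.
    """
    half_rate = converter_rate_hz / 2.0
    offset = freq_hz % half_rate
    distance = min(offset, half_rate - offset)  # distance to nearest boundary
    on_boundary = distance <= tolerance_hz
    if on_boundary:
        message = (
            f"Center frequency {freq_hz:.0f} Hz is on a Nyquist zone boundary "
            f"for a converter rate of {converter_rate_hz:.0f} Hz"
        )
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return on_boundary
```

Calling `check_center_freq(2e9, 4e9)` issues a warning and returns True, whereas `check_center_freq(1e9, 4e9)` returns False silently. The `strict` flag corresponds to the hard-error variant.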

manderseck commented 1 month ago

@mmatthebi Thanks for confirming and thanks for your proposal to warn in such cases. Throwing a warning would be possible, I think. I'll check if that's a low-effort thing to do.