pavel-demin / red-pitaya-notes

Notes on the Red Pitaya Open Source Instrument
http://pavel-demin.github.io/red-pitaya-notes/
MIT License

Red Pitaya + GNU Radio | Loopback baseband/passband | corruption after few secs #832

Closed fmagno closed 5 years ago

fmagno commented 5 years ago

TL;DR: Install the official Realtek drivers: https://www.unixblogger.com/how-to-get-your-realtek-rtl8111rtl8168-working-updated-guide/

Description of the setup:

Description of the problem:

Hi Pavel,

Thanks for taking the time to look into this. My question follows:

Running the following GNU Radio flowgraph (gr_baseband_async_oi.grc):

image

With OUT1 connected to IN1 and IN2 as depicted below:

image

And by transmitting the following input signal (sig_input.dat):

image

This produced the following received signals over 10 runs (each run in a different colour):

PORT IN1:

image

PORT IN2:

image

Apparently the signals arrive correctly (ignoring the amplitude attenuation) for ~2 secs but then they start showing artefacts/corruption. A few more pictures in detail:

image image

Among other tests I have also tried a similar flowgraph with a different center_freq but similar corruption arises:

image image

I wonder if you can reproduce this behaviour and whether this is expected? Let me know if more info is needed.

Thanks in advance, fmagno

pavel-demin commented 5 years ago

Thanks for providing a very detailed description of your setup and for sharing all the scripts.

I've just tried to run your grc configuration 10 times and here are the output plots showing 10 recorded signals:

I've set the sample rate to 50k and the duration to 500k. My setup has 50 Ohm terminators on both inputs.

fmagno commented 5 years ago

Thanks a lot for your quick reply! Although the 50 Ohm terminators did not make any difference on my setup, what really made a huge difference was adding the other Red Pitaya Sink and driving it with a Constant Source of 0's - image below:

image

I even ran the experiment for a while longer and it seems to be running just fine:

image image

I am now left wondering why you weren't able to reproduce the same issue on your setup?

fmagno commented 5 years ago

Sadly, that was a rushed conclusion: yesterday I could not reproduce the issue with the new changes after a few trials, but today it still shows up. I am also running the experiments for a longer period (20 secs). Both inputs behave in the same way, so for the sake of simplicity I'm only going to show plots of one of them. Each plot represents 10 runs of the same experiment. This is what I have so far:

Exp 1:

image image image image

Exp 2:

Exp 3:

image image

Exp 4:

image image image image image

Exp 5:

image image image image

Remarks:

There seems to be a random effect: the results improve somewhat if the second output is driven with data (e.g. 0's) or if the PTT parameter is set to False. However, the corruption still shows up way too frequently.

Are you not able to reproduce this issue in any of these experiments?

Cheers!

pavel-demin commented 5 years ago

The second plot shows that one of the recorded signals has a lower frequency. So, the TX or RX samples were somehow delayed. I'd suspect one of the following:

fmagno commented 5 years ago

I ran the above experiments with two modern machines, both with a direct wired connection to the Red Pitaya:

Both setups show similar corruption but, with the Ubuntu setup, sometimes an error occurs with the following message:

I believe this error has nothing to do with the corruption. Looks like a different thing... The coaxial cables are in good condition.

Do you have a suggestion on how this can be debugged?

I wonder if you have come across the need of implementing a loopback test application that would run inside the Red Pitaya, like the sdr_transceiver, and would not require communication with an SDR running on a computer. It would just transmit a sine wave already stored in the SD card and would then store the received signal (implying that the IN and OUT ports would be physically connected for the test to make sense). We could then compare the signals offline.
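
The offline comparison described above could be sketched roughly as follows. This is a minimal illustration, not part of the original discussion: it assumes the received signal is available as a list of real-valued samples (for example, the real component of a recording), and the function names, window length and tolerance are hypothetical. It estimates the tone frequency in each window from rising zero crossings and flags windows that deviate from the expected frequency, which is one way to spot the "lower frequency" runs mentioned earlier in the thread.

```python
import math


def zero_crossing_freq(samples, rate):
    """Estimate the tone frequency (Hz) from rising zero crossings."""
    rising = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return rising / (len(samples) / rate)


def corrupted_windows(samples, rate, expected_hz, window_s=1.0, tol=0.1):
    """Return indices of windows whose estimated frequency is off by more than tol."""
    n = int(window_s * rate)
    bad = []
    for start in range(0, len(samples) - n + 1, n):
        est = zero_crossing_freq(samples[start:start + n], rate)
        if abs(est - expected_hz) > tol * expected_hz:
            bad.append(start // n)
    return bad


# Synthetic example: a 50 Hz tone whose middle second plays at half speed.
rate = 1000
good = [math.sin(2 * math.pi * 50 * i / rate) for i in range(rate)]
slow = [math.sin(2 * math.pi * 25 * i / rate) for i in range(rate)]
print(corrupted_windows(good + slow + good, rate, 50.0))
```

A zero-crossing count is crude (it can be off by one crossing per window and is sensitive to noise), so the tolerance should be loose; it is only meant to locate the seconds in which the waveform changes.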

pavel-demin commented 5 years ago

My GNU Radio test machine is much less powerful than your systems. My test machine is a small netbook with AMD E2-1800 CPU, Ubuntu 16.04 and GNU Radio 3.7.10.

Do you have a suggestion on how this can be debugged?

I think that the first step should be to understand what part of the signal processing chain doesn't work. I'd check the TX signal using an external oscilloscope and I'd check the RX part using an external signal generator.

fmagno commented 5 years ago

Hi Pavel,

Thanks for the suggestion. Below is a set of experiments that show the issue for different combinations of parameters.

Computer specs:

Ubuntu 19.04
Intel Core i7-8750H CPU @ 2.20GHz × 12
kernel: 5.0.0-13-generic x86_64 x86_64 x86_64 GNU/Linux
NIC: Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
GNU Radio Companion 3.7.13.4

Exp 1

Here's a simpler flowgraph that shows the issue for different sample rates:

image

20K

image

50K

image

100K

image

250K

image

500K

image

1.25M

image

Surprisingly, the best result is the one with the highest sample rate (1.25M).

Exp2

Now with both OUT ports:

image

20K

image

50K

image

100K

image

250K

image

500K

image

1.25M

image

It still shows better results for higher sample rates.

Exp3

Now, with only IN1 enabled:

image

20K

image

50K

image

250K

image

1.25M

image

Exp4

With both inputs enabled:

image

1.25M

image

Blue and green are the real components of the received signals. Red and orange are the imaginary components and are always 0 as expected.

Exp5

With IN1 and OUT1. Still, IN1 is being driven by a signal generator and OUT1 is being measured with the oscilloscope. Flowgraph:

image

Ran this experiment with 500K for 20 secs and with 1.25M for 80 secs:

500K

image image

The output signal looks good up until the 10th second and then it gets corrupted. It never recovers the correct wave shape.

1.25M [0-40] sec

image

[40-80] sec

image

In this case, the signal gets corrupted after a few seconds but eventually, after second 30, it goes back to the correct shape.

Let me know your thoughts. Thanks again!

pavel-demin commented 5 years ago

Thanks for the new tests. So, looks like the problem is somewhere in the TX part.

While porting the SDR applications to the new 122.88-16 board, I've found that the new board is using a different pin compatible DAC chip (AD9767). The AD9767 data sheet contains some additional timing constraints. Maybe the recently produced 125-14 boards also come with the AD9767 DAC chip.

I've just backported the updated DAC interface from the code for the 122.88-16 board to the code for the 125-14 board. Here is a link to a SD card image with the updated code:

https://www.dropbox.com/sh/5fy49wae6xwxa8a/AAAn9CYl4hCq9MShSKpn2AV-a/red-pitaya-alpine-3.9-armv7-20190422.zip?dl=1

Could you please redo the first test (Exp 1) from your last comment using this new SD card image?

fmagno commented 5 years ago

I will double check with the oscilloscope tomorrow, but I can already say that the issue is still there by testing in closed loop:

image

50K

image

100K

image

fmagno commented 5 years ago

Hi Pavel, the corruption is still there. Confirmed with the oscilloscope, especially for sample rates below 500K.

pavel-demin commented 5 years ago

Thanks for testing the new version. So, the problem isn't in the DAC interface.

Maybe, as a next step, we should check if the problem isn't somewhere in GNU Radio. I think that the following script is equivalent to your grc script from Exp 1:

import struct
import socket
import numpy as np

addr = '192.168.5.100'
port = 1001

# open control socket
ctrl_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ctrl_sock.connect((addr, port))
ctrl_sock.send(struct.pack('<I', 0))

# open data socket
data_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
data_sock.connect((addr, port))
data_sock.send(struct.pack('<I', 1))

# set frequency
corr = 0
freq = 0
ctrl_sock.send(struct.pack('<I', 0<<28 | int((1.0 + 1e-6 * corr) * freq)))

# set sample rate
rate_codes = {20000:0, 50000:1, 100000:2, 250000:3, 500000:4, 1250000:5}
rate = 50000
code = rate_codes[rate]
ctrl_sock.send(struct.pack('<I', 1<<28 | code))

# send data
time = np.arange(0, 10, 1 / rate) * np.pi * 2
signal = np.sin(time, dtype = np.complex64)
data_sock.sendall(signal.tobytes())

pavel-demin commented 5 years ago

I've fixed a couple of errors in the script. I think that it should be OK now.

fmagno commented 5 years ago

I made a slight fix to the time and signal variables (below). The OUT1 port is not producing the signal, though... If I change the control and data messages back to the original values, 2 and 3 respectively, then I can measure the corrupted signal at OUT1.


import struct
import socket
import numpy as np

addr = '192.168.5.100'
port = 1001

# open control socket
ctrl_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ctrl_sock.connect((addr, port))
ctrl_sock.send(struct.pack('<I', 0))                                       # <--- 2

# open data socket
data_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
data_sock.connect((addr, port))
data_sock.send(struct.pack('<I', 1))                                       # <--- 3

# set frequency
corr = 0
freq = 0
ctrl_sock.send(struct.pack('<I', 0<<28 | int((1.0 + 1e-6 * corr) * freq)))

# set sample rate
rate_codes = {20000:0, 50000:1, 100000:2, 250000:3, 500000:4, 1250000:5}
rate = 50000
code = rate_codes[rate]
ctrl_sock.send(struct.pack('<I', 1<<28 | code))

# send data
# time = np.arange(0, 10, 1 / rate) * np.pi * 2
# signal = np.sin(time, dtype = np.complex64)

time = np.arange(0, 10, 1. / rate)
signal = np.sin(2 * np.pi * time, dtype = np.complex64)

data_sock.sendall(signal.tobytes())

pavel-demin commented 5 years ago

Yes, you're right. I've just tested the script and found more errors. Here is a new version that works for me:

import struct
import socket
import numpy as np

addr = '192.168.5.100'
port = 1001

# open control socket
ctrl_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ctrl_sock.connect((addr, port))
ctrl_sock.send(struct.pack('<I', 2))

# open data socket
data_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
data_sock.connect((addr, port))
data_sock.send(struct.pack('<I', 3))

# set frequency
corr = 0
freq = 0
ctrl_sock.send(struct.pack('<I', 0<<28 | int((1.0 + 1e-6 * corr) * freq)))

# set sample rate
rate_codes = {20000:0, 50000:1, 100000:2, 250000:3, 500000:4, 1250000:5}
rate = 50000
code = rate_codes[rate]
ctrl_sock.send(struct.pack('<I', 1<<28 | code))

# enable ptt
ctrl_sock.send(struct.pack('<I', 2<<28))

# send data
time = np.arange(0, 100, 1 / rate) * np.pi * 2
signal = np.sin(time, dtype = np.complex64)
data_sock.sendall(signal.tobytes())
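
Reading the script above, each control message appears to be a little-endian 32-bit word with a command code in the top 4 bits and a value in the lower 28 bits. The helper below is only a sketch of that reading: the constant names and the `pack_command` function are hypothetical, and the layout is inferred from the script rather than taken from any protocol documentation.

```python
import struct

# Command codes as used in the script above; the names are made up here.
CMD_SET_FREQ = 0
CMD_SET_RATE = 1
CMD_SET_PTT = 2


def pack_command(command, value=0):
    """Pack a 4-bit command and a 28-bit value into a little-endian uint32."""
    if not 0 <= command < 16:
        raise ValueError("command must fit in 4 bits")
    if not 0 <= value < (1 << 28):
        raise ValueError("value must fit in 28 bits")
    return struct.pack('<I', command << 28 | value)


# Same bytes as ctrl_sock.send(struct.pack('<I', 1<<28 | code)) with code = 1
print(pack_command(CMD_SET_RATE, 1).hex())
```

With such a helper, the three control messages in the script would read `pack_command(CMD_SET_FREQ, freq)`, `pack_command(CMD_SET_RATE, code)` and `pack_command(CMD_SET_PTT)`.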

pavel-demin commented 5 years ago

I've checked 100 periods with an oscilloscope and they all look OK. The script was running from an Ubuntu 16.04 virtual machine running under Windows 7 connected to Red Pitaya via Wi-Fi.

fmagno commented 5 years ago

No luck, mate... :( I do see the signal coming out, as I mentioned before, by setting the control and data values to 2 and 3 respectively (even if the ptt setting wasn't there before). I also fixed the math and the division, but that is definitely not where the issue is:

time = np.arange(0, 10, 1. / rate)
signal = np.sin(2 * np.pi * time, dtype = np.complex64)

I find it strange that it worked for you, though, because 1 / rate should always evaluate to 0 (zero).

fmagno commented 5 years ago

I have also tried the STEMlab (Signal Generator) just to be sure that there is nothing wrong with hardware and it worked just fine:

image

pavel-demin commented 5 years ago

even if the ptt setting wasn't there before

Yes, you're right. I've just checked that the script works without enabling ptt.

I find it strange that it worked for you, though, because 1 / rate should always evaluate to 0 (zero).

I'm using Python 3:

# python3
Python 3.5.2
>>> 1 / 500
0.002
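
For readers following along, the difference comes down to how `/` treats two integers: Python 2 floors the result, Python 3 returns a float. The snippet below is a small illustration written for Python 3; the `time_step` helper and its flag are hypothetical and only mimic the Python 2 behaviour with `//`.

```python
def time_step(rate, python2_int_division=False):
    """Return the sample period 1/rate under either division rule."""
    if python2_int_division:
        # Python 2 floors int / int; `//` reproduces that in Python 3.
        return 1 // rate
    return 1 / rate


print(time_step(500))        # 0.002
print(time_step(500, True))  # 0
```

Writing `1. / rate`, as in the corrected script, or adding `from __future__ import division` gives the float result under both interpreters.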

Anyway, if GNU Radio and DAC interface are OK, then the remaining parts are

I think that we could try to exclude the network connection by running the same script directly on the Red Pitaya board. I've just tried it and the following steps worked for me:

fmagno commented 5 years ago

I'm using Python 3

Ah OK, my bad - that makes sense. I assumed you were using Python 2 because Python 3 is not supported by this version of GNU Radio.

Regarding the tx.py test here's what I got:

image

The application runs for a little while, producing the output shown above and then returns without errors. Btw, I am still using this image: https://www.dropbox.com/sh/5fy49wae6xwxa8a/AAAn9CYl4hCq9MShSKpn2AV-a/red-pitaya-alpine-3.9-armv7-20190422.zip?dl=1

So far I have been using a cable to directly connect the redpitaya and the laptop (tried with both straight and crossover), with a static IP on both devices, to replicate the issue.

Because you suggested installing a package, I had to connect the redpitaya to the internet. So I connected the redpitaya via ethernet cable to a network with internet access.

The interesting part is that when I connect the computer to the same network via ethernet cable the corruption is present, but if I connect the computer to the network via WiFi I don't see any manifestation of the corruption for low sample rates. Also, for the 1.25M sample rate (via WiFi) the corruption looks very different IMO:

WiFi, sample rate: 20KS/s:

image

WiFi, sample rate: 1.25MS/s:

image

Ethernet, sample rate: 20KS/s:

image

I'm having trouble making sense of these results
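
One piece of arithmetic that may help when reading the 1.25 MS/s case is the raw TX data rate. The sketch below assumes 8 bytes per sample (complex64, as in the Python scripts earlier in the thread) and that the GNU Radio sink sends the same format; it ignores TCP/IP overhead and the RX direction, and the function name is hypothetical.

```python
BYTES_PER_SAMPLE = 8  # complex64: two 32-bit floats per sample (assumption)


def required_mbit_per_s(rate, bytes_per_sample=BYTES_PER_SAMPLE):
    """Raw payload bandwidth in Mbit/s for a given sample rate."""
    return rate * bytes_per_sample * 8 / 1e6


for rate in (20000, 50000, 100000, 250000, 500000, 1250000):
    print(rate, required_mbit_per_s(rate))
```

Under these assumptions 20 kS/s needs about 1.28 Mbit/s while 1.25 MS/s needs about 80 Mbit/s, so the highest rate places very different demands on a wireless link than the lowest one. This does not by itself explain the plots.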

pavel-demin commented 5 years ago

If it works with the wireless connection and doesn't work with the wired connection, then the problem could be in the driver of the Ethernet controller.

Looks like the Ethernet controllers based on RTL8111/8168/8411 don't work well with the drivers installed by default: https://www.unixblogger.com/how-to-get-your-realtek-rtl8111rtl8168-working-updated-guide/
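
Before and after changing drivers, it can be useful to confirm which kernel driver is actually bound to the wired interface. On Linux this is exposed through sysfs as the `device/driver` symlink of the interface. The sketch below reads that link; the function name and the interface name `enp3s0` are hypothetical, and the driver names to expect (commonly `r8169` for the in-kernel driver and `r8168` for Realtek's own) are an assumption to verify on your system.

```python
import os


def bound_driver(iface, sysfs_root="/sys/class/net"):
    """Return the name of the kernel driver bound to iface, or None if unknown."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.exists(link):
        return None
    # The symlink points at the driver directory; its basename is the driver name.
    return os.path.basename(os.path.realpath(link))


# Hypothetical interface name; list /sys/class/net to find the real one.
print(bound_driver("enp3s0"))
```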

pavel-demin commented 5 years ago

Can we consider this issue resolved?

fmagno commented 5 years ago

Please keep it open. I'm currently abroad and unable to test it. I'll test it on Monday or Tuesday.

fmagno commented 5 years ago

Sorry, I had no chance to answer earlier because I got sick. The good news is that I finally managed to test it with the official Realtek drivers, following your suggestion, and IT WORKS!!

Many thanks!

pavel-demin commented 5 years ago

Thanks for confirming that the problem was in the driver of the Ethernet controller.