MITHaystack / digital_rf

Read, write, and interact with data in the Digital RF and Digital Metadata formats

request index zero before first expected index 50 in digital_rf_write_hdf5 #13

Closed alexchartier closed 3 years ago

alexchartier commented 4 years ago

Using gr_digital_rf 2.6.3, the code now gives me an index error when trying to launch a digital_rf_sink. It appears the latest commit addresses this issue - does the bug-fix come with 2.6.3?

My code can be run using run_rx which calls odin.py from github.com/alexchartier/sounder.

ryanvolz commented 4 years ago

The most recent commit (https://github.com/MITHaystack/digital_rf/commit/0891b9b350df6360b3658f4301f511b5fcf0733e) is not in 2.6.3, but a 2.6.4 release would certainly be good. Can you test whether that commit actually fixes your issue? Because if it doesn't, then I'd want to get that fix in before a release. If it's easier than updating your whole installation, you could try just making the changes in that commit to your installed digital_rf_sink.py.

ryanvolz commented 4 years ago

Considering the description more now though, this could be something else. Your symptoms could describe trying to write data into a directory where data already exists at the start sample index (e.g. the sample index is not set based on the time or the time is wrong).
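To make the "sample index set based on the time" point concrete: Digital RF indexes samples by the number of samples elapsed since the Unix epoch at the channel's sample rate, so the starting index is the start time multiplied by the rate. A minimal sketch of that arithmetic follows. The helper name `start_sample_index` is made up for illustration, and the numbers are taken from the thor.py output posted further down this thread.

```python
from fractions import Fraction


def start_sample_index(unix_seconds, rate_num, rate_den=1):
    """Samples since the Unix epoch at the given rational sample rate."""
    # Exact rational arithmetic avoids float rounding at ~1e15 magnitudes.
    return int(Fraction(unix_seconds) * Fraction(rate_num, rate_den))


# Launch time 1585237742 s at the 500 kHz input rate, and at the
# decimated 50 kHz output rate (-r 5E5 -i 10).
print(start_sample_index(1585237742, 500000))
print(start_sample_index(1585237742, 50000))
```

If the clock is wrong, or the index is not derived from the time at all, two runs can end up with overlapping index ranges in the same directory.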

ryanvolz commented 4 years ago
Request index 0 before first expected index 50 in digital_rf_write_hdf5

I took a look in the C code. The indices referred to here are relative to the global start index provided at the start of the write. So somehow the sample counter is getting incremented by 50 (probably by writing 50 samples), but then the script is trying to write to the first (zero-index) sample again which it can't do because it already exists (or it thinks it does). How it gets in this state is the real question. Time issues would explain it, but it could be an accounting bug somewhere in the code.
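The check described above can be sketched roughly as follows. This is a toy model only, not the actual C code: the class `ToyChannel` and its attribute names are hypothetical, and only the wording of the error message is taken from the report.

```python
class ToyChannel:
    """Hypothetical stand-in for the writer's per-channel state."""

    def __init__(self, global_start_index):
        self.global_start_index = global_start_index
        self.next_expected = 0  # relative to global_start_index

    def write(self, rel_index, num_samples):
        # Reject a request that would land on samples already written.
        if rel_index < self.next_expected:
            raise RuntimeError(
                "Request index %d before first expected index %d"
                % (rel_index, self.next_expected)
            )
        self.next_expected = rel_index + num_samples
        return self.next_expected


chan = ToyChannel(global_start_index=79261887100000)
chan.write(0, 50)  # the first 50 samples are accepted
try:
    chan.write(0, 50)  # asking for relative index 0 again is rejected
except RuntimeError as err:
    message = str(err)
    print(message)
```

In this model the error only appears when the caller's idea of the next index falls behind the writer's, which is the state that needs explaining.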

alexchartier commented 4 years ago

Cleaning out the data directory did not help - I'll check out your recent bug-fix. I am re-setting the time, but I was doing that without problems before now.

alexchartier commented 4 years ago

The fix to digital_rf_sink.py did not work either. The code does create a 203k HDF5 file before dying. I see you have been busy though - there are a lot of improvements to the new Thor. I'm trying to incorporate as many of those in my version as possible before retrying. So two questions:

  1. Do you still set the time based on NTP? That's one thing I had to change for multistatic operation (needs to be GPS)

  2. Would you consider incorporating frequency-hopping into Thor? That way I can avoid re-merging all this down the road. My freq_stepper.py is very simple and could certainly be improved, but I think that capability could be useful for a range of applications where you want to monitor large parts of the spectrum but not use tons of bandwidth.

alexchartier commented 4 years ago

I have confirmed this is a bug in digital_rf, unrelated to my changes. The error can be reproduced as follows:

thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 /mnt/data

I installed gnuradio and libhdf5-dev via apt-get, and digital_rf v2.6.3 via pip. I'll try a downgrade and see what happens.

ryanvolz commented 4 years ago

> The fix to digital_rf_sink.py did not work either. The code does create a 203k HDF5 file before dying. I see you have been busy though - there are a lot of improvements to the new Thor. I'm trying to incorporate as many of those in my version as possible before retrying. So two questions:
>
> 1. Do you still set the time based on NTP? That's one thing I had to change for multistatic operation (needs to be GPS)
>
> 2. Would you consider incorporating frequency-hopping into Thor? That way I can avoid re-merging all this down the road. My freq_stepper.py is very simple and could certainly be improved, but I think that capability could be useful for a range of applications where you want to monitor large parts of the spectrum but not use tons of bandwidth.

I haven't looked in detail at your odin (love the name!), but we're happy accepting changes back into thor if they make sense for general use.

1) Nothing has changed here. The script just assumes that your computer time is reasonably accurate (say by NTP) and then sets the USRP time according to the PPS that it receives, so basically your computer just needs to be within half a second of the GPS time for the USRP time to be accurately set to the GPS time. If that's not working as intended for your application, or there's something that we can add that makes it better but doesn't put additional constraints on a user's system, we'll take it.

2) Yes, absolutely. It's something I've thought about putting in many times but never had enough available time to do. Admittedly, again, I haven't looked at your implementation, but go ahead and send a pull request and I'll review it.
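The "within half a second" condition from point 1 can be illustrated with a small sketch. This is a simplified model of the idea, not the code in thor.py, and `usrp_time_for_next_pps` is a hypothetical helper name: the host clock is read just after a PPS edge, rounded to the nearest whole second, and the following second is what gets loaded at the next PPS.

```python
import math


def usrp_time_for_next_pps(host_time_at_pps_edge):
    """Whole second to load into the USRP at the *next* PPS edge.

    Assumes the host clock is within 0.5 s of GPS time, so rounding the
    host time read just after a PPS edge recovers the true second.
    """
    second_at_edge = math.floor(host_time_at_pps_edge + 0.5)
    return second_at_edge + 1


# A host clock that is 0.3 s fast or 0.4 s slow resolves to the same second.
print(usrp_time_for_next_pps(1585235186.3))
print(usrp_time_for_next_pps(1585235185.6))
```

Once the host error exceeds half a second, the rounding picks the wrong second and the USRP time is off by a whole second, which is why a rough NTP sync is assumed.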

ryanvolz commented 4 years ago

> I have confirmed this is a bug in digital_rf, unrelated to my changes. The error can be reproduced as follows:
>
> thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 /mnt/data
>
> I installed gnuradio and libhdf5-dev via apt-get, and digital_rf v2.6.3 via pip. I'll try a downgrade and see what happens.

Ok, this is good progress as it will give me something to test. My only issue is that it's harder for me to access hardware right now. So to be clear, this happens using that command with an empty /mnt/data/innisfree directory after writing part of an HDF5 file? Can you easily link me to the data that gets output, and maybe the text that thor.py spits out before crashing?

alexchartier commented 4 years ago

pi@raspberrypi:~ $ rm -rf innisfree/*
pi@raspberrypi:~ $ thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 ./

Main boards: ['addr=192.168.10.2']
Subdevices: ['A:A']
Clock rates: [None]
Clock sources: ['']
Time sources: ['']
Sample rate: 500000.0
Device arguments: ['recv_buff_size=100000000', 'num_recv_frames=512']
Stream arguments: []
Tune arguments: []
Antenna: ['']
Bandwidth: [0]
Frequency: [5000000.0]
LO frequency offset: [0]
LO source: ['']
LO export: [None]
Gain: [0]
DC offset: [False]
IQ balance: [None]
Output channels: [0]
Output channel names: ['innisfree']
Output sample rate: [50000.0]
Output frequency: [False]
Output scaling: [1.0]
Output subchannels: [1]
Output type: ['sc16']
Data dir: ./
Metadata: {}
UUID: None
Local time: Thu 2020-03-26 11:06:19 EDT
Universal time: Thu 2020-03-26 15:06:19 UTC
RTC time: n/a
Time zone: America/New_York (EDT, -0400)
System clock synchronized: no
NTP service: active
RTC in local TZ: no
[INFO] [UHD] linux; GNU C++ version 8.2.0; Boost_106700; UHD_3.13.1.0-3
[INFO] [USRP2] Opening a USRP2/N-Series device...
[INFO] [USRP2] Current recv frame size: 1472 bytes
[INFO] [USRP2] Current send frame size: 1472 bytes
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[INFO] [USRP2] Detecting internal GPSDO....
[INFO] [GPS] Found an internal GPSDO: Jackson-Labs, FireFly , Firmware Rev 0.929
[INFO] [USRP2] Setting references to the internal GPSDO
Waiting for reference lock...locked
Using the following devices:
---- receiver channel 0 ------------------------------------------------------
Motherboard: N210r4 (192.168.10.2) | Daughterboard: LFRX (A)
Subdev: A:A | Antenna: | Gain: 0.0 | Rate: 500000.0
Frequency: 5000000.005 (-5000000.005) | Bandwidth: 32000000.0

Latching at 1585235186.21
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
Launch time: Thu Mar 26 15:06:30.000000 2020 (1585235190.0)
.Request index 0 before first expected index 50 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/home/pi/.local/lib/python2.7/site-packages/gr_digital_rf/digital_rf_sink.py", line 563, in work
    data_blk_idxs,
RuntimeError: Failed to write data

done

alexchartier commented 4 years ago

above is from 2.6.2 (unable to install 2.6.1 via pip). With 2.6.3, the error is as follows:

pi@raspberrypi:~ $ thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 ./
Main boards: ['addr=192.168.10.2']
Subdevices: ['A:A']
Clock rates: [None]
Clock sources: ['']
Time sources: ['']
Sample rate: 500000.0
Device arguments: ['recv_buff_size=100000000', 'num_recv_frames=512']
Stream arguments: []
Tune arguments: []
Antenna: ['']
Bandwidth: [0]
Frequency: [5000000.0]
LO frequency offset: [0]
LO source: ['']
LO export: [None]
Gain: [0]
DC offset: [False]
IQ balance: [None]
Output channels: [0]
Output channel names: ['innisfree']
Output sample rate: [50000.0]
Output frequency: [False]
Output scaling: [1.0]
Output subchannels: [1]
Output type: ['sc16']
Data dir: ./
Metadata: {}
UUID: None
Local time: Thu 2020-03-26 11:48:51 EDT
Universal time: Thu 2020-03-26 15:48:51 UTC
RTC time: n/a
Time zone: America/New_York (EDT, -0400)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
[INFO] [UHD] linux; GNU C++ version 8.2.0; Boost_106700; UHD_3.13.1.0-3
[INFO] [USRP2] Opening a USRP2/N-Series device...
[INFO] [USRP2] Current recv frame size: 1472 bytes
[INFO] [USRP2] Current send frame size: 1472 bytes
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[INFO] [USRP2] Detecting internal GPSDO....
[INFO] [GPS] Found an internal GPSDO: Jackson-Labs, FireFly , Firmware Rev 0.929
[INFO] [USRP2] Setting references to the internal GPSDO
Waiting for reference lock...locked
Using the following devices:
---- receiver channel 0 ------------------------------------------------------
Motherboard: N210r4 (192.168.10.2) | Daughterboard: LFRX (A)
Subdev: A:A | Antenna: | Gain: 0.0 | Rate: 500000.0
Frequency: 5000000.005 (-5000000.005) | Bandwidth: 32000000.0

Latching at 1585237737.2
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
Launch time: Thu Mar 26 15:49:02.000000 2020 (1585237742.0)
Sample index: 792618871000000

|innisfree|rx_time tag @ sample 23: 1585237742+2.22e-06 (79261887100000)..Request index 0 before first expected index 50 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/home/pi/.local/lib/python2.7/site-packages/gr_digital_rf/digital_rf_sink.py", line 594, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data

done

alexchartier commented 4 years ago

innisfree.zip

alexchartier commented 4 years ago

Not sure how to do a pull request - it's asking me to select two branches to compare. I made a separate repository so it doesn't quite make sense to compare them. But odin.py is basically thor.py with the addition of GPS time synchronization and frequency-stepping based on a file instead of launching to one frequency. The frequency stepping happens in freq_stepper.py.

ryanvolz commented 4 years ago

> Not sure how to do a pull request - it's asking me to select two branches to compare. I made a separate repository so it doesn't quite make sense to compare them. But odin.py is basically thor.py with the addition of GPS time synchronization and frequency-stepping based on a file instead of launching to one frequency. The frequency stepping happens in freq_stepper.py.

A pull request would need to be from a fork of the digital_rf repository. You could fork through Github, clone to your local machine, create a new branch, and make the necessary changes to thor.py based on what you have in odin.py. Then push the changes up to Github and initiate the pull request from your branch with the changes.

I'm thinking that the bug has to do with time tags getting messed up when the stream goes through decimation. I haven't gotten to reproducing it just yet, but once I do I'll start digging into that section of the code.

alexchartier commented 4 years ago

So, when -i 10 is removed, we get a slightly different error (though it still seems to be missing the first 500 samples, just not decimating them now):

Sample index: 792654763500000

|innisfree|rx_time tag @ sample 0: 1585309527+2.22e-06 (792654763500001) 1 dropped samples..Request index 0 before first expected index 501 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/home/pi/.local/lib/python2.7/site-packages/gr_digital_rf/digital_rf_sink.py", line 594, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data

ryanvolz commented 4 years ago

I haven't been able to reproduce this yet, and it doesn't help that I can't get an N200 to test with. I started some cleanups in pull request #14, and two of those commits would provide some useful debugging output for your situation. Can you test by running from that branch on my GitHub? If that's too much, you could try to make the changes to digital_rf_sink.py in the two relevant commits: https://github.com/MITHaystack/digital_rf/pull/14/commits/936d48746085a4c86bac9569e033a6ffd5755bdf and https://github.com/MITHaystack/digital_rf/pull/14/commits/063a993753bfc735ce17fa37bc4bf1846110f8b6. Those might provide me with more information about what is happening.

alexchartier commented 4 years ago

Ryan,

I pulled your branch and followed the install instructions. First thing to note is my thor.py binary did not get replaced. Even after mkdir build; cd build; cmake ..; make; sudo make install; the old pip executable remained:

ls -hlrt /home/pi/.local/bin/thor.py
-rwxr-xr-x 1 pi pi 68K Mar 26 11:48 /home/pi/.local/bin/thor.py

I tried building/installing the examples but no change there.

So I went to python/tools and ran thor from there:

python thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 /mnt/data

Sample index: 793268104000000
Traceback (most recent call last):
  File "thor.py", line 1839, in <module>
    args.func(args)
  File "thor.py", line 1833, in _run_thor
    thor.run(**runopts)
  File "thor.py", line 1061, in run
    debug=op.verbose,
TypeError: __init__() got an unexpected keyword argument 'stop_on_time_tag'

So, commented out stop_on_time_tag argument and got back to the start:

Sample index: 793268225500000

|innisfree|rx_time tag @ sample 23: 1586536451+2.22e-06 (79326822550000)..Request index 0 before first expected index 50 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/home/pi/.local/lib/python2.7/site-packages/gr_digital_rf/digital_rf_sink.py", line 594, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data

done

ryanvolz commented 4 years ago

From the traceback, it's still running the 2.6.3 version of digital_rf_sink.py that you have installed in /home/pi/.local. The git version probably installed in /usr/local, but the old installation in /home/pi/.local has precedence. If you pip uninstall it, you might be able to get it to find the git version (and run thor.py from /usr/local/bin by default as well).

As a last resort, you can copy the digital_rf_sink.py from the source directory over the version in /home/pi/.local/lib/python2.7/site-packages/gr_digital_rf.
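A quick way to check which copy is being picked up is to ask Python where it would load the package from. The sketch below uses Python 3's `importlib.util.find_spec`, and `locate_module` is just an illustrative helper name. On the Python 2.7 setup in this thread, `import gr_digital_rf; print(gr_digital_rf.__file__)` should give the same information.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# 'gr_digital_rf' is the package from the traceback; this prints None
# on a machine where it is not installed.
print(locate_module("gr_digital_rf"))

# Earlier sys.path entries win when a package exists in two places.
for entry in sys.path:
    print(entry)
```

If the printed path is under /home/pi/.local, the pip-installed copy is still shadowing the one built from git.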

alexchartier commented 4 years ago

it seems to be running on Python 3 now. Not sure why it is not finding gnuradio:

pi@raspberrypi:~ $ thor.py
Traceback (most recent call last):
  File "/usr/local/bin/thor.py", line 27, in <module>
    import gr_digital_rf as gr_drf
  File "/usr/local/lib/python3.7/dist-packages/gr_digital_rf/__init__.py", line 3, in <module>
    from .digital_rf_source import digital_rf_source, digital_rf_channel_source
  File "/usr/local/lib/python3.7/dist-packages/gr_digital_rf/digital_rf_source.py", line 15, in <module>
    import gnuradio.blocks
ModuleNotFoundError: No module named 'gnuradio'

pi@raspberrypi:~/digital_rf/build $ which thor.py
/usr/local/bin/thor.py

pi@raspberrypi:~ $ sudo apt-get install gnuradio
Reading package lists... Done
Building dependency tree
Reading state information... Done
gnuradio is already the newest version (3.7.13.4-4+b1).
0 upgraded, 0 newly installed, 0 to remove and 88 not upgraded.

ryanvolz commented 4 years ago

So now we need to get it to install for Python 2.7 (GNU Radio 3.7 only supports Python 2.7). When you run cmake, try adding -DPython_EXECUTABLE=/usr/bin/python2 to the command so it becomes cmake -DPython_EXECUTABLE=/usr/bin/python2 .. when run from the build directory.

alexchartier commented 4 years ago

OK, so:

pi@raspberrypi:~/digital_rf/build $ thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 /mnt/data
Main boards: ['addr=192.168.10.2']
Subdevices: ['A:A']
Clock rates: [None]
Clock sources: ['']
Time sources: ['']
Sample rate: 500000.0
Device arguments: ['recv_buff_size=100000000', 'num_recv_frames=512']
Stream arguments: []
Tune arguments: []
Antenna: ['']
Bandwidth: [0]
Frequency: [5000000.0]
LO frequency offset: [0]
LO source: ['']
LO export: [None]
Gain: [0]
DC offset: [False]
IQ balance: [None]
Output channels: [0]
Output channel names: ['innisfree']
Output sample rate: [50000.0]
Output frequency: [False]
Output scaling: [1.0]
Output subchannels: [1]
Output type: ['sc16']
Data dir: /mnt/data
Metadata: {}
UUID: None
Local time: Fri 2020-04-10 14:50:54 EDT
Universal time: Fri 2020-04-10 18:50:54 UTC
RTC time: n/a
Time zone: America/New_York (EDT, -0400)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
[INFO] [UHD] linux; GNU C++ version 8.2.0; Boost_106700; UHD_3.13.1.0-3
[INFO] [USRP2] Opening a USRP2/N-Series device...
[INFO] [USRP2] Current recv frame size: 1472 bytes
[INFO] [USRP2] Current send frame size: 1472 bytes
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[INFO] [USRP2] Detecting internal GPSDO....
[INFO] [GPS] Found an internal GPSDO: Jackson-Labs, FireFly , Firmware Rev 0.929
[INFO] [USRP2] Setting references to the internal GPSDO
Waiting for reference lock...locked
Using the following devices:
---- receiver channel 0 ------------------------------------------------------
Motherboard: N210r4 (192.168.10.2) | Daughterboard: LFRX (A)
Subdev: A:A | Antenna: | Gain: 0.0 | Rate: 500000.0
Frequency: 5000000.005 (-5000000.005) | Bandwidth: 32000000.0

Latching at 1586544660.2
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
Launch time: Fri Apr 10 18:51:04.000000 2020 (1586544664.0)
Sample index: 793272332000000

|innisfree|rx_time tag @ sample 23: 1586544664+2.22e-06 (79327233200000)..Request index 0 before first expected index 50 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gr_digital_rf/digital_rf_sink.py", line 607, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data

done

alexchartier commented 4 years ago

Previously a similar problem was cured by this action:

in gnuradio/gr-uhd/lib/usrp_source_impl.cc, comment out line 115: _tag_now = true

But I can't do that with the apt-get gnuradio.

ryanvolz commented 4 years ago

This most recent output still looks like 2.6.3, sadly. It doesn't have any of the debug prints that I added, and the line number in the traceback doesn't match what it should. Did you checkout the start_sample_fixes branch before building/installing? That's my best guess.

alexchartier commented 4 years ago

No I didn't. Here's the version with that:

pi@raspberrypi:~/digital_rf/build $ thor.py -m 192.168.10.2 -d "A:A" -c innisfree --type 'sc16' -f 5E6 -r 5E5 -i 10 /mnt/data
Main boards: ['addr=192.168.10.2']
Subdevices: ['A:A']
Clock rates: [None]
Clock sources: ['']
Time sources: ['']
Sample rate: 500000.0
Device arguments: ['recv_buff_size=100000000', 'num_recv_frames=512']
Stream arguments: []
Tune arguments: []
Antenna: ['']
Bandwidth: [0]
Frequency: [5000000.0]
LO frequency offset: [0]
LO source: ['']
LO export: [None]
Gain: [0]
DC offset: [False]
IQ balance: [None]
Output channels: [0]
Output channel names: ['innisfree']
Output sample rate: [50000.0]
Output frequency: [False]
Output scaling: [1.0]
Output subchannels: [1]
Output type: ['sc16']
Data dir: /mnt/data
Metadata: {}
UUID: None
Local time: Fri 2020-04-10 15:39:42 EDT
Universal time: Fri 2020-04-10 19:39:42 UTC
RTC time: n/a
Time zone: America/New_York (EDT, -0400)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
[INFO] [UHD] linux; GNU C++ version 8.2.0; Boost_106700; UHD_3.13.1.0-3
[INFO] [USRP2] Opening a USRP2/N-Series device...
[INFO] [USRP2] Current recv frame size: 1472 bytes
[INFO] [USRP2] Current send frame size: 1472 bytes
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[WARNING] [UDP] The recv buffer could not be resized sufficiently. Target sock buff size: 100000000 bytes. Actual sock buff size: 50000000 bytes. See the transport application notes on buffer resizing. Please run: sudo sysctl -w net.core.rmem_max=100000000
[INFO] [USRP2] Detecting internal GPSDO....
[INFO] [GPS] Found an internal GPSDO: Jackson-Labs, FireFly , Firmware Rev 0.929
[INFO] [USRP2] Setting references to the internal GPSDO
Waiting for reference lock...locked
Using the following devices:
---- receiver channel 0 ------------------------------------------------------
Motherboard: N210r4 (192.168.10.2) | Daughterboard: LFRX (A)
Subdev: A:A | Antenna: | Gain: 0.0 | Rate: 500000.0
Frequency: 5000000.005 (-5000000.005) | Bandwidth: 32000000.0

Latching at 1586547588.21
[INFO] [MULTI_USRP] 1) catch time transition at pps edge
[INFO] [MULTI_USRP] 2) set times next pps (synchronously)
Launch time: Fri Apr 10 19:39:52.000000 2020 (1586547592.0)
Sample index: 793273796000000

|innisfree|start @ sample 0: 70791+0.99954 (79327379599977)
|innisfree|rx_time tag @ sample 23: 1586547592+2.22e-06 (79327379600000)..Request index 0 before first expected index 50 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gr_digital_rf/digital_rf_sink.py", line 620, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data


Writing data failed in 'rf_block_write' with state:
self._start_sample = 79327379599977
self._next_rel_sample = 0
and inputs (in_data, data_rel_samples, data_blk_idxs):
data_rel_samples = [0]
data_blk_idxs = [0]
in_data.dtype = [('r', '<i2'), ('i', '<i2')]
in_data.shape = (8650,)
in_data[:10] = [(0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0)]


done

ryanvolz commented 4 years ago

This does give me something to go on, although I'm still stumped for now.

alexchartier commented 4 years ago

OK thanks for looking into this. As I said above, I think we hit something like this problem before and Juha fixed it with that commented out line in usrp_source_impl.cc. But I don't understand the working of all this - the error is in digital_rf_sink and I thought "sink" was for transmitting. In any case, I think when the receiver sets itself up, you get a few samples recorded, and then it sets itself back to zero before launching, and that's when things go wrong.

KCollins commented 4 years ago

"Who were you, DenverCoder9? What did you see?!?"

I'm running into the same issue with an RTL-SDR and a Raspberry Pi, using GNURadio Companion to pipe from the RTL into a DigitalRF sink. (I started out from this Raspberry Pi image: https://pisdr.luigi.ltd/)

It appears to record ten samples and then crash. Console output is below; any advice is appreciated.


Generating: '/home/pi/Documents/top_block.py'

Executing: /usr/bin/python -u /home/pi/Documents/top_block.py

gr-osmosdr 0.1.5 (0.1.5) gnuradio 3.7.14.0
built-in source types: file fcd rtl rtl_tcp hackrf rfspace airspy soapy redpitaya
Found Rafael Micro R820T tuner
Using device #0 Realtek RTL2838UHIDIR SN: 00000001
Found Rafael Micro R820T tuner
[R82XX] PLL not locked!
Invalid sample rate: 32000 Hz
.Request index 0 before first expected index 4096 in digital_rf_write_hdf5
Traceback (most recent call last):
  File "/home/pi/.local/lib/python2.7/site-packages/gr_digital_rf/digital_rf_sink.py", line 620, in work
    self._Writer._channelObj, in_data, data_rel_samples, data_blk_idxs
RuntimeError: Failed to write data


Writing data failed in 'rf_block_write' with state:
self._start_sample = 51129970123306
self._next_rel_sample = 0
and inputs (in_data, data_rel_samples, data_blk_idxs):
data_rel_samples = [0]
data_blk_idxs = [0]
in_data.dtype = complex64
in_data.shape = (4064,)
in_data[:10] = [-0.00312501-0.01093751j 0.00468749+0.00468749j -0.01093751+0.00468749j -0.01093751-0.00312501j 0.00468749-0.01093751j 0.00468749+0.00468749j -0.00312501+0.01249999j -0.01093751-0.00312501j 0.00468749-0.00312501j 0.00468749+0.00468749j]


ryanvolz commented 4 years ago

Thanks for the report! This should be helpful in tracking down the issue, since it means it's not related to UHD or the thor.py script. Commonalities so far include the Raspberry Pi and GNU Radio 3.7 (necessitating Python 2.7).

The next two days are busy for me, but I'll plan on taking another look on Friday.

ryanvolz commented 4 years ago

I found some questionable integer typing in the C code that I could imagine impacting 32-bit builds. Is that what you are using with the Pi? Are there any warnings when you build that might shed some light? I made some corrections in #19 that could help if that was indeed the issue, but I'm not confident that is actually the problem.

The key bit that I can't figure out behavior-wise happens at https://github.com/MITHaystack/digital_rf/blob/master/python/gr_digital_rf/digital_rf_sink.py#L619-L621. The rf_block_write method should return the index of the next sample to be written and set it to self._next_rel_sample. One block of data is written before the error, so that value should be some positive value, e.g. 4096. But when the second chunk of data is written, we can see from your error message that self._next_rel_sample is still 0 but it should be 4096 according to the C code. So somehow that value is not getting returned and stored correctly back in Python-land. The fixes in #19 might help with getting that value back correctly into Python with 32-bit architectures. :crossed_fingers:
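The bookkeeping described above can be sketched as follows. This is a toy illustration rather than the real digital_rf_sink.py code: `run_sink`, `good_write`, and `lossy_write` are hypothetical names, and `write_block` only stands in for the rf_block_write call, which is assumed to return the relative index following the chunk it wrote.

```python
def run_sink(write_block, chunk_sizes):
    """Feed chunks to write_block, tracking the next relative index.

    write_block(rel_index, n) is a hypothetical stand-in for the
    rf_block_write call; it should return the index after the chunk.
    """
    next_rel_sample = 0
    requested = []
    for n in chunk_sizes:
        requested.append(next_rel_sample)
        next_rel_sample = write_block(next_rel_sample, n)
    return requested


def good_write(rel_index, n):
    return rel_index + n  # the value arrives back intact


def lossy_write(rel_index, n):
    return 0  # simulates the return value being lost on the way back


print(run_sink(good_write, [4096, 4064]))   # second request starts at 4096
print(run_sink(lossy_write, [4096, 4064]))  # second request starts at 0 again
```

The second case matches the reported state, where the second chunk is requested at relative index 0 while the writer already expects 4096. The sketch only reproduces the symptom and says nothing about why the value gets lost.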

alexchartier commented 4 years ago

I think I got the same problem with a 64-bit laptop. I don't have the Pi or laptop setups to hand any more, but pretty sure it was the same behavior from stock Thor for me. When I get a N200/210 back I'll try with a new build.

ryanvolz commented 3 years ago

This should be fixed now, and the fix will appear in the soon-to-be-released 2.6.6 version.