Nuand / bladeRF

bladeRF USB 3.0 Superspeed Software Defined Radio Source Code
http://nuand.com

bladeRF 2.0 micro dual RX/TX sample discontinuity #729

Closed karel closed 5 years ago

karel commented 5 years ago

When using the bladeRF 2.0 micro in MIMO (also just MI or MO) mode, samples go missing. This happens with the bladeRF-cli, gr-osmosdr, and SoapyBladeRF (https://github.com/pothosware/SoapyBladeRF/issues/26).

[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 2: Expected t=1019, got t=1311747200790053665
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 3: Expected t=1311747200790054173, got t=1019
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 4: Expected t=1527, got t=1311747200790053665
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 5: Expected t=1311747200790054173, got t=1527
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 6: Expected t=2035, got t=1311747200790053665
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 0, message 7: Expected t=1311747200790054173, got t=2035
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 1, message 0: Expected t=2543, got t=1311747200790053665
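
For anyone triaging similar logs, the debug lines above can be pulled apart with a short script. This is only a sketch: the regular expression follows the message format quoted in this thread, and the `is_plausible` threshold is a made-up heuristic for separating ordinary timestamp gaps from obviously bogus values, not anything libbladeRF does.

```python
import re

# Matches the libbladeRF debug message format quoted in this thread.
DISCONTINUITY = re.compile(
    r"Sample discontinuity detected @ buffer (\d+), message (\d+): "
    r"Expected t=(\d+), got t=(\d+)"
)

def parse_discontinuities(log_text):
    """Return a list of (buffer, message, expected, got) integer tuples."""
    return [tuple(int(g) for g in m.groups())
            for m in DISCONTINUITY.finditer(log_text)]

def is_plausible(timestamp, limit=1 << 48):
    """Hypothetical heuristic: treat very large timestamps as bogus."""
    return timestamp < limit

sample_log = (
    "[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] "
    "Sample discontinuity detected @ buffer 0, message 2: "
    "Expected t=1019, got t=1311747200790053665\n"
    "[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] "
    "Sample discontinuity detected @ buffer 0, message 3: "
    "Expected t=1311747200790054173, got t=1019\n"
)

events = parse_discontinuities(sample_log)
for buf, msg, expected, got in events:
    kind = "gap" if is_plausible(expected) and is_plausible(got) else "bogus timestamp"
    print(f"buffer {buf}, message {msg}: {kind}")
```

Both sample lines are classified as "bogus timestamp", because each involves the 1311747200790053665 value on one side of the comparison.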
gasparka commented 5 years ago

I am experiencing this as well.

rtucker commented 5 years ago

What version of the libbladeRF library are you using? bladeRF-cli -e version will print out the relevant info.

Thanks! -rt

karel commented 5 years ago

Built from the master branch. I tried various previous versions as well, but that did not help.

gasparka commented 5 years ago
  bladeRF-cli version:        1.7.1-git-9dd43725-dirty
  libbladeRF version:         2.2.0-git-9dd43725-dirty

  Firmware version:           2.3.2
  FPGA version:               0.10.2 (configured by USB host)

One way to reproduce this is to record a sweeping signal; there will be periodic gaps/jumps in the sweep. Thanks for looking into it!

gasparka commented 5 years ago

@rtucker we did some digging in the FPGA by inspecting the rx_timestamp values and did not see the 1311747200790054173 constant being transmitted; the values were sequential, without jumps. Thus, it looks like the bug is somewhere in libbladeRF.

karel commented 5 years ago

It seems that the sync_rx function is incompatible with the incoming buffer structure when metadata is used, which results in discontinuities. This makes it impossible to, e.g., synchronize TX and RX. This is in addition to the issues in gr-osmosdr, which initially led to the belief that samples go missing.

https://github.com/Nuand/bladeRF/blob/9dd43725c981ede12088dadfe8193365bec07bb2/host/libraries/libbladeRF/src/streaming/sync.c#L343
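
To illustrate the kind of check that produces these messages, here is a minimal sketch of a per-message timestamp continuity test. It is not the code in sync.c: the function name, the 508-samples-per-message figure (taken from the logs in this thread) and the example stream (including its starting timestamp of 511) are assumptions chosen so that the output resembles the reported pattern.

```python
SAMPLES_PER_MESSAGE = 508  # assumption, taken from the logs in this thread

def find_discontinuities(timestamps, samples_per_message=SAMPLES_PER_MESSAGE):
    """Return (index, expected, got) for each message whose timestamp
    does not follow on from the previous message."""
    issues = []
    expected = None
    for index, got in enumerate(timestamps):
        if expected is not None and got != expected:
            issues.append((index, expected, got))
        # As in the logged output, the next expectation is based on the
        # timestamp that was actually received.
        expected = got + samples_per_message
    return issues

BOGUS = 1311747200790053665  # value reported in the logs
stream = [511, BOGUS, 1019, BOGUS, 1527]  # hypothetical message timestamps

for index, expected, got in find_discontinuities(stream):
    print(f"message {index}: expected t={expected}, got t={got}")
```

With a bogus timestamp in every other message, each bad message triggers two reports (one when it arrives, one when the stream returns to normal), which matches the alternating pattern in the original log.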

KarlL2 commented 5 years ago

I realized that the issue I opened, "Strange timestamps with bladeRF 2.0", is probably the same as this one.

I don't see Sample discontinuity detected as I'm probably not in debug, and it happens even in single RX with no TX. In dual RX it seems to happen 50% of the time.

KarlL2 commented 5 years ago

Any update on this issue? Maybe "Strange timestamps with bladeRF 2.0" should be closed, if it's indeed the same issue.

robertghilduta commented 5 years ago

@KarlL2 This bug/issue is now being looked into along with a fix for it. I hope to provide an update soon.

robertghilduta commented 5 years ago

This is likely a bug in the HDL and may be related to a timing constraint. I am able to reproduce the issue; however, there does not appear to be a pattern to its manifestation. t=1311747200790053665 in hexadecimal is 0x1234432112344321. The noteworthy part is that 0x12344321 is a value that should only appear in the reserved field, yet it appears twice in the 64-bit timestamp space.
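
The hexadecimal observation is easy to check. The short sketch below splits the reported timestamp into two 32-bit words and also computes the offset to the "Expected" value that follows it in the logs; the helper name `split_words` is just for illustration.

```python
BOGUS_TS = 1311747200790053665        # "got t=" value from the logs
BOGUS_EXPECTED = 1311747200790054173  # "Expected t=" value that follows it

def split_words(value):
    """Split a 64-bit value into its (high, low) 32-bit words."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF

high, low = split_words(BOGUS_TS)
print(hex(BOGUS_TS))               # 0x1234432112344321
print(hex(high), hex(low))         # 0x12344321 0x12344321
print(BOGUS_EXPECTED - BOGUS_TS)   # 508
```

The difference of 508 equals the per-message sample count that appears elsewhere in this thread (`buf_samples=508`, `Count: 508`), so the second constant is simply the bogus timestamp plus one message's worth of samples.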

robertghilduta commented 5 years ago

@karel @KarlL2 @gasparka here are the xA4 RBF and xA9 RBF for this branch. This branch attempts to keep the RX FIFOs from getting into a situation where the meta FIFO is empty but the samples FIFO contains enough data for a GPIF transaction.

KarlL2 commented 5 years ago

I tried the xA4 RBF you provided and I don't get those timing issues, even at 55 MS/s (mono channel). Previously, I would get those jumps every other block of samples.

karel commented 5 years ago

@rghilduta, the xA4 RBF results in immediate transfer timeouts.

[bladeRF source] start: DEBUG: starting source
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:401] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:421] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:436] sync_rx: Worker is now running.
[ERROR @ host/libraries/libbladeRF/src/streaming/sync.c:306] wait_for_buffer: Timed out waiting for buf_ready after 3000 ms
[bladeRF source] work: bladerf_sync_rx error: Operation timed out
[ERROR @ host/libraries/libbladeRF/src/backend/usb/libusb.c:1083] Transfer timed out for buffer 0x7f52d0006af0
[ERROR @ host/libraries/libbladeRF/src/backend/usb/libusb.c:1083] Transfer timed out for buffer 0x7f52d000ab00

Here's the simple GNU Radio graph to test dual RX with metadata enabled bladerf_dual_RX.grc.zip.

robertghilduta commented 5 years ago

@karel and @KarlL2 there is another branch to try, with a few fixes and feature improvements. Please try the following branch. The RBFs are located here for the xA4 and xA9.

kjopek commented 5 years ago

Found similar issues with bladeRF 1 x115:

[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 8, message 26: Expected t=1311747200790054173, got t=181443577
Ts: 1311747200790053665 Count: 508
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 8, message 31: Expected t=181446117, got t=779305138806604577
Ts: 181443577 Count: 2540
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 8, message 32: Expected t=779305138806605085, got t=1311747200790053665
Ts: 779305138806604577 Count: 508
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 8, message 33: Expected t=1311747200790054173, got t=181446625
Ts: 1311747200790053665 Count: 508
Ts: 181446625 Count: 1048576
mando238 commented 5 years ago

Edit: Please see my new post with output from bladeRF-cli / libbladeRF for more info. For what it's worth, I am also experiencing this, even with the new branch (FPGA v0.11) from above, though now only when attempting multi-channel receive (receiving on both RX channels simultaneously), regardless of sample rate. It happens with Soapy and Osmocom; the problem is nearly identical to the two issues that were referenced/merged with this topic (https://github.com/Nuand/bladeRF/issues/730 https://github.com/pothosware/SoapyBladeRF/issues/26). Receiving on a single RX channel seems to work fine.

  bladeRF-cli version:        1.7.1
  libbladeRF version:         2.2.0

  Firmware version:           2.3.2
  FPGA version:               0.11.0 (configured by USB host)

The device is an A4 Micro Rev 1.3. One note of improvement: during 2-channel receive with 0.10, I could only receive 508 samples, regardless of other stream settings. With 0.11, I can get to 10240 samples before issues start appearing. I haven't experimented much with the newer branch beyond basic tests; I'll update if I find something interesting.

gorgiaxx commented 5 years ago

Similar issues with the x115:

  bladeRF-cli version:        1.7.1-2018.12-rc3-2-ppaxenial
  libbladeRF version:         2.2.0-2018.12-rc3-2-ppaxenial

  Firmware version:           2.3.2
  FPGA version:               0.1.2 (configured by USB host)

Yate shows invalid timestamps:

2019-07-10_16:20:34.301557 <ybts:NOTE> State changed WaitHandshake -> Running
Yate engine is initialized and starting up on ubuntu
2019-07-10_16:20:35.193308 <gsmtrx:NOTE> State changed Invalid -> Idle [0x7f4884025590]
2019-07-10_16:20:35.193776 <gsmtrx:NOTE> State changed Idle -> PowerOff [0x7f4884025590]
2019-07-10_16:20:35.258749 <bladerf/1:NOTE> Powered ON the radio [0x7f4884000eb0]
2019-07-10_16:20:35.265439 <gsmtrx:NOTE> State changed PowerOff -> PowerOn [0x7f4884025590]
2019-07-10_16:20:35.307966 <mbts:NOTE> GSMConfig.cpp:532:createCombinationI: Configuring combination I on C0T1
2019-07-10_16:20:35.318566 <mbts:NOTE> GSMConfig.cpp:532:createCombinationI: Configuring combination I on C0T2
2019-07-10_16:20:35.325506 <bladerf/1:MILD> RX buf_samples=508: 4 buffers: invalid timestamps (buf=ts/delta) 2=137273/1615 [0x7f4884000eb0]
2019-07-10_16:20:35.327199 <bladerf/1:MILD> RX buf_samples=508: 4 buffers: invalid timestamps (buf=ts/delta) 2=144390/1529 [0x7f4884000eb0]
2019-07-10_16:20:35.328086 <bladerf/1:MILD> RX buf_samples=508: 4 buffers: invalid timestamps (buf=ts/delta) 2=146588/674 [0x7f4884000eb0]
2019-07-10_16:20:35.348467 <bladerf/1:MILD> RX buf_samples=508: 4 buffers: invalid timestamps (buf=ts/delta) 2=178460/7996 [0x7f4884000eb0]
2019-07-10_16:20:35.350023 <bladerf/1:MILD> RX buf_samples=508: 4 buffers: invalid timestamps (buf=ts/delta) 2=192683/10667 [0x7f4884000eb0]
mando238 commented 5 years ago

With the new FPGA version, Osmocom still produces the same invalid timestamp/discontinuity errors reported by others. However, multi-channel RX causes buffer overruns regardless of stream settings with bladeRF-cli and Soapy, while single-channel RX still works fine. Increasing the number of buffers significantly (to 512+) delays the overruns, but they eventually appear. This is odd, as I've streamed from SDRs at much higher data rates on this system and never had an overrun issue before. I would like to call the overruns a separate issue, but I am beginning to wonder if they are related to whatever is causing the timestamp issue. Output below:


bladeRF> version

  bladeRF-cli version:        1.7.1
  libbladeRF version:         2.2.0

  Firmware version:           2.3.2
  FPGA version:               0.11.0 (configured by USB host)

--------------Osmocom output-----------------------------
bladerf2_initialize: complete
[bladeRF source] Device: Nuand bladeRF 2.0 Serial # 2549...b26d FW v2.3.2 FPGA v0.11.0
[bladeRF source] bladerf_common::init: Buffers: 512, samples per buffer: 4096, active transfers: 32
[bladeRF source] bladerf_source_c::set_agc_mode: DEBUG: Setting gain mode to 1 (manual)
[bladeRF source] bladerf_source_c::bladerf_source_c: DEBUG: initialization complete
[DEBUG @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1054] bladerf2_set_sample_rate: enabling 4x decimation/interpolation filters
gr::pagesize: no info; setting pagesize = 4096
[bladeRF source] bladerf_source_c::start: DEBUG: starting source
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:401] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:421] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:436] sync_rx: Worker is now running.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 34, message 3: Expected t=154046686, got t=154049825
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 39, message 3: Expected t=154059985, got t=154060637
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 116, message 3: Expected t=154217101, got t=154218093
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 117, message 3: Expected t=154220125, got t=154222332
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:565] Sample discontinuity detected @ buffer 118, message 3: Expected t=154224364, got t=154230308

........... many more of these errors until the stream is stopped

--------------bladerf-cli-------------
[VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=0, rate_accum=0, maximum=80000000
  Setting RX2 sample rate - req:   2000000 0/1Hz, actual:   2000000 0/1Hz

bladeRF> set samplerate rx 2M

[VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=0, rate_accum=0, maximum=80000000
  Setting RX1 sample rate - req:   2000000 0/1Hz, actual:   2000000 0/1Hz

bladeRF> rx config file=my_samples.csv format=csv n=20M channel=1,2 buffers=1024
bladeRF> rx

  State: Idle
  Channels: RX1, RX2
  Last error: None
  File: my_samples.csv
  File format: SC16 Q11, CSV
  # Samples: 20971520
  # Buffers: 1024
  # Samples per buffer: 32768
  # Transfers: 16
  Timeout (ms): 1000

bladeRF> rx start
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:429] nios_config_read: Read 0x00010000
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:440] nios_config_write: Wrote 0x00000000
bladeRF> [VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=1, rate_accum=2000000, maximum=80000000
[VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=2, rate_accum=4000000, maximum=80000000
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:401] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:421] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:436] sync_rx: Worker is now running.

bladeRF> rx config file=my_samples.csv format=csv n=200M channel=1,2 buffers=1024
bladeRF> rx start
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:429] nios_config_read: Read 0x00000000
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:440] nios_config_write: Wrote 0x00000000
bladeRF> [VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=1, rate_accum=2000000, maximum=80000000
[VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=2, rate_accum=4000000, maximum=80000000
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:401] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:421] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:436] sync_rx: Worker is now running.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 565
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 612
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 656
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 692
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 727
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 768
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 801
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 838
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 874
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 909
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync_worker.c:100] RX overrun @ buffer 946
...................many more of these errors

--------------------Single Channel-------------------------
bladeRF> rx config file=my_samples.csv format=csv n=20M channel=1
bladeRF> rx start
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:429] nios_config_read: Read 0x00000000
[VERBOSE @ host/libraries/libbladeRF/src/backend/usb/nios_access.c:440] nios_config_write: Wrote 0x00000000
bladeRF> [VERBOSE @ host/libraries/libbladeRF/src/board/bladerf2/common.c:339] check_total_sample_rate: active_channels=1, rate_accum=2000000, maximum=80000000
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:401] sync_rx: Worker is idle. Going to reset buf mgmt.
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:421] sync_rx: Reset buf_mgmt consumer index
[DEBUG @ host/libraries/libbladeRF/src/streaming/sync.c:436] sync_rx: Worker is now running.
robertghilduta commented 5 years ago

Please try release tag 2019.07; a good amount of the sample discontinuity issues should be resolved. This issue should be reserved for sampling discontinuities where "t=1311747200790054173" appears in the logs. Please make sure to use the fastest available storage medium, such as a solid-state drive, when using bladeRF-cli with CSV files. If the issues persist, or if you run into other problems, please open a new issue.
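
To put the storage advice in numbers, here is a rough back-of-the-envelope sketch for the dual-RX capture shown earlier (2 MS/s per channel, 1024 buffers of 32768 samples). SC16 Q11 samples are 4 bytes each (16-bit I plus 16-bit Q). The 12 bytes per CSV sample is a guess used only for illustration, and treating the per-buffer sample count as shared across both channels is also an assumption.

```python
BYTES_PER_SC16Q11_SAMPLE = 4  # 16-bit I + 16-bit Q

def raw_rate_bytes_per_s(rate_per_channel_hz, channels):
    """Raw sample data rate in bytes/s, before any file formatting."""
    return rate_per_channel_hz * channels * BYTES_PER_SC16Q11_SAMPLE

def csv_rate_bytes_per_s(rate_per_channel_hz, channels, csv_bytes_per_sample=12):
    """Approximate CSV write rate; csv_bytes_per_sample is a guess."""
    return rate_per_channel_hz * channels * csv_bytes_per_sample

def buffered_seconds(num_buffers, samples_per_buffer, aggregate_rate_hz):
    """How long the host-side buffers can absorb a stalled consumer."""
    return num_buffers * samples_per_buffer / aggregate_rate_hz

print(raw_rate_bytes_per_s(2_000_000, 2))        # 16000000
print(csv_rate_bytes_per_s(2_000_000, 2))        # 48000000
print(buffered_seconds(1024, 32768, 4_000_000))  # 8.388608
```

Under these assumptions the buffers hold roughly eight seconds of samples, so a consumer that writes CSV more slowly than the stream arrives would still overrun eventually, only later. That is consistent with the report that adding buffers delays the overruns without removing them.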