MITHaystack / digital_rf

Read, write, and interact with data in the Digital RF and Digital Metadata formats

Huge overhead #32

Open faku99 opened 3 years ago

faku99 commented 3 years ago

Hello,

We are interested in using DigitalRF in one of our professional projects, which focuses on receiving RF data over multiple 10 Gbps connections.

Beforehand, we ran some benchmarks to measure the overhead introduced by several compression/storage solutions, DigitalRF being one of them. The benchmark consists of the following steps: n TCP servers listen on n different ports, each waiting for a TCP client to connect. Once connected, each client sends random data. On reception, the server either writes the data to disk using the pwritev() function or passes it to a third-party library (i.e. DigitalRF, HDF5, zstd).
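For illustration, here is a minimal Python sketch of the two write paths being compared. The actual benchmark drives the libraries directly from the TCP servers; the dtype, sample rate, cadences, paths, and uuid below are placeholders only, not the benchmark's real configuration.

import os
import numpy as np
import digital_rf

# Placeholder buffer standing in for data received from the TCP client:
# interleaved I/Q int16 samples, shape (N, 2).
rng = np.random.default_rng(0)
data = rng.integers(-(2**15), 2**15, size=(1000000, 2), dtype=np.int16)

# Baseline path: write the raw buffer with pwritev()-style I/O (Linux).
fd = os.open("/tmp/raw_bench.bin", os.O_WRONLY | os.O_CREAT, 0o644)
os.pwritev(fd, [data.tobytes()], 0)
os.close(fd)

# Digital RF path: push the same buffer through DigitalRFWriter.
channel_dir = "/tmp/drf_bench/ch0"
os.makedirs(channel_dir, exist_ok=True)  # channel directory must exist
writer = digital_rf.DigitalRFWriter(
    channel_dir,
    dtype=np.int16,
    subdir_cadence_secs=3600,
    file_cadence_millisecs=1000,
    start_global_index=0,
    sample_rate_numerator=1000000,
    sample_rate_denominator=1,
    uuid_str="tcp-bench",
    compression_level=0,   # 0 = no compression; 1 for the zlib runs
    checksum=False,
    is_complex=True,       # (N, 2) int16 = interleaved real/imag
    num_subchannels=1,
    is_continuous=True,
)
writer.rf_write(data)
writer.close()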

Here are the results we measured:

| Setup                      | n = 1 [Gbps] | n = 4 [Gbps] |
|----------------------------|--------------|--------------|
| pwritev() (baseline)       | 19.314       | 35.481       |
| DigitalRF w/o compression  | 2.786        | 1.857        |
| DigitalRF w/ zlib          | 1.666        | 1.323        |
| DigitalRF w/ zstd          | 5.926        | 1.420        |
| HDF5 w/o compression       | 21.925       | 33.957       |
| HDF5 w/ zlib               | 24.407       | 22.431       |
| HDF5 w/ zstd               | 23.779       | 21.883       |
| zstd + pwritev()           | 4.658        | 12.276       |

Note: when compressing the data, we always used compression level 1.

As you can see, DigitalRF introduces a HUGE overhead. Moreover, the performance does not seem consistent; we observed rates ranging from 0.5 Gbps to 6 Gbps (using a single thread).

Lucas

ryanvolz commented 3 years ago

It's awesome that you're looking into this! Our internal applications don't reach anything close to these rates, so it's not something I've looked at. There's a benchmark program here, but I don't know how useful it is. We're happy to help with and accept improvements though!

My first question is about how you're interfacing with Digital RF. Are you going through the C library exclusively? That's what I'm assuming, but I want to check that we don't have to look at Python getting in the way and (most definitely) slowing things down. If we are dealing with the C library, I'll have to defer to @billrideout, who is the primary author of that portion.

@billrideout Do you have any sense of where bottlenecks might be for achieving these high data rates?

jvierine commented 12 months ago

There is a case where reading Digital RF results in slow speeds. I suspect this is because HDF5 needs to decompress one full file-sized vector each time a significantly smaller vector is read. This could be resolved by caching one file in RAM, which would avoid the need to decompress (and checksum) the data each time a small snippet is read.

Currently, the workaround is to read one full file-sized vector at a time (see the example below).

I suspect the same is going on with writes. You could try benchmarking write operations that write one full file-sized block per call; see the sketch below. If my hunch is correct, this should theoretically converge to the HDF5 write speed (plus the speed of the compression algorithm).
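A rough write-side sketch of that idea, assuming a writer configured so that one rf_write() call spans exactly one Digital RF file; the sample rate, cadences, path, and uuid are placeholders, not values from the benchmark.

import os
import time
import numpy as np
import digital_rf

sr = 1000000              # placeholder sample rate (samples/s)
file_cadence_ms = 1000    # 1 s of data per file -> file_size == sr samples
file_size = sr
n_files = 10

# The channel directory must exist before the writer is created.
channel_dir = "/tmp/drf_write_bench/ch0"
os.makedirs(channel_dir, exist_ok=True)

writer = digital_rf.DigitalRFWriter(
    channel_dir,
    dtype=np.complex64,
    subdir_cadence_secs=3600,
    file_cadence_millisecs=file_cadence_ms,
    start_global_index=0,
    sample_rate_numerator=sr,
    sample_rate_denominator=1,
    uuid_str="write-bench",
    compression_level=0,   # try 1 to include the compressor in the timing
    checksum=False,
    num_subchannels=1,
    is_continuous=True,
)

z = np.zeros(file_size, dtype=np.complex64)

t0 = time.time()
# Write one full file worth of samples per call, so each rf_write()
# maps onto exactly one HDF5 file.
for i in range(n_files):
    writer.rf_write(z)
writer.close()
t1 = time.time()
print("wrote %d file-sized blocks in %1.2f s" % (n_files, t1 - t0))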

Here is a demonstration script for the read case.

from digital_rf import DigitalRFReader
import time
import numpy as n

d = DigitalRFReader("/media/j/fee7388b-a51d-4e10-86e3-5cabb0e1bc13/isr/2023-09-05/usrp-rx0-r_20230905T214448_20230906T040054/rf_data/")
print(d.get_channels())
b = d.get_bounds("zenith-l")

ipp = 10000          # samples per inter-pulse period (the small read)
file_size = 1000000  # samples per Digital RF file

n_sec = 10           # seconds of data to read
sr = 1000000         # sample rate (samples/s)

n_samples = n_sec * sr
n_ipp = int(n_samples / ipp)

t0 = time.time()
# read one ipp at a time; each small read forces a full file to be decompressed
a = 0.0
for i in range(n_ipp):
    z = d.read_vector_c81d(b[0] + i * ipp, ipp, "zenith-l")
    a += n.sum(z)
t1 = time.time()
print("no cache time %1.2f (s)" % (t1 - t0))
print(a)

t0 = time.time()
# read one file-sized vector at a time and slice the ipps out of it
n_ipp_per_file = int(file_size / ipp)
n_files = int(n_samples / file_size)
a = 0.0
for i in range(n_files):
    # contents of one file
    z = d.read_vector_c81d(b[0] + i * file_size, file_size, "zenith-l")
    for ipi in range(n_ipp_per_file):
        z_ipp = z[(ipi * ipp):(ipi * ipp + ipp)]
        a += n.sum(z_ipp)

t1 = time.time()
print("cache file time %1.2f (s)" % (t1 - t0))
print(a)