vsonnier closed 5 years ago
Heh, I didn't create this. I only optimized the iio_channel_convert to run in a tight loop rather than be called, and I added direct copy for the (my) default case, plus some twiddling with proper gain. I have unfinished code for DMA to speed things up further. But I like your idea too. (Btw, I'm pretty nervous here about the lack of strict code style too, but I tried to keep patches very small, with a focus on functionality.)
Thanks @zuckschwerdt. Don't bother with code styling and such though, and push as you see fit. I don't know when I'll have time to test buffer management anyway, so don't wait for me. It is just frustrating that all your optimizations seem quite nullified for now because of improper buffer streaming or something.
The direct copy path reduces load noticeably, but even without it, the load of streaming on the platform itself (native on the Pluto's core) isn't excessive. It looked like a problem could be the USB setup. I test heavily with rtl_433 and noticed that smaller buffer sizes increase load considerably. Using SoapyRemote, streaming 6 Msps CS16 with a buffer size of 256k works well. I don't notice lag, drops or high load. CubicSDR on the other hand works somewhat at 2Msps but has problems with anything higher for me.
Your suggestions look worth exploring in any case and I'll toy with that too and see if I get a stable stream on CubicSDR.
(Direct copy: single channel, CS16 format, same endianness (both for RX and TX). This is the usual default format on native PlutoSDR hardware, also with SoapyRemote, and it reduces CPU usage by a factor of 4 (20% utilization to 5% utilization), since we can just memcpy when iio_channel_convert() / iio_channel_convert_inverse() are no-ops.)
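A minimal sketch of the direct-copy idea described above, in C++. It does not use libiio and is not the SoapyPlutoSDR code: copy_cs16 and its same_layout flag are hypothetical names for illustration, and the byte swap in the fallback merely stands in for whatever per-value work iio_channel_convert() would do.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy `count` interleaved I/Q samples
// (two int16_t values per sample) from a device buffer to a user buffer.
void copy_cs16(const int16_t *src, int16_t *dst, std::size_t count, bool same_layout)
{
    if (same_layout) {
        // Device layout already matches CS16 in host byte order:
        // one bulk copy replaces the per-value conversion loop.
        std::memcpy(dst, src, count * 2 * sizeof(int16_t));
        return;
    }
    // Fallback: convert value by value. A byte swap is used here only as a
    // stand-in for the real conversion (assumption for illustration).
    for (std::size_t i = 0; i < count * 2; ++i) {
        const uint16_t v = static_cast<uint16_t>(src[i]);
        dst[i] = static_cast<int16_t>(static_cast<uint16_t>((v << 8) | (v >> 8)));
    }
}
```

The fast path only applies when nothing has to change per value, which is why it is limited to the single-channel, CS16, same-endianness case.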
It looked like a problem could be the USB setup...
I have no trouble outputting 10 MHz from an RSP2 for a few % CPU on the host machine.
CubicSDR on the other hand works somewhat at 2Msps but has problems with anything higher for me.
CubicSDR works this way (see SDRThread::readStream() in src/sdr/SoapySDRThread.cpp): SoapySDR::getStreamMTU gives a size representing the number of complex samples expected to be retrieved in one SoapySDR::Device::readStream operation. Then, in an infinite loop:
To sum up, CubicSDR always fetches one MTU at a time, fast enough to build a batch worth 1/60th of a second of samples, and processes that batch.
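The summary above can be sketched roughly as follows. This is not the CubicSDR source: read_batch, ReadFn and Sample are hypothetical names, and ReadFn is a stand-in for a SoapySDR::Device::readStream call so the sketch stays self-contained.

```cpp
#include <cstddef>
#include <complex>
#include <functional>
#include <vector>

using Sample = std::complex<float>;

// Stand-in for a readStream call (assumption): fills up to `max` samples
// into `out` and returns how many were written, or <= 0 on timeout/error.
using ReadFn = std::function<int(Sample *out, std::size_t max)>;

// Read MTU-sized chunks until about 1/60th of a second of samples is collected.
std::vector<Sample> read_batch(const ReadFn &read, std::size_t mtu, double sampleRate)
{
    const std::size_t batchSize = static_cast<std::size_t>(sampleRate / 60.0);
    std::vector<Sample> batch;
    batch.reserve(batchSize + mtu);
    std::vector<Sample> chunk(mtu);
    while (batch.size() < batchSize) {
        const int n = read(chunk.data(), mtu);
        if (n <= 0) {
            break; // timeout or error: hand back what was collected so far
        }
        batch.insert(batch.end(), chunk.begin(), chunk.begin() + n);
    }
    return batch;
}
```

With this shape, each batch needs several reads whenever the MTU is smaller than sampleRate / 60, so the reads have to return quickly enough to keep up with the 60 batches per second.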
So CubicSDR will "stutter" in 2 conditions:
Your suggestions look worth exploring in any case and I'll toy with that too and see if I get a stable stream on CubicSDR.
Take care to properly reset the various buffers on sample rate changes and such; that may be the origin of the segfaults I got.
I have a question for you about the "Dual Core Hack" we see around for the PLUTO:
Currently I have the variable maxcpus not defined:
Welcome to Pluto
pluto login: root
Password:
Welcome to:
______ _ _ _________________
| ___ \ | | | / ___| _ \ ___ \
| |_/ / |_ _| |_ ___ \ `--.| | | | |_/ /
| __/| | | | | __/ _ \ `--. \ | | | /
| | | | |_| | || (_) /\__/ / |/ /| |\ \
\_| |_|\__,_|\__\___/\____/|___/ \_| \_|
v0.30
http://wiki.analog.com/university/tools/pluto
# fw_printenv maxcpus
## Error: "maxcpus" not defined
# cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 666.66
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc09
CPU revision : 0
processor : 1
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 666.66
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc09
CPU revision : 0
Hardware : Xilinx Zynq Platform
Revision : 0003
Serial : 0000000000000000
If I define it as advised, though, with fw_setenv maxcpus 1, I get only one core.
So my question is, what is the original version?
With my current "2-core" setup running CubicSDR at a 10 MHz sample rate, I get this from top:
Mem: 89328K used, 421636K free, 68K shrd, 0K buff, 44012K cached
CPU: 0% usr 14% sys 0% nic 85% idle 0% io 0% irq 0% sirq
Load average: 0.18 0.06 0.01 1/56 2573
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
744 1 root S 73940 14% 14% /usr/sbin/iiod -D -n 3 -F /dev/iio_ffs
2533 1634 root R 2928 1% 0% top
Then, if I do fw_setenv maxcpus 1, I get:
Mem: 87600K used, 423364K free, 68K shrd, 0K buff, 44012K cached
CPU: 1% usr 27% sys 0% nic 70% idle 0% io 0% irq 0% sirq
Load average: 0.05 0.01 0.00 1/50 1090
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
737 1 root S 72916 14% 28% /usr/sbin/iiod -D -n 3 -F /dev/iio_ffs
878 1 root S 2800 1% 0% /bin/sh /sbin/update.sh
1070 899 root R 2928 1% 0% top
and one core.
I take it fw_setenv maxcpus 1 is the default. Just fw_setenv maxcpus will delete the env var and the core restriction.
For what it's worth, it doesn't look too hard to pull RX samples according to this example:
Turns out the buffer_size of 512k selected by set_buffer_size_by_samplerate() does not sit well with CubicSDR. Setting it to something sane like 64k removes nearly all choppiness. See the simple tweak in zuckschwerdt/SoapyPlutoSDR@f1d82b26cb2addc2fb17ae328cffd1665592d87b
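As a rough back-of-the-envelope check of why the buffer size might matter, here is a small sketch that converts a buffer size into the amount of signal time it holds. It assumes buffer_size counts complex samples, which is not confirmed in this thread; buffer_seconds is a hypothetical helper.

```cpp
#include <cstddef>

// Seconds of signal held by one buffer of `samples` complex samples
// at `sampleRate` samples per second (assumes size is in samples).
double buffer_seconds(std::size_t samples, double sampleRate)
{
    return static_cast<double>(samples) / sampleRate;
}
```

Under that assumption, at 2 Msps a 512k buffer (524288 samples) holds about 0.26 s of signal and a 64k buffer (65536 samples) about 0.033 s, so the larger buffer would hand samples over in much coarser bursts.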
I also ripped out the refill_thread to test that, but it does not seem to make a difference for me; if you want to try, it's in zuckschwerdt/SoapyPlutoSDR@1d9a5a8bf84879652c6b6ab862a9380194360c9d
I also noticed some bugs with the allocation, resize and free of buf. I'll pull that out as a hotfix tomorrow.
Thanks @zuckschwerdt. Please see PR #14 for my increments over @zuckschwerdt's work; maybe we can push improvements together there directly?
Let's close this and simply continue in #14 to reduce spam.
Hello @cjcliffe, @guruofquality and @zuckschwerdt !
Charles and I received our new PLUTO toys and quickly discovered problems on both Linux and Windows, the first being that streaming apparently stutters heavily in CubicSDR at any sample rate, and that changing the sample rate crashes the application.
Having looked at the code alone, here are my first remarks:
- rx_streamer::refill_thread to feed readStream instead of using the iio_xxx API directly. Indeed, most if not all other SoapySDR modules directly call a pull-like, device-specific read to pull samples from the device. So?
- while(), and manual lock() and unlock() to make rx_streamer::refill_thread work cleanly with the rest. Maybe just a feeling, but...
- volatile usage instead of atomic, so it probably doesn't work as you would expect. (Short answer: it doesn't work.)
When I have spare time I will experiment by replacing the buffer management with something that already works well, like in SoapySDRPlay (and is complicated enough already), and see where it goes. Hopefully it will bug the same way for both.
Of course, I'll keep the structure and all the good stuff already there. (LUTs, direct-copy,...)
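To illustrate the volatile versus atomic remark above, here is a minimal C++11 sketch of a flag shared between a producer thread and a consumer. It is not the SoapyPlutoSDR code; RefillFlag and run_handshake are hypothetical names. The point is that std::atomic with release/acquire ordering guarantees the consumer sees the data written before the flag was set, which volatile alone does not.

```cpp
#include <atomic>
#include <thread>

// Hypothetical handshake state between a refill thread and a reader.
struct RefillFlag {
    std::atomic<bool> ready{false}; // synchronizes the two threads
    int payload = 0;                // plain data protected by `ready`
};

int run_handshake()
{
    RefillFlag f;
    std::thread producer([&f] {
        f.payload = 42;                                  // write the data first
        f.ready.store(true, std::memory_order_release);  // then publish it
    });
    // Consumer: wait until the producer has published.
    while (!f.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    const int seen = f.payload; // visible because of the acquire load above
    producer.join();
    return seen;
}
```

A real streamer would more likely block on a condition variable than spin, but the ordering guarantee is the same idea.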
And yes, throw in a ton of comments explaining stuff. And braces around ALL blocks. Because I'm a clean code nazi.
Please @zuckschwerdt, don't take those comments badly; I'm very grateful you created this!
Regards,
Vincent