aicodix / modem

Simple OFDM modem for transceiving datagrams
BSD Zero Clause License

decode, computing efficiency #9

Open ea4gmz opened 4 months ago

ea4gmz commented 4 months ago

Hello, this is not really an issue or a bug, but a question. I can run the decode command on a Linux computer with an i5 CPU; the computing time is 50 ms. I have compiled your program on a Raspberry Pi 1 Model B with a BCM2835 (ARM1176JZF-S, 700 MHz). There, decode takes exactly 10 seconds to process an audio file. I must say the program works and provides a nice result. Do you think this is the time it must reasonably take, or could it be improved, maybe with a different compiling method? It puzzles me that the RPi is 200 times slower than a regular computer. The point of using an RPi is to run a digipeater and HTTP gateway non-stop, instead of using a desktop PC. Thank you and best regards.

xdsopl commented 4 months ago

That CPU does not have NEON, which then needs to be emulated for the Polar list decoder. It also has very little cache. You could try to lower the list size or replace list decoding with normal decoding and also lower the OSD decoder order or replace with RS decoding but this all will reduce receiver performance by a large margin. Instead maybe you should get a more modern CPU with NEON, like the one in the Raspberry Pi 400: It only needs 150 ms to decode Rattlegram messages with the untouched code from the short branch and in only 20 ms when lowering the OSD order to 3 instead of the default 4.
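To check whether a given build can use NEON at all, one option is to ask the compiler which SIMD extension it was allowed to target. The following is a minimal sketch, not code from this repository: the function name `simd_target` is hypothetical, and it relies only on the predefined macros `__ARM_NEON`, `__AVX2__` and `__SSE4_1__` that GCC and Clang set according to the `-march`/`-mfpu` options in effect.

```cpp
#include <string>

// Hypothetical helper (not part of aicodix/modem): reports which SIMD
// extension the compiler was allowed to target for this translation unit.
inline std::string simd_target()
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
	return "NEON";
#elif defined(__AVX2__)
	return "AVX2";
#elif defined(__SSE4_1__)
	return "SSE4.1";
#else
	// No vector extension enabled: SIMD types must be emulated in scalar code.
	return "none (scalar fallback)";
#endif
}
```

Printing the result of `simd_target()` from a small test program built with the same compiler flags as decode shows whether the build falls into the scalar branch, which would be the expected case on an ARM1176JZF-S.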

xdsopl commented 4 months ago

You should probably play with the OSD ORDER first and set it to two: CODE::OrderedStatisticsDecoder<255, 71, ORDER> osddec;

And then change the list SIZE to maybe four: typedef SIMD<code_type, SIZE> mesg_type;

See if that helps.
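To get a feeling for why the OSD order matters so much, here is a rough sketch that counts the candidate error patterns of a textbook order-ORDER ordered statistics decoder, which tries every pattern of weight 0 to ORDER over the K most reliable positions. This assumes the textbook formulation; the decoder in the CODE repository may prune or organize the search differently. The names `binomial` and `osd_candidates` are made up for this illustration.

```cpp
#include <cstdint>

// Binomial coefficient C(n, k). Every intermediate value is itself a
// binomial coefficient, so the integer division is always exact.
constexpr std::uint64_t binomial(unsigned n, unsigned k)
{
	std::uint64_t result = 1;
	for (unsigned i = 1; i <= k; ++i)
		result = result * (n - k + i) / i;
	return result;
}

// Number of candidate error patterns of weight 0..order over k positions,
// as tested by a textbook OSD (assumption: no pruning or early exit).
constexpr std::uint64_t osd_candidates(unsigned k, unsigned order)
{
	std::uint64_t total = 0;
	for (unsigned w = 0; w <= order; ++w)
		total += binomial(k, w);
	return total;
}
```

For K = 71 this gives 1031347 candidates at order 4, 59712 at order 3 and 2557 at order 2, so each step down in order cuts the search by roughly a factor of 17 to 23 under this assumption.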

ea4gmz commented 4 months ago

Hello Ahmet, thanks for your prompt reply. I will try that. What file should I edit? Regards


xdsopl commented 4 months ago

decode.cc of course ;-)

ea4gmz commented 4 months ago

Hi again, I recompiled with your instructions. Now decode runs very fast:

time ./decode /dev/null encoded.wav
symbol pos: 2326
coarse cfo: 1500 Hz
oper mode: 14
call sign: ANONYMOUS
demod .... done
coarse sfo: 0.0669286 ppm
finer cfo: 1500 Hz
Es/N0 (dB): 30.9216 29.7801 30.3704 31.0269
bit flips: 0

real    0m0.233s
user    0m0.179s
sys     0m0.022s

This is on a Raspberry Pi 1, without the NEON instruction set. But I don't know how robust it will be. I will test it in a real RF scenario.


xdsopl commented 4 months ago

keep me posted

ea4gmz commented 4 months ago

Hello,

My findings so far.

As I posted above, decode modified with your notes now runs very fast on an RPi 1. I wanted to compare how robust the modified version is, and planned to record signals and have them processed by the two decode versions. I wanted to do that on a computer, as it runs faster. Now I can see that the modified program ends with a segmentation fault after showing the Es/N0 data. This only happens on the PC; it runs fine on the RPi. I also tried downloading the latest version of "modem" and compiling it again, unmodified, and it also gives a segfault on the PC. But it works on the RPi.

Summary:
modem code, weeks old, unmodified. PC: ok. RPi 1: ok, but takes 10 s to decode.
modem code, weeks old, modified. PC: seg fault. RPi 1: ok, fast, robustness not yet compared.
modem code, cloned today, unmodified. PC: seg fault. RPi 1: ok, but takes 10 s to decode.

I can provide more detailed information like logs, scripts, etc. Regards

xdsopl commented 3 months ago

You should also update the code repository. I made a lot of SIMD related changes lately that might cause the problems you see. If that does not help, please add more info here, and please give me the output of: grep -m 1 flags /proc/cpuinfo

ea4gmz commented 3 months ago

Hello. The modified program runs fast on a Raspberry Pi 1, and the sensitivity, or ability to decode weak signals, seems just as good as with the original program. I am doing more experiments. I plan to set up a permanent digipeater and RF to HTTP gateway running on this old RPi. Regards

xdsopl commented 3 months ago

That's good to hear. I did more improvements in the DSP and CODE repositories and would be interested to know how well they work on the old pi as well.

LieBtrau commented 1 month ago

Hi, Maybe slightly off-topic, but I've tested decoding on an ESP32: rattlegram-openmodem. It works but it's very slow. It takes about 37s to decode a 512 byte packet (next-branch).
I went here to find some answers. Your suggestions yield dramatic improvements.
Decoding the 48 kHz sample (512 bytes) now takes 6.4 s instead of 37 s. When reducing the sample rate to 8 kHz, decoding only takes 1.5 s. It's still not real time (which is what I had hoped for), but it's maybe good enough.