Scope for optimization - Githubissues

fifteenhex commented 7 years ago

Not really an issue so much as wanting to reach out and maybe get involved if possible.

Long story short, I'm trying to put together a package to build a LoRaWAN gateway using cheaper parts than the Semtech gateway chips. I'm aiming for a system built on a cheap ARM board, with an RTL SDR for receiving and an SX1276 module for transmitting.

Currently a common quad-core 1.2 GHz Cortex-A7 board (OrangePi Zero) seems to be totally overwhelmed trying to decode a single channel and a single spreading factor. A 2 GHz quad A7 + quad A15 board (Odroid XU4) seems to just about be able to keep up.

For LoRaWAN I think at least 2 channels and multiple spreading factors need to be decoded in real time. Is this in the realm of possibility? What areas do you think are obvious targets for optimization? I'm not too familiar with gnuradio so this might be a stupid question, but would it be possible to decode multiple SFs in parallel using multiple cores?

rpp0 commented 7 years ago

Hi, thanks for reaching out! I believe that decoding multiple channels simultaneously on a Pi is definitely within the realm of possibilities, and would require only minor changes to the code:

The current "internal" sampling rate of the decoder is 1 Msps, whereas a sampling rate of 125 to 250 ksps could theoretically be used, depending on the bandwidth. Consequently, there are also some resampling / decimation steps that could be removed. All this would significantly lower the required processing power per channel.
A channelizer must be developed to feed each channel to a separate LoRa Receiver block. I made something similar for the gr-gsm project, and it should be easy to port to gr-lora.
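
As a rough illustration of the channelizer idea (not the gr-gsm or gr-lora implementation), the sketch below mixes each channel down to 0 Hz, low-pass filters it with a plain moving average, and decimates 1 Msps to 125 ksps. The function name `channelize`, the two centre frequencies, and the boxcar filter are assumptions chosen for brevity; a real receiver would want a proper filter, for example a polyphase filter bank.

```python
import cmath
import math

def channelize(samples, sample_rate, centers_hz, decim):
    """Split a wideband complex stream into one decimated stream per channel."""
    out = {}
    for fc in centers_hz:
        # Mix the channel centre down to 0 Hz.
        step = -2.0 * math.pi * fc / sample_rate
        mixed = [s * cmath.exp(1j * step * n) for n, s in enumerate(samples)]
        # Crude low-pass (boxcar average) fused with decimation.
        out[fc] = [
            sum(mixed[i:i + decim]) / decim
            for i in range(0, len(mixed) - decim + 1, decim)
        ]
    return out

# Hypothetical setup: 1 Msps input, two channels 200 kHz apart, output at 125 ksps.
FS = 1_000_000
CENTERS = (-100_000, 100_000)
# Test input: a pure tone sitting exactly on the second channel centre.
tone = [cmath.exp(2j * math.pi * 100_000 * n / FS) for n in range(800)]
streams = channelize(tone, FS, CENTERS, decim=8)
```

Each entry of `streams` could then be fed to its own receiver instance, which only has to process one eighth of the original sample count.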

I haven't focused on this optimization yet because I am currently prioritizing some issues regarding the decoding of higher SFs. If you want to get involved and help out with the decoding of multiple channels, that would be great! I can send you the traces that we used for regression testing of the decoder.

As for decoding multiple SFs simultaneously: I believe this would be a challenge and I haven't given it a lot of thought yet. It should be possible, but perhaps it is too computationally expensive for a single board.

fifteenhex commented 7 years ago

Thanks for getting back to me.

I think the bandwidth bit might be something I'll look at first. The Odroid XU4 has no problems keeping up with the RTL SDR at 1 Msps according to the rtl_test utility, but the Odroid board drops samples every so often. So running at a lower sampling rate might help in more places than one.

For multiple SFs: There is a single channel LoRaWAN gateway that apparently uses CAD to work out what SF is being used and quickly switches to it. Do you think some sort of automatic guessing of the SF would be possible?

rpp0 commented 7 years ago

Great, let me know if you have any questions.

I think determining SF could be done with something as simple as a cross-correlation. The problem is when two devices with different SFs transmit simultaneously. This would give some problems with the current decoding approach.

wwwzrb commented 5 years ago

Do you mean that the SF can be determined by cross-correlating against the preamble, since different spreading factors result in different preamble samples?

rpp0 commented 5 years ago

I think one approach that could work would be: create a buffer large enough to contain two chirps sent at the largest spreading factor. Then perform autocorrelation on two windows of the signal, with the window sizes depending on the spreading factor. E.g. if the sample rate is 1 Msps, start with SF 7 and autocorrelate two 1024-sample windows. Then two 2048-sample windows for SF 8, etc. Finally, break if you find a high autocorrelation. Perhaps there are better ways though.
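
A minimal sketch of that idea, assuming a clean, noise-free preamble of repeated upchirps and 8x oversampling (1 Msps at 125 kHz bandwidth). The helper names `upchirp_stream`, `window_correlation`, and `guess_sf`, and the 0.9 threshold, are made up for illustration and are not part of gr-lora.

```python
import cmath
import math

def upchirp_stream(sf, osr, length):
    """Back-to-back baseband upchirps (an idealised preamble), osr samples per chip."""
    n_sym = osr * (1 << sf)
    out = []
    for i in range(length):
        n = i % n_sym
        # Frequency sweeps linearly from -BW/2 to +BW/2 over one symbol.
        phase = 2.0 * math.pi * (n * n / (2.0 * osr * n_sym) - n / (2.0 * osr))
        out.append(cmath.exp(1j * phase))
    return out

def window_correlation(buf, win):
    """Normalised correlation between the first two adjacent windows of buf."""
    a, b = buf[:win], buf[win:2 * win]
    num = abs(sum(x * y.conjugate() for x, y in zip(a, b)))
    den = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b))
    return num / den if den else 0.0

def guess_sf(buf, osr, sfs=range(7, 13), threshold=0.9):
    """Try SFs from smallest to largest; return the first with high correlation."""
    for sf in sfs:
        win = osr * (1 << sf)  # 1024 samples for SF 7 at 8x oversampling
        if 2 * win > len(buf):
            break
        if window_correlation(buf, win) >= threshold:
            return sf
    return None

OSR = 8
BUF_LEN = 2 * OSR * (1 << 12)  # room for two chirps at SF 12
detected = guess_sf(upchirp_stream(8, OSR, BUF_LEN), OSR)
```

Starting at the smallest SF matters: a repeated SF 7 chirp also correlates perfectly over SF 8-sized windows, so the search has to stop at the first match.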

wwwzrb commented 5 years ago

Yep, this is a good idea! If we want to distinguish SF 7 and SF 8 at 125 kHz bandwidth with a 1 Msps sampling rate, do you mean that for SF 8 we need a window of at least 4*2^8 = 1024 samples?

I have a similar idea but there are two intrinsic problems of this scheme remaining to be resolved.

I think the computational overhead will be too high, since a sliding window recomputes the autocorrelation every time a new sample arrives. Maybe we can choose a step larger than one sample, but the accuracy will also drop.
The choice of the threshold is another problem. We may have to choose different thresholds when the channel is dynamic and unstable.

Do you know of other, more computationally efficient ways, e.g., using the FFT result after multiplying with a down chirp?
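
For reference, the dechirp-and-FFT step I mean looks roughly like the sketch below. It assumes one sample per chip (no oversampling), perfect symbol alignment, and no noise; `base_upchirp`, `modulate`, and `dechirp_dft` are illustrative names, and the naive DFT stands in for a real FFT only to keep the example dependency-free.

```python
import cmath
import math

def base_upchirp(sf):
    """Unmodulated upchirp with 2**sf samples (one sample per chip)."""
    m = 1 << sf
    return [cmath.exp(2j * math.pi * (n * n / (2.0 * m) - n / 2.0)) for n in range(m)]

def modulate(symbol, sf):
    """A LoRa-style symbol is the base upchirp cyclically shifted by `symbol` chips."""
    m = 1 << sf
    base = base_upchirp(sf)
    return [base[(n + symbol) % m] for n in range(m)]

def dechirp_dft(samples, sf):
    """Multiply by the down chirp (conjugate upchirp), then take DFT magnitudes."""
    m = 1 << sf
    base = base_upchirp(sf)
    tone = [s * b.conjugate() for s, b in zip(samples, base)]
    return [
        abs(sum(t * cmath.exp(-2j * math.pi * k * n / m) for n, t in enumerate(tone)))
        for k in range(m)
    ]

spectrum = dechirp_dft(modulate(42, 7), 7)
# The energy collapses into a single bin equal to the transmitted symbol.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

When the SF matches, the energy ends up in one bin; whether the sharpness of that peak is a reliable SF indicator under real channel conditions is exactly the open question.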

rpp0 commented 5 years ago

At 1 Msps you would be oversampling 8 times, so 8*2^8 = 2048 samples. You're right about the problems with this approach but I don't know how it could be improved at this time.
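
Spelled out as a quick calculation (a throwaway helper, assuming the sample rate is an integer multiple of the bandwidth):

```python
def samples_per_chirp(sf, bandwidth_hz, sample_rate_hz):
    """Samples in one chirp: oversampling factor times 2**SF chips."""
    oversampling = sample_rate_hz // bandwidth_hz
    return oversampling * 2 ** sf

# Window sizes at 125 kHz bandwidth and 1 Msps (8x oversampling).
window_sizes = {sf: samples_per_chirp(sf, 125_000, 1_000_000) for sf in range(7, 13)}
```

Here `window_sizes[7]` is 1024 and `window_sizes[8]` is 2048; at 125 ksps the same chirps would need only 128 and 256 samples.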

rpp0 / gr-lora

Scope for optimization #31