Investigate QMesh, Reticulum and doing FEC in software

geeksville commented 4 years ago

Lots of good ideas in https://github.com/faydr/QMesh/wiki/Theory (and associated project) by @faydr

Thanks @CycloMies

geeksville commented 4 years ago

In addition to this investigation @Anelito reminds us to more rigorously investigate/consider https://github.com/markqvist/Reticulum by @markqvist.

markqvist commented 4 years ago

Feel free to ask me any questions you might have about Reticulum, and I'll do my best to answer :)

faydr commented 4 years ago

QMesh guy here -- I'm happy to answer any questions you may have about QMesh and the ideas underlying it.

One piece of advice: you should strongly consider doing your own forward error correction on top of/instead of the Hamming codes built into the LoRa PHY. Using a better FEC does seem to improve the Rx sensitivity by at least a few dB, and isn't that hard to implement.

faydr commented 4 years ago

Feel free to ask me any questions you might have about Reticulum, and I'll do my best to answer :)

@markqvist Is Reticulum intended to be a Delay-Tolerant Network (DTN) protocol? I was looking at the repo, and besides the reference to "high latency", I see a filename called "bundle.py". Since the most famous DTN is called the Bundle Protocol (BP), I was wondering if you were using ideas/algorithms/code from it.

geeksville commented 4 years ago

thanks ya'll. will do!

geeksville commented 4 years ago

wow - good point about doing FEC in software! I would have never thought of that if you hadn't mentioned it @faydr .

geeksville commented 4 years ago

This issue has been mentioned on Meshtastic. There might be relevant details there:

https://meshtastic.discourse.group/t/ideas-for-future-developments-a-recap/494/2

markqvist commented 4 years ago

@faydr Yes, Reticulum can operate in both real-time and delay tolerant modes. As you guessed, the Bundle class naming is inspired from RFC 5050, since it nicely conveys the functionality. I’m currently implementing the DTN features, and I expect to push them to the repo in July.

You will then be able to use delay-tolerant transport, both for single packets, and for entire bundles of data, where custody for the bundle is transferred hop-for-hop through the network.

cyclomies commented 4 years ago

Thank you @markqvist and @faydr for joining the discussion!

@geeksville, as the Meshtastic gets more complex, I do suggest following the OSI model.

While the cryptographic "engine" is well separated from the naive flooding protocol, there might be some parts of the protocol code to be separated more clearly?

Especially, when the layer 1 (PHY) and FEC handlings are implemented. Meshtastic could be run, not only on LoRa, but also on other types of PHY interfaces in the future (if the code layers could support a clear way to implement those).

stevewa2066 commented 3 years ago

Just wondering if all 3 of you are still chatting? Meshtastic, Reticulum and Qmesh.

Steve

stevewa2066 commented 3 years ago

Just wondering if all 3 of you are still chatting? Meshtastic, Reticulum and Qmesh.

Steve

faydr commented 3 years ago

I'm still around, if anyone has any questions about QMesh.

sachaw commented 2 years ago

I'm tacking this onto the https://github.com/meshtastic/Meshtastic-device/projects/1 would be good to implement before it becomes too much work.

sachaw commented 2 years ago

As of now, FEC is being used (implemented in hardware)

sachaw commented 2 years ago

I believe with the work @mc-hamster has done on the new routing, this can be closed?

mc-hamster commented 2 years ago

This should be kept open. It’s a different routing proposal.

mc-hamster commented 2 years ago

This should be kept open. It’s a different routing proposal.

faydr commented 2 years ago

If I understand correctly, the hardware FEC would be the hardware FEC in the LoRa chipsets. This algorithm is inferior to other algorithms that could be done in software, like Golay, BCH, Reed-Solomon, convolutional coding, LDPC, etc.

garthvh commented 2 years ago

Alas, no one has picked this up.

faydr commented 1 year ago

I'm now starting to play around a bit with Meshtastic. My brief perusing of the firmware source makes me think that it wouldn't be too difficult to add in some Reed-Solomon FEC into LoRa packet payload. Doing so should help lower PER in the presence of interference.

I'm willing and able to work on this if there's still some interest.

mc-hamster commented 1 year ago

We take advantage of the FEC afforded by the lora protocol in the modem configuration. No need to do it again in software. By the time the packet gets to us, it's already well qualified.

faydr commented 1 year ago

Thing is, the FEC built into the LoRa modem is not very good, it's just a simplistic Hamming code. Better FEC can do a better job (more bandwidth-efficient, better interference resistance, etc.). See below for an example of how using a better FEC algorithm helps improve range:

https://www.slideshare.net/haystacktech/how-to-triple-the-range-of-lora

markqvist commented 1 year ago

Hi @faydr

Interesting stuff!

I was actually under the impression that there was not really a way to do this with the common LoRa chips, but after reading the slides you linked, and going a bit deeper into the SX1276 datasheet, I see that this is a very interesting prospect actually!

Do you have any experience with implementing LDPC schemes? Maybe you would be interested in collaborating on implementing such a mode for the RNode Firmware? It would be very, very useful for use with Reticulum and physically mobile applications like Sideband (when combined with an RNode).

Maybe we can beat their "Haystack XR Mode" performance :D

GUVWAF commented 1 year ago

A while ago I looked at Arduino-compatible FEC implementations in software. There're not that many that can handle variable payload lengths. Eventually I found one: https://github.com/Warchant/reed-solomon_syndrome_gf256 Though, these algorithms require heavy computations and you would need to configure the radio to even pass on packets that have a failed CRC, so it would also cost quite some clock cycles. Not sure if a link budget improvement of 2-3dB is worth it, but we can only find out how much it actually is when we try it :)

mc-hamster commented 1 year ago

A while ago I looked at Arduino-compatible FEC implementations in software. There're not that many that can handle variable payload lengths. Eventually I found one: https://github.com/Warchant/reed-solomon_syndrome_gf256

Though, these algorithms require heavy computations and you would need to configure the radio to even pass on packets that have a failed CRC, so it would also cost quite some clock cycles. Not sure if a link budget improvement of 2-3dB is worth it, but we can only find out how much it actually is when we try it :)

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

GUVWAF commented 1 year ago

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

It's hard to give exact numbers. I've read 2-3dB improvement somewhere, and the above slideshow claims even tripling of range. Usually the algorithms are flexible with the amount of error correcting bits (i.e. redundancy) you add. The more, the higher the link budget improvement, but it eats the throughput.

Right now Meshtastic uses a LoRa Coding Rate of 4/8, but if that is really a bad FEC, it might be better to lower that in order to increase throughput, and add more error correcting bits to a better FEC implementation.

faydr commented 1 year ago

@GUVWAV -- My WAG is that using RS FEC would improve sensitivity by 1-3dB, perhaps more if there's a lot of busty inference in the band.

As for CPU usage -- the fairly fast MCUs (64MHz M4 for the nRF52 seems to be the minimum hardware used for Meshtastic) should have enough CPU for decoding to not be that big of a deal. The M17 project, for example, uses convolutional coding as one of the FEC algorithms. What can be tricky with using convolutional coding is properly interleaving the bits, which can be made more challenging with variable-length packets. LDPC, Turbo, and Polar codes are more CPU (and memory) intensive than convolutional or RS, but there's probably something that we can make work for the low datarates supported by Meshtastic.

I've been looking at the Meshtastic code, and it seems pretty well-structured and easy to work with. As a result, I might try to build my own QMesh router into Meshtastic and see how well it works out. @markqvist In the process of doing this, I might give another stab at writing/porting and MCU-friendly LDPC encoder/decoder.

mc-hamster commented 1 year ago

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

It's hard to give exact numbers. I've read 2-3dB improvement somewhere, and the above slideshow claims even tripling of range. Usually the algorithms are flexible with the amount of error correcting bits (i.e. redundancy) you add. The more, the higher the link budget improvement, but it eats the throughput.

Right now Meshtastic uses a LoRa Coding Rate of 4/8, but if that is really a bad FEC, it might be better to lower that in order to increase throughput, and add more error correcting bits to a better FEC implementation.

Good intel!

meshtastic / firmware

Investigate QMesh, Reticulum and doing FEC in software #192