How to decode messages in implicit mode (with CR different from CR4/8) ?

ThGeoffrey commented 6 years ago

Hey !

Once again, I like all the work that has been done so far, congratulations!

I'm trying to make gr-lora fully functional. To do that, I'd like to be able to decode messages in implicit mode (i.e. without a header) for all the coding rates (CR4/5, CR4/6, CR4/7, CR4/8). At the moment, this only works for CR4/8.

I started to study the code. Do you have any idea what I should change to make it work?

Thanks in advance, Geoffrey

ThGeoffrey commented 6 years ago

@rpp0 Any idea? I can't get your code to work in implicit mode for any coding rate other than CR4/8... This is strange...

rpp0 commented 6 years ago

You can compile gr-lora in debug mode by providing the -DDEBUG=ON flag to cmake and run the decoder for CR4-7. The file created in /tmp/grlora_debug_txt might give a hint on why it doesn't work. However, I don't have the time right now to follow up on this issue. There shouldn't be anything particularly different from decoding CR7 vs CR8 besides the number of LoRa symbols per block.

ThGeoffrey commented 6 years ago

Okay ! Thanks for your quick answer... I'll try to make it work in implicit mode ! I'll tell you afterwards.

end0finvention commented 6 years ago

Hi @ThGeoffrey . Have you made any progress on this?

ThGeoffrey commented 6 years ago

Hi @end0finvention, Unfortunately no... I ran out of time and failed to achieve the correct decode in implicit mode...

wphilips commented 6 years ago

During experiments with sf=8 and CR 4/5, implicit header and ldr on, I discovered that only the first three characters of the transmitted message are correct.

If I send a message with 15 zero bytes, I also find that the dewhitened data is not all zeros, whereas in theory it should be all zeros, because shuffling, Gray coding and interleaving do not affect such messages. This points to a different whitening sequence or to some other, more mysterious problem.
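The all-zero test above works because dewhitening is a byte-wise XOR with a fixed sequence. A minimal sketch of that check follows. `dewhiten` and `all_zero` are hypothetical helpers written for illustration, not the functions in gr-lora, and the sketch assumes the bytes have already been deinterleaved and decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the gr-lora function): XOR every received byte
// with the matching byte of the whitening sequence.
std::vector<uint8_t> dewhiten(const std::vector<uint8_t>& received,
                              const std::vector<uint8_t>& sequence) {
    assert(sequence.size() >= received.size());
    std::vector<uint8_t> out(received.size());
    for (std::size_t i = 0; i < received.size(); ++i) {
        out[i] = static_cast<uint8_t>(received[i] ^ sequence[i]);
    }
    return out;
}

// True if every byte is zero, which is what dewhitening an all-zero
// payload should produce when the sequence is correct.
bool all_zero(const std::vector<uint8_t>& data) {
    for (uint8_t b : data) {
        if (b != 0) return false;
    }
    return true;
}
```

Since `0 ^ s == s`, the bytes received for an all-zero payload are themselves a candidate whitening sequence, and any non-zero byte left after dewhitening marks a position where the assumed sequence and the hardware disagree.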

I also performed experiments with messages having all zero bits except in one specific position, looking at how the received symbols (bin_idx) change. It turns out that these messages are processed as follows:
- the first 6 input bytes are jointly coded into 8 symbols, whereas the code currently processes only 5. This is quite weird of course, as there is no header.
- subsequent input bytes are coded into groups of 5 symbols as expected (not sure about the details)

So this means that the deinterleaving must behave as if a header is present, even if there is none.

I can successfully decode the first six bytes of the message (twice as many as with the current code) with the following changes:

1. I use a different whitening sequence (obtained by observing what is received for a message containing all zeros):
   [0xff, 0xff, 0x2d, 0xff, 0x78, 0xff, 0x30, 0x2e, 0x0, 0x2e, 0x12, 0x3c, 0x14, 0x28, 0xa, 0x30, 0x36, 0x0, 0x1e, 0x12, 0x2e, 0x14, 0x3c, 0xa, 0x28, 0x36, 0x30, 0x1e, 0x12, 0x2e, 0x0, 0x0, 0x0, 0x0, 0x24, 0x6]
   This sequence is the same as the one in the code for the first six entries, but then starts to differ.
2. I need to change the code so that the first deinterleave call for a message is made with 8 symbols instead of 5.

The code that needs to be changed is the "4u + d_phdr.cr" below. In the first pass this should be replaced with 8. In the subsequent passes it should probably remain as is, but I have not succeeded in decoding anything except the first 6 bytes.

        // Look for 4+cr symbols and stop
        if (d_words.size() == (4u + d_phdr.cr)) {
            // Deinterleave
            deinterleave((reduced_rate || d_sf > 10) ? d_sf - 2u : d_sf);
            return true; // Signal that a block is ready for decoding
        }

Note that d_phdr.cr is set to cr at startup and is not changed in implicit mode, so it is always 1 in this experiment.
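The proposed change to the block size can be sketched as a small helper. This is only an illustration of the observation above, not the patch itself: `symbols_in_block` is a hypothetical name, and using 4 + cr for the later blocks is the poster's unconfirmed guess.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many symbols to collect before deinterleaving.
// Assumption from the experiment above: the first block of a packet spans
// 8 symbols even in implicit mode, and later blocks span 4 + cr symbols.
uint32_t symbols_in_block(bool first_block, uint32_t cr) {
    assert(cr >= 1 && cr <= 4);  // cr = 1..4 corresponds to CR4/5..CR4/8
    return first_block ? 8u : 4u + cr;
}
```

With cr = 1 (CR4/5), this returns 8 for the first block and 5 for each later one, matching the 8-symbol and 5-symbol groups seen in the single-bit experiments.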

wphilips commented 6 years ago

In the meantime, I have been able to create working code for decoding with all coding rates and implicit header on and low data rate on. This has been tested only with cr=7.

I also found a way to make reduced data rate work correctly in the case of an explicit header. I will create patches. Various small changes are needed.

During the experiments, I found that the original whitening sequence does not always produce all 0s when processing a message which has only zeros in it. However, in many (or all?) cases the two dewhitening sequences produce the same decoded results (but one may be more robust to noise than the other). There are still cases in which neither whitening sequence produces all 0s for an all-zeros data payload. Forward error correction fixes these errors properly.

As it seems most likely that only one whitening sequence is used, there could be a way to unify all of this.

rpp0 commented 6 years ago

That's great, thanks for your help @wphilips! I've noticed that the implementations vary across different devices, leading to some hardware even being unable to communicate with other hardware under certain configurations. So perhaps this also needs to be configurable at some point. Which device are you using for your experiments?

Reduced data rate should already work for explicit header mode though - at least my own device automatically enables reduced data rate in case sf > 10.

There may indeed be some bits that are incorrect in the current whitening sequence. I calculated it by sending messages with payload zero and averaging the result over about 100 packets, but due to inaccuracies in the demodulator it could be that errors consistently propagate to the dewhitening stage. The FFT-based demodulation approach could fix this, but as you probably know the exact algorithm used by the hardware is not described anywhere AFAIK.
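One plausible reading of "averaging over about 100 packets" is a per-bit majority vote across the captured packets. The sketch below takes that reading as an assumption; the thread does not say how the averaging was actually done, and `majority_vote` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: estimate each bit of the whitening sequence by
// majority vote over many received copies of the same all-zero payload.
std::vector<uint8_t> majority_vote(
        const std::vector<std::vector<uint8_t>>& packets) {
    assert(!packets.empty());
    const std::size_t len = packets.front().size();
    std::vector<uint8_t> estimate(len, 0);
    for (std::size_t i = 0; i < len; ++i) {
        for (int bit = 0; bit < 8; ++bit) {
            std::size_t ones = 0;
            for (const auto& p : packets) {
                assert(p.size() == len);
                ones += (p[i] >> bit) & 1u;
            }
            // Set the bit only if strictly more than half the packets had it set
            if (2 * ones > packets.size()) {
                estimate[i] = static_cast<uint8_t>(estimate[i] | (1u << bit));
            }
        }
    }
    return estimate;
}
```

A vote like this removes random bit flips, but a bit that the demodulator gets wrong in most packets still ends up in the estimate, which is the failure mode described above.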

wphilips commented 6 years ago

It would not surprise me if it is hardware dependent. The tests have been performed on a HopeRF RFM98W module in the 434 MHz band. This module is attached to an ESP32. Sometimes I do notice quite noisy signals. I will test later on a loranexus device.

The code is ready. I will make a pull request.

It still has (at least) one problem left: the combination reduced_rate=off and implicit_header=on does not work. Other tested combinations seem to work, at least at sf=8 (I may have missed some).

I tested all combinations of cr=5,6,7,8, implicit=on/off, ldr=on/off at sf=8, bw=125 kHz. The test signals alternate between a sequence of 255 zeros and 255 increasing numbers (0 1 2 3 ... 254). I captured one with a decoding error. Interestingly, the same error was made in my FFT implementation. So either this was interference, or an error in the FEC implementation. Some test signals have quite a bit of interference. I have the impression this is caused by the ESP32, because it disappears when I disconnect/reconnect the power (or it is a loose wire on the breadboard).
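For anyone reproducing the test, the two payloads described above are easy to generate. The helper names below are hypothetical, and the optional `start` parameter is an addition so that a shifted ramp can be produced as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: payload of `length` zero bytes (255 in the tests).
std::vector<uint8_t> zero_payload(std::size_t length = 255) {
    assert(length <= 255);  // a LoRa payload is at most 255 bytes
    return std::vector<uint8_t>(length, 0);
}

// Hypothetical helper: increasing bytes start, start+1, ... (wraps modulo 256).
// With the defaults this yields 0, 1, 2, ..., 254.
std::vector<uint8_t> ramp_payload(std::size_t length = 255, uint8_t start = 0) {
    assert(length <= 255);
    std::vector<uint8_t> payload(length);
    for (std::size_t i = 0; i < length; ++i) {
        payload[i] = static_cast<uint8_t>(start + i);
    }
    return payload;
}
```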

> There may indeed be some bits that are incorrect in the current whitening sequence. I calculated it by sending messages with payload zero and averaging the result over about 100 packets, but due to inaccuracies in the demodulator it could be that errors consistently propagate to the dewhitening stage. The FFT-based demodulation approach could fix this, but as you probably know the exact algorithm used by the hardware is not described anywhere AFAIK.

I doubt it. When my FFT decoder synchronizes properly, it produces exactly the same symbols as yours. Something else is going on: the whitening sequences which will be in my code are related somehow, and it may be possible to replace them with a single one. I think there is a missing piece of the puzzle: at coding rates below (4,8), the deinterleaver now puts some zero bits in place of the missing checksum bits. When receiving a sequence of zeros, after dewhitening with a single sequence, the result is not always "all zero". My suspicion is that if you fill in the missing bits with something else (well-chosen zeros or ones), it may work with a single whitening sequence. Guesswork of course, and in the end the easiest may be to just use different whitening codes.

What still puzzles me: the use of incorrect (?) whitening sequences still produces the correct decoded result because the fec is correcting errors. However, the question is: does this affect the resilience against real bit errors in the signal? In other words, is it bad to see a non-zero sequence of bits for an all-zero input, or can this be ignored?
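The resilience question can be illustrated with a textbook Hamming(7,4) code, which corrects exactly one bit error per codeword. The sketch below is a stand-in and an assumption: it uses the classic layout with parity bits at positions 1, 2 and 4, not LoRa's actual bit arrangement, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Textbook Hamming(7,4), NOT LoRa's exact bit layout.
// Codeword positions are 1..7; position p is stored in bit p-1.
const uint8_t kDataPos[4] = {3, 5, 6, 7};  // positions carrying data bits

// XOR of the positions of all set bits; 0 for a valid codeword,
// otherwise the position of a single flipped bit.
uint8_t hamming74_syndrome(uint8_t word) {
    uint8_t s = 0;
    for (uint8_t pos = 1; pos <= 7; ++pos) {
        if (word & (1u << (pos - 1))) s ^= pos;
    }
    return s;
}

uint8_t hamming74_encode(uint8_t nibble) {
    assert(nibble < 16);
    uint8_t word = 0;
    for (int i = 0; i < 4; ++i) {
        if (nibble & (1u << i)) word |= 1u << (kDataPos[i] - 1);
    }
    // Set the parity bits (positions 1, 2, 4) so the syndrome becomes zero
    const uint8_t s = hamming74_syndrome(word);
    for (int k = 0; k < 3; ++k) {
        if (s & (1u << k)) word |= 1u << ((1u << k) - 1);
    }
    return word;
}

uint8_t hamming74_decode(uint8_t word) {
    const uint8_t s = hamming74_syndrome(word);
    if (s != 0) word ^= 1u << (s - 1);  // flip the bit the syndrome points at
    uint8_t nibble = 0;
    for (int i = 0; i < 4; ++i) {
        if (word & (1u << (kDataPos[i] - 1))) nibble |= 1u << i;
    }
    return nibble;
}
```

In this model, a bit that a wrong whitening value flips in every packet is corrected, but it uses up the single correction available for that codeword, so one additional channel error in the same codeword leads to a wrong nibble. If LoRa's code behaves similarly, a non-zero result for an all-zero input would cost noise margin even when the decoded payload looks right.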

wphilips commented 6 years ago

Test signals: see github.com:wphilips/lora-testsignals.git

wphilips commented 6 years ago

I have now fixed the errors for the combination explicit_header without reduced rate.

Strangely, I find for a few signals that one or two out of 254 bytes have bit errors. These bit errors are consistent (always the same when transmitting the same data). This is an example with the erroneous bytes marked by []:
- [55] should be [17]: 2 bit errors
- [14] should be [1c]: 1 bit error

This is for 20180918_125khz_sf8_cr6_implicit_noldr_zero_ramp255_multiple.dat

    00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
    10 11 12 13 14 15 16 [55] 18 19 1a 1b [14] 1d 1e 1f
    20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
    30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f
    40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
    50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
    60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
    70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
    80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
    90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
    a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
    b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
    c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
    d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
    e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
    f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe 0f
    65 00 00

When repeating the test with a signal 10 11 12 ..., the errors remain in the same position. So they are not value dependent.
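Finding such errors by eye in a 255-byte dump is tedious. A small comparison routine like the sketch below reports the position and bit-error count of every mismatching byte. `ByteError` and `compare_payloads` are hypothetical names, not part of gr-lora.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One mismatching byte between the transmitted and the decoded payload.
struct ByteError {
    std::size_t position;  // byte index within the payload
    uint8_t expected;
    uint8_t received;
    int bit_errors;        // number of differing bits in this byte
};

// Hypothetical helper: list every byte that differs, with its bit-error count.
std::vector<ByteError> compare_payloads(const std::vector<uint8_t>& expected,
                                        const std::vector<uint8_t>& received) {
    assert(expected.size() == received.size());
    std::vector<ByteError> errors;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        const uint8_t diff = static_cast<uint8_t>(expected[i] ^ received[i]);
        if (diff != 0) {
            const int bits = static_cast<int>(std::bitset<8>(diff).count());
            errors.push_back({i, expected[i], received[i], bits});
        }
    }
    return errors;
}
```

Running it on two captures with different payloads and comparing the reported positions is one way to check whether the errors depend on the position or on the value.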

These errors only occur for coding rates (4,6) and (4,8). The code can be found on my git repo.

Test signals can be found here: https://github.com/wphilips/lora-testsignals


rpp0 / gr-lora

How to decode messages in implicit mode (with CR different from CR4/8) ? #65