ArduPilot / SiK

Tools and firmware for the Si1000
BSD 2-Clause "Simplified" License
284 stars 232 forks source link

Golay codec optimisation for space and speed (esp. decoding) #16

Closed tomscii closed 10 years ago

tomscii commented 10 years ago

The Golay23 codec in SiK/radio could use some optimisations. An improved version is proposed in this commit. Changes to the upstream version, with observations motivating those changes, are outlined below.

  1. Code table width

The tables used for encoding and decoding are 32 bits wide, which is a complete waste of ROM space. We are only interested in recovering the 12-bit payload data -- we don't really care about whether errors that fall into the parity symbol part of the codeword get corrected or not. We throw that part away anyway.

So only 11 bits of the encoder table are significant to us -- 11 bits is the width of the syndrome that, combined with our 12-bit payload, will form the 23 bits of the Golay23 codeword. For easy storage, we store these values in a 16 bit wide table. The table still contains 4096 values, but requires half the storage space.

Naturally, the decoder table also can be restricted to the meaningful part. In this case, the table contains the error correction lookup values of which 12 bits are useful -- the width of our payload we wish to correct. For easy storage, we store these also in a 16 bit wide table. The table still contains 2048 values, but requires half the storage space.

  1. Decoding algorithm

That was trivial, you might say. Now comes the interesting part. The decoding of Golay23 codewords can be done in much less work compared to the existing implementation. Specifically, the whole syndrome calculation function is redundant. Why calculate the syndrome on the spot each time, when it is already stored in a table?

Realise that the encoder table is nothing but precomputed values of the syndrome for all possible payload values. The encoding operation is in fact nothing more but a lookup to obtain the syndrome value corresponding to the payload at hand, and appending it to obtain the encoded codeword. The exact same operation is usable to obtain the syndrome when decoding. The difference, of course, is that the received payload (the part of the received codeword that is the payload) may contain some bit errors. Nevertheless, we look it up in the encoder table to obtain the syndrome corresponding to that payload, were it the real payload that was sent (and if there is no error, it is).

Since the operations over the code space are linear, XORing this looked-up syndrome with the received parity will yield the real received syndrome. The same as the expensive syndrome calculator function would have yielded. That is, we traded a function call with a loop over individual bits for a cheap table lookup and an XOR.

  1. Re-arrange the bytes

Since the Golay codec in the SiK firmware always works in multiples of 3 bytes for payload (and mutiples of 6 bytes for encoded data), it is natural that we can rearrange the internal order of bytes within these 3-packs and 6-packs to shave off some bit-twiddling operations.

Specifically, the first 3 bytes of an encoded 6-pack are the same as the 3 bytes input to golay_encode(). The next three bytes contain the parity information:

I don't have access to the real hardware so I cannot measure the real-time improvement these changes yield. I estimate the time required to run golay_decode24() to be about one third of the original.

I did verify the codec implementation driving it via a host program (compiled with GCC on Linux) acting as a testbench: generating random payload, encoding, applying errors, decoding and verifying that the original payload was recovered. In case of interest, I shall be happy to supply this testbench application.

tomscii commented 10 years ago

before: Stack starts at: 0x8c (sp set to 0x8b) with 116 bytes available. Other memory: Name Start End Size Max PAGED EXT. RAM 0x0000 0x00ed 238 256 EXTERNAL RAM 0x00ee 0x0ffb 3854 4096 ROM/EPROM/FLASH 0x0000 0xdab6 54969 62464

after: Stack starts at: 0x7d (sp set to 0x7c) with 131 bytes available. Other memory: Name Start End Size Max PAGED EXT. RAM 0x0000 0x00e8 233 256 EXTERNAL RAM 0x00e9 0x0ff6 3854 4096 ROM/EPROM/FLASH 0x0000 0xa8fc 42239 62464

edited for clarity

geeksville commented 10 years ago

I haven't looked line by line at the code, but the write-up looks great. (My biggest concern was to make sure ram usage did not go up)

tridge commented 10 years ago

Hi Tom, This is indeed a very impressive bit of work, but unfortunately it is incompatible with the existing golay code. I loaded a firmware with your patch one one radio and without the patch on the other and the two radios would no longer communicate. We really need to retain this compatibility, as users will often only upgrade one end of their link. Cheers, Tridge

tridge commented 10 years ago

I'll close this for now, but I'd be happy to re-open this patch once we have a version that is compatible with existing firmware. Cheers, Tridge

tomscii commented 10 years ago

Here you are: https://github.com/tridge/SiK/pull/17

Thanks, Tom