quiet / libcorrect

C library for Convolutional codes and Reed-Solomon
BSD 3-Clause "New" or "Revised" License
379 stars 94 forks source link

Convolutional encoder output length #36

Open dariapreda opened 4 years ago

dariapreda commented 4 years ago

Hello and thanks to the autors for this library.

I am new to convolutional codes and I tried to understand the mechanism behind in order to use this library. So far, after reading the basics of encoding, I understood that the output should be twice the length of the input (in case of rate 1/2).
My questions is related to this function in encoder.c:

size_t correct_convolutional_encode_len(correct_convolutional *conv, size_t msg_len) {
    size_t msgbits = 8 * msg_len;
    size_t encodedbits = conv->rate * (msgbits + conv->order + 1);
    return encodedbits;
}

I don't understand the formula to calculate the encodedbits. Shouldn't it be something like msgbits*conv->rate ? Why do we send so many bits? Could you please explain? It would be very helpful for me.

Thank you in advance, Daria

brian-armstrong commented 4 years ago

Hi Daria,

It might be helpful to look at the encoder implementation to see what's happening here. Basically, we need the extra bits because of a flush op sent after the message to return the shift register state to all 0s. Most of the length is indeed given by msgbits rate, and the flushed bits are given by (order + 1) rate.

The relevant loops are here https://github.com/quiet/libcorrect/blob/master/src/convolutional/encode.c#L47

Cheers

dariapreda commented 4 years ago

Thank you, now it makes sense after reading those lines.

And one more question: firstly I've tried to do some tests in Matlab and then to use libcorrect but I can't figure out why the outputs are different. I would like to use libcorrect in a project which requires the output to be exactly like in theory (to double or triple the number of bits depending on the rate). Is this possible with the current API and implementation? I've tried to inspect this line here unsigned int out = conv->table[shiftregister]; and go deeper on how the table is made. I suppose that the table is related to the shift register states and calculate the output based on the polynom and history but I'm a little lost.

Thank you again for helping me :)

brian-armstrong commented 4 years ago

Hmm, I'm actually not sure how the number of output bits ever could be just m * r rather than (m + k) * r for m message bits, r rate and k conv shift register size. In the most extreme case of m = 1, r = 2, k = 7, we only ever get the shift register state from the one bit's contributions in one location. One of the aspects of these convolutional codes that makes them good at error recovery is that one message bit's contribution is sort of smeared out over multiple output bits. Without running this bit through all 7 states, we'd have no maximum-likelihood calculation we could make based on the received bits. At best we can only really say that the 2 r bits either agree on a message bit or disagree.

I don't have access to Matlab so I'm not familiar with its implementation so my best guess is that maybe it's omitting either the first k bits or the last k bits. This reduces the level of correction at either the beginning or the end of the message, but I believe it will still work if the decoder expects it.

The line in question is fetching precomputed, concatenated polynomial outputs from a shift register value. The table is filled by https://github.com/quiet/libcorrect/blob/master/src/convolutional/lookup.c#L14 . It simulates each possible shift register value into i, bitwise ands it with each polynomial pattern, and then calculates the parity of this bitwise and with popcnt() % 2. For k = 3, poly = b011 and shift register state b101, this would be 011 & 101 -> 001, parity(001) = 1. If the second poly was b101 then 101 & 101 -> 101, parity(101) = 0. These two parity values are then concatenated. So the value of conv->table at b101 would be 01. This allows us to skip needing to recompute the polynomial and parity bits during encoding, we can just generate output bits directly from the shift register state.