daniestevez / free-outernet

Free Software Outernet receiver
GNU General Public License v3.0
77 stars 23 forks source link

LDPC decoder absent #1

Closed bistromath closed 7 years ago

bistromath commented 7 years ago

Putting this here with the intention to help you implement LDPC decoding and as a place to discuss it. For starters, what do we know about the possible LDPC codes used for Outernet? We know it's a systematic code. Outside of that, do you have a lead on what codes are used? Do you know what rates are supported (this should be easy to determine from received packets)?

LDPC is new enough that there aren't many standard codes used. The block size may go a long way toward determining what code is used. Worst case, it is a randomly (or algorithmically-- i.e., PEG) generated code, which may be difficult or impossible to reverse.'

IT++ has a good LDPC decoder implementation which is flexible enough to use with basically anything. It is GPLv3. Mackay's site includes source code for a decoder which can be used in a pinch. That software license is free-with-credit.

bistromath commented 7 years ago

OK. After reading further (should have done that up front) it looks like a 2096,1936 systematic LDPC code. I'm unable to find a standard code out there with that block size, so I'll do some further thinking about this, including thinking about how to reverse the generator matrix from the parity bits. The block size is small enough that it may not be totally unfeasible.

bistromath commented 7 years ago

Am I misunderstanding your writeup on the packet structure? Is the FEC data coming in framed in its own frame type, also in the same 242-byte packets, separate from the corresponding file data?

daniestevez commented 7 years ago

Good idea. Let's use this thread to gather all the information about FEC.

Currently the code prints out some debug information regarding FEC every time that a file is received. Here is the relevant output for the recordings I have https://gist.github.com/daniestevez/57e8156199bfc22748534e58349b8d5a

The ldpc:k=... line is included in the XML corresponding to the file announcement. Since there is a "seed" parameter, I would guess that the LDPC code is generated randomly (at least up to some extent). One can see that different n and k parameters are used for each file (they're probably selected according to the file size before transmission).

The "Length of FEC data" line contains the size of FEC data that was received for that particular file, as well as the size of the file.

Your understanding of the packet structure is correct. FEC data is sent in 242 blocks, separate from the corresponding file data. What I find weird is that all the blocks are exactly 242 bytes long (for file data the last block is usually shorter, because the file size is usually not a multiple of 242 bytes). In this line of thought, you can try to divide the size of FEC data by the file size. You'll see that it yields a slightly different quotient each time (expected, because the numerator is always a multiple of 242 and the denominator is not), and it is also higher than n/k-1 (which would be what you would get if you use an (n,k) linear code by blocks, without padding). This indicates that some padding is used somewhere, to get FEC data that is always a multiple of 242. I have no idea about the precise details.

bistromath commented 7 years ago

Fascinating.

I've never seen an LDPC implementation using variable block sizes at a constant rate, or one using randomly-generated generator matrices. There are some disadvantages to this scheme which make me wonder what the advantages are, since I can't immediately see them. In particular, a randomly-generated matrix is suboptimal in performance, and there's some small but finite chance that it's fatally compromised in its ability to correct errors. Also it's usually pretty expensive to generate good (i.e., not compromised) LDPC generator matrices, so doing this "on the fly" seems strange. And lastly, there has to be some sort of cap on the block size, as that will determine the memory and CPU requirements of the decoder (or gate count, if the decoder were to be done in programmable logic).

Usually you just use a big-ass block size, as big as you can get away with, because longer blocks == better code performance. You could pick a few of them to get a good balance between padding overhead and code performance for smaller files. But a fully variable block size? Strange. But we'll run with it.

The rate is fixed at 5/6 (0.833), or as close as possible to that rate for the selected code. There are 242 LDPC blocks to each complete file. What do you suppose the significance of the number 242 is here, if any? Is it just a design coincidence that there are 242 blocks to each file, and that the MAC layer packets are 242 bytes each?

I am assuming that the files are padded to a multiple of the block size in order to feed the LDPC encoder, and that the padding bytes are simply not sent. The padding will be reapplied before the LDPC decoder.

Interestingly, N and K appear to be specified in bytes rather than bits, which is unusual. It looks like the process for choosing a block size is:

  1. K = CEIL(filesize/242) (in bytes)
  2. N = CEIL(K*(6/5)) (in bytes)

This ensures there are 242 blocks to each file, and it ensures the rate is <= 5/6. Let's apply this to one file from your gist:

FEC debug info for file opaks/ed57-Amazon.com.html.tbz2 (FEC decoding not implemented yet) ldpc:k=852,n=1023,N1=2,seed=1000 Length of FEC data: 41140 bytes; File size: 206080 bytes

  1. K = CEIL(206080/242) = 852
  2. N = CEIL(852*(6/5)) = 1023

The file will have (852242-206080) = 138 bytes of padding before feeding the LDPC decoder. The FEC data size will be (242(1023-852)) = 41382.

Wait, that last number is wrong! You received 41140 bytes, or 170*242, but the parity data should be (1023-852)=171 bytes long, which implies 41382 bytes of parity data in 242 packets. What gives? Where's the missing 242 bytes of parity?

bistromath commented 7 years ago

You know, IT++ includes a default generator in its LDPC generator matrix class which takes a seed as a parameter. If I were designing this wacky scheme I'd probably start with their codec, since it's free and it's good. Can you get me binaries for the data and FEC frames that I can work with to try this approach?

bistromath commented 7 years ago

If they used IT++ then they'd link it in ondd, and I don't see that. I do see all the stuff that we're interested in seems to be implemented in carousel.cpp, in particular this symbol:

_ZNK8carousel15fec_init_matrixERSt6vectorISt4listItSaItEESaIS3_EERKNS_12fec_params_tE

..which demangles to:

carousel::fec_init_matrix(std::vector<std::list<unsigned short, std::allocator >, std::allocator<std::list<unsigned short, std::allocator > > >&, carousel::fec_params_t const&) const

My reversing skills are super rusty but that's probably where to start.

daniestevez commented 7 years ago

I wouldn't assume that the implementation that Outernet is doing of FEC is optimal. I've encountered other parts of their software which include not-so-smart design choices. My guess is that all of this was programmed by someone who doesn't know very well what he is doing and hacked together a bunch of technologies. I will approach this with an open mind, as we can find unexpected design choices.

In particular, I wondered how I would implement file FEC in Outernet if I had to design it from scratch. I wouldn't use LDPC codes (perhaps because I'm not so familiar with them). I would use Reed-Solomon instead, because it is an MDS code. An algebraic decoder for RS can correct up to n-k erasures, which is the maximum that you can hope to correct in general, according to the Singleton bound. I don't know what advantages can LDPC codes give over RS. In fact, if the LDPC code is generated pseudorandomly using the seed, this can produce not very strong codes (read as able to correct only much less than n-k erasures), as you point out.

Interestingly, N and K appear to be specified in bytes rather than bits, which is unusual.

It is possible that the LDPC codes work over GF(256) instead of GF(2), so they work on bytes instead of bits. I will assume here that this is the case. It is also possible that they are codes over GF(2) but they only work on blocks whose size is a multiple of 8 bits (this is less restrictive than the condition that the code is linear over GF(256)).

What do you suppose the significance of the number 242 is here, if any?

The number 242 only comes from the fact that the files are transmitted in chunks which are 242 bytes long. Adding the necessary headers and transmitting at 2100bps, this produces packets which are approximately 1 second long. This seems like a round number and a sensible design choice, so I think that 242 only comes from this choice.

Now, the important part that we haven't treated yet is interleaving. I'm pretty sure that they are using interleaving. If you don't use interleaving, when you lose a packet you lose 242 bytes in a single LDPC block and you're pretty much screwed (too many erasures on this block). If you do use interleaving, the loss gets spread over many LDPC blocks and you can still decode, as all the blocks have only few erasures.

Given your observation on the choices of n and k, I think that they're doing the following. Given a file of size N, first pad the file by adding bytes to make its size a multiple of 242 bytes, so we can assume that N is a multiple of 242 bytes. The file is to be sent in N/242 chunks of 242 bytes each (except for the fact that padding is not really sent). We do interleaving as follows: the first block for LDPC coding is formed by the first byte of each of the file chunks, the second block for LDPC is formed by the second byte of each of the file chunks and so on. We have a total of 242 LDPC blocks and their size is k = N/242 (which is exactly what you got). Now n is chosen so that r=k/n is approximately 5/6 (or something along these lines). For each LDPC block you get n-k parity check bytes. These are interleaved. The first chunk of FEC data contains the first parity check bytes of all the 242 LDPC blocks, the second chunk of FEC data contains the second parity check byte of all the 242 LDPC blocks and so on. A total of n-k FEC chunks of 242 bytes each are sent.

The only thing that doesn't match in this description is that, as you have noticed, according to the debug info printed by free-outernet, only n-k-1 FEC chunks are sent. Probably I'm missing the last one in the code or something like that (The code is a bit optimistic in trying to recover all the FEC blocks, since I didn't know in advance how many blocks to expect. From the top of my head, it seems that if the last FEC chunk is transmitted after the last file chunk, then free-outernet will miss it. I'll check the code later, as this is most likely the problem here). It's also possible that for some reason one of the LDPC parity check bytes is not sent at all (as some sort of puncturing), but this doesn't make much sense.

Can you get me binaries for the data and FEC frames that I can work with to try this approach?

I'll send you the KISS recordings I have. You can process them with free-outernet to extract the binary data. Can you give me an email address or something?

If they used IT++ then they'd link it in ondd, and I don't see that. I do see all the stuff that we're interested in seems to be implemented in carousel.cpp, in particular this symbol:

I think that "carousel" is their do-everything main class for ondd. I wouldn't be surprised if they have taken the code they need for LDPC from an open source library and just copy-pasted it into carousel. If they have used IT++, which is only released under the GPL, this would break the GPL, so that's a reason to copy-paste the code instead of linking the library (to hide the fact that they're using GPL code). It wouldn't be the first time that Outernet breaks the GPL: they are linking librtlsdr and libmirisdr (which are GPL only) in their sdr100 closed-source binary.

daniestevez commented 7 years ago

I've checked and I'm not missing any of the FEC chunks in free-outernet (as long as the packet has the same format as the rest, which is to be expected). It is perfectly possible that their transmit software is buggy and it doesn't send the last FEC chunk. Since the FEC decoder needs to account for the possibility that some FEC chunks are lost, everything will still work "fine" even if the last FEC chunk is never sent.

smunaut commented 7 years ago

Just as a quite note, to reverse FEC codes I've used M4RI a few times. You can just see the FEC as a matrix multiplication introducing redudancy and by looking for linear dependent rows you can get how the parity bits are constructed.

daniestevez commented 7 years ago

I'm not sure if I understand what you say: if you have an (n,k) linear code and n linearly independent codewords, then this set of codewords is a basis for the linear code and you can obtain from it a generator matrix or whatever you want. Is this what you mean?

The problem here is that every file uses a different LDPC code (with different n and k, in fact). And we only get 242 codewords for each file. So it's almost impossible to get n linearly independent codewords, except for the smallest codes (say, those used for files which are 50kB or smaller).

Even if you can get the generator matrix for one of these codes used for smaller files, you don't get much, as we expect that the LDPC codes are generated pseudorandomly (due to the "seed" parameter).

Speaking of this, I have looked at the ondd binary and I haven't seen any calls to random(), rand() or a similar function.

bistromath commented 7 years ago

I wouldn't assume that the implementation that Outernet is doing of FEC is optimal. I've encountered other parts of their software which include not-so-smart design choices. My guess is that all of this was programmed by someone who doesn't know very well what he is doing and hacked together a bunch of technologies. I will approach this with an open mind, as we can find unexpected design choices.

In particular, I wondered how I would implement file FEC in Outernet if I had to design it from scratch. I wouldn't use LDPC codes (perhaps because I'm not so familiar with them). I would use Reed-Solomon instead, because it is an MDS code. An algebraic decoder for RS can correct up to n-k erasures, which is the maximum that you can hope to correct in general, according to the Singleton bound. I don't know what advantages can LDPC codes give over RS. In fact, if the LDPC code is generated pseudorandomly using the seed, this can produce not very strong codes (read as able to correct only much less than n-k erasures), as you point out.

Unlike classical algebraic codes (RS, for example) LDPC can (subject to probability) decode much more than n-k errors. However, LDPC is in fact a strange choice here, because it's applied at the bit level instead of the symbol level, as would ordinarily be done. You usually use LDPC on soft symbols (as the "inner code"), using the log-likelihood ratio of the symbol (given an estimated SNR) as the input metric to the decoder. In this application the correct place for LDPC would be in place of the r=1/2, k=7 CCSDS convolutional code. Operating on hard bits works but is very compromised from a performance standpoint and gives up a lot of the reason to use LDPC in the first place. I agree RS or BCH would be the better approach as an outer code. I suspect the CCSDS code was chosen due to the chosen modem's support for it.

Interestingly, N and K appear to be specified in bytes rather than bits, which is unusual.

It is possible that the LDPC codes work over GF(256) instead of GF(2), so they work on bytes instead of bits. I will assume here that this is the case. It is also possible that they are codes over GF(2) but they only work on blocks whose size is a multiple of 8 bits (this is less restrictive than the condition that the code is linear over GF(256)).

While I suppose this is possible there is virtually no literature on non-binary LDPC, and no (AFAIK) implementations in the wild. The latter supposition makes a lot more sense to me.

What do you suppose the significance of the number 242 is here, if any?

The number 242 only comes from the fact that the files are transmitted in chunks which are 242 bytes long. Adding the necessary headers and transmitting at 2100bps, this produces packets which are approximately 1 second long. This seems like a round number and a sensible design choice, so I think that 242 only comes from this choice.

Now, the important part that we haven't treated yet is interleaving. I'm pretty sure that they are using interleaving. If you don't use interleaving, when you lose a packet you lose 242 bytes in a single LDPC block and you're pretty much screwed (too many erasures on this block). If you do use interleaving, the loss gets spread over many LDPC blocks and you can still decode, as all the blocks have only few erasures.

Aha! Interleaving! I hadn't considered that. But from a performance standpoint, it makes very little sense to interleave only the parity bits. The data bits and the parity check bits are equivalent from an information-carrying standpoint, so why interleave the parity bits and not the data bits? 5/6 of your bits will still have no interleaving. You will still require all (or nearly all) 242 data packets to decode an LDPC block.

However, it seems to me this is probably the path they took -- it makes sense in light of two things. First, it explains the use of 242-block files, and second, it makes sense if we allow that Outernet is making strange design decisions.

Given your observation on the choices of n and k, I think that they're doing the following. Given a file of size N, first pad the file by adding bytes to make its size a multiple of 242 bytes, so we can assume that N is a multiple of 242 bytes. The file is to be sent in N/242 chunks of 242 bytes each (except for the fact that padding is not really sent). We do interleaving as follows: the first block for LDPC coding is formed by the first byte of each of the file chunks, the second block for LDPC is formed by the second byte of each of the file chunks and so on. We have a total of 242 LDPC blocks and their size is k = N/242 (which is exactly what you got). Now n is chosen so that r=k/n is approximately 5/6 (or something along these lines). For each LDPC block you get n-k parity check bytes. These are interleaved. The first chunk of FEC data contains the first parity check bytes of all the 242 LDPC blocks, the second chunk of FEC data contains the second parity check byte of all the 242 LDPC blocks and so on. A total of n-k FEC chunks of 242 bytes each are sent.

This adds up, at least mathematically.

The only thing that doesn't match in this description is that, as you have noticed, according to the debug info printed by free-outernet, only n-k-1 FEC chunks are sent. Probably I'm missing the last one in the code or something like that (The code is a bit optimistic in trying to recover all the FEC blocks, since I didn't know in advance how many blocks to expect. From the top of my head, it seems that if the last FEC chunk is transmitted after the last file chunk, then free-outernet will miss it. I'll check the code later, as this is most likely the problem here). It's also possible that for some reason one of the LDPC parity check bytes is not sent at all (as some sort of puncturing), but this doesn't make much sense.

Can you get me binaries for the data and FEC frames that I can work with to try this approach?

I'll send you the KISS recordings I have. You can process them with free-outernet to extract the binary data. Can you give me an email address or something?

bistromath@gmail.com

If they used IT++ then they'd link it in ondd, and I don't see that. I do see all the stuff that we're interested in seems to be implemented in carousel.cpp, in particular this symbol:

I think that "carousel" is their do-everything main class for ondd. I wouldn't be surprised if they have taken the code they need for LDPC from an open source library and just copy-pasted it into carousel. If they have used IT++, which is only released under the GPL, this would break the GPL, so that's a reason to copy-paste the code instead of linking the library (to hide the fact that they're using GPL code). It wouldn't be the first time that Outernet breaks the GPL: they are linking librtlsdr and libmirisdr (which are GPL only) in their sdr100 closed-source binary.

I don't see any of the same symbol names as used in IT++. It's a very hierarchical library, and if it were copy-pasted into ondd surely some of their namespaces would show up.

bistromath commented 7 years ago

Speaking of this, I have looked at the ondd binary and I haven't seen any calls to random(), rand() or a similar function.

It can't be truly randomly-generated, of course, nor depend on random()/rand() -- the sender and receiver must both generate the same generator and parity-check matrices. So, if they were calling out to random()/rand() with a given seed, they'd be depending on that implementation not to change on any platform running ondd. More likely the implementation of the generator algorithm is entirely within ondd.

daniestevez commented 7 years ago

I suspect the CCSDS code was chosen due to the chosen modem's support for it.

Actually the uplink modem supports a lot of options for FEC (incuding LDPC). http://datumsystems.com/m7 They're only using most basic options it provides (probably because they need to implement a corresponding FEC decoder in their receiver software).

While I suppose this is possible there is virtually no literature on non-binary LDPC, and no (AFAIK) implementations in the wild.

I'm not very familiar with LDPC codes, but at least there is this http://microtelecom.it/qracodes/QRACodes-Rev10.pdf which is LDPC code over GF(64). An implementation of this code is working very well and it is starting to gain popularity among the Amateur Radio comunity.

However, I also think that it's more likely that they use a binary LDPC code with the restriction that n and k are multiples of 8 bits.

Aha! Interleaving! I hadn't considered that. But from a performance standpoint, it makes very little sense to interleave only the parity bits. The data bits and the parity check bits are equivalent from an information-carrying standpoint, so why interleave the parity bits and not the data bits? 5/6 of your bits will still have no interleaving. You will still require all (or nearly all) 242 data packets to decode an LDPC block.

Read again my description of how I suspect that they are using interleaving. Interleaving is used both for file data and parity check bits. If you miss one packet (be it a packet with a chunk of file data or a packet with a chunk of parity check bits), you would only lose 1 byte in each of the 242 LDPC blocks.

Perhaps it's more useful to explain this "graphically" using a matrix: take your file, pad it so the size is a multiple of 242 bytes and write it as a k x 242 matrix (as you would normally write text, going from left to right and jumping to the next row every 242 bytes). Now take each of the columns of this matrix as an LDPC block and compute the n-k corresponding parity check bytes. Write these bytes as a column below the corresponding column, so you end up with an n x k matrix. The first k rows are your file, while the last n-k rows are parity check bytes. Now transmit this matrix by rows, each row as a packet with 242 bytes of payload. The LDPC blocks are the columns of the matrix. If you miss one packet (i.e., one row), you only miss one entry in each column.

I don't see any of the same symbol names as used in IT++. It's a very hierarchical library, and if it were copy-pasted into ondd surely some of their namespaces would show up.

That's true. I still expect that they have copy-pasted the LDPC stuff from some existing library (be it GPL or other more permissive licence). This is not something that anyone can implement from scratch.

Also, and this is just a hunch, the parameters for the LDPC code are indicated using something like this "ldpc:k=852,n=1023,N1=2,seed=1000", while the rest of the parameters of the file (size, hash, name, etc.) are indicated using XML. I think that the implementation of LDPC they're using works with strings using that format, since otherwise it would be more natural to include the FEC parameters in the XML.

So, if they were calling out to random()/rand() with a given seed, they'd be depending on that implementation not to change on any platform running ondd.

Fair enough. I hadn't thought of that.

bistromath commented 7 years ago

While I suppose this is possible there is virtually no literature on non-binary LDPC, and no (AFAIK) implementations in the wild.

I'm not very familiar with LDPC codes, but at least there is this http://microtelecom.it/qracodes/QRACodes-Rev10.pdf which is LDPC code over GF(64). An implementation of this code is working very well and it is starting to gain popularity among the Amateur Radio comunity.

Neat! Hadn't seen that before.

Aha! Interleaving! I hadn't considered that. But from a performance standpoint, it makes very little sense to interleave only the parity bits. The data bits and the parity check bits are equivalent from an information-carrying standpoint, so why interleave the parity bits and not the data bits? 5/6 of your bits will still have no interleaving. You will still require all (or nearly all) 242 data packets to decode an LDPC block.

Read again my description of how I suspect that they are using interleaving. Interleaving is used both for file data and parity check bits. If you miss one packet (be it a packet with a chunk of file data or a packet with a chunk of parity check bits), you would only lose 1 byte in each of the 242 LDPC blocks.

Wait, are you saying you already know that file data is interleaved in the same way? If so, that totally explains it and everything makes sense! I didn't get the understanding from your reverse engineering description of the protocol that deinterleaving the file data was required.

daniestevez commented 7 years ago

Wait, are you saying you already know that file data is interleaved in the same way? If so, that totally explains it and everything makes sense! I didn't get the understanding from your reverse engineering description of the protocol that deinterleaving the file data was required.

I'm not sure. I'm just quite confident that it must work that way, as that's the only way that makes sense to me.

The only thing I'm sure of so far is that the file is transmitted in 242 byte chunks (following the usual order, the first chunk contains bytes 0 through 241 and so on), except for the last chunk which may be shorter. Some other binary data (presumably the LDPC parity check bits) is also transmitted in 242 byte chunks.

So, if you get all the file chunks, then there is not need to do deinterleaving or LDPC decoding. You just concatenate all the chunks and there's your complete file. That's what I knew when I wrote my reverse engineering blogpost.

Since writing that post, I've thought about interleaving and I'm quite confident (but not totally sure) that it must work as I've described above. So if you don't get all the file chunks, you must interleave the file chunks and the parity check chunks to get the 242 LDPC codewords (with some erasures), feed each of these 242 codewords into the LDPC decoder, and then deinterleave the result again to obtain the complete file.

bistromath commented 7 years ago

I understand what you're getting at now. It's clever, and it explains the "242" thing.

bistromath commented 7 years ago

I now believe the LDPC implementation is most likely to be the one given in RFC 5170, an LDPC-staircase implementation which is fully specified (including the PRNG). The parameters for a given codec are k, n, N1, and seed. Because the implementation is fully specified the particular implementation used may not be important. OpenFEC would be a good start. Radford Neal's code as well as INRIA's is all (as Mackay's examples above) patent-free and license-free.

daniestevez commented 7 years ago

I haven't read the RFC carefully, but it's quite probable that you're right. I would try to check if this is the LDPC code used in Outernet by doing the following:

  1. Get a generator matrix for an LDPC code with the parameters corresponding to some file. There is some code to generate parity check matrices in the RFC.
  2. Extract the 242 LDPC codewords corresponding to some file using free-outernet.
  3. Calculate the parity bytes for each codeword using the generator matrix. Check if these match the parity bytes in the codeword.

Note that it seems that the last chunk of LDPC check bytes is not being sent, so the last byte of each LDPC codeword is unknown.

daniestevez commented 7 years ago

I'm looking at the sample code in RFC 5170. It uses the functions matrix_has_entry() and matrix_insert_entry() there are calls to functions with these names in the FEC code of the ondd binary. I think it's extremely probable that the FEC implementation in ondd is just taken from this RFC.

bistromath commented 7 years ago

Well, this is all very convenient. If it's RFC5170 then the openFEC library will decode it more or less out of the box, and I've worked with Radford Neal's code quite a bit before so the under-the-hood implementation is familiar. I'll get to trying it.

daniestevez commented 7 years ago

I've done a test implementation using RFC 5170. It's on the ldpc branch of free-outernet. I'm trying to get the 1st LDPC codeword for some file, generate the parity check matrix for the correspoding LDPC code, compute the parity check bytes and check if they match those on the codeword. However, it doesn't match. Perhaps I'm doing something stupid. I could use another pair of eyes to check if what I'm doing is correct.

See a transcript of the test here: https://gist.github.com/daniestevez/259cb53a8e85692c93e8192fa053e338

The data and fec arrays have being obtained from free-outernet.py and they correspond to the first LDPC codeword for the file opaks/dad7-Alt-right.html.tbz2

I'm also looking at the disassambled ondd binary to see if it implements the RFC.

daniestevez commented 7 years ago

I've checked the implementation I've done of ldpc.py in the ldpc branch of free-outernet by comparing with the implementation here. They generate the same parity check matrix.

I've also checked that the PRNG in the ondd binary is the same as specified in the RFC. It's actually the same implementation that in the open-source implementation I've linked in the paragraph above, which is curious because this implementation is quite unnatural (although probably faster on some machines).

In this open source implementation there are actually two types of LDPC codes (apart from LDGM/Staircase/Triangle choices for the parity check bytes side of the parity check matrix): Evencol and Evenboth. The RFC is the same as Evenboth. I've also checked if the Outernet LDPC matches Evencol, and it doesn't.

daniestevez commented 7 years ago

Now I'm looking at FLUTE, which is a standard to broadcast files over an unidirectional channel. Apparently it uses the kind of LDPC codes that we're discussing. It also has some component called carousel, which is also the name of the main component on ondd. Coincidence? I don't think so...

daniestevez commented 7 years ago

It seems that Outernet doesn't use exactly FLUTE. Perhaps some similar protocol. The term carousel is also used on DSM-CC, which is used to send content over DVB. Before using L-band, Outernet used DVB-S satellites in the Ku-band, so perhaps they were using DSM-CC, as the data was transmitted in a DVB-S multiplex. It doesn't seem that they're using DSM-CC right now, because the file metadata are sent in binary in DSM-CC and in XML in Outernet. Perhaps they're using something more simple based on DSM-CC, as they might have some legacy software based on their Ku-band times.

In any case, let's get back to the LDPCs and concentrate on reverse engineering the generation of the parity check matrix.

bistromath commented 7 years ago

Really confused by your method of generating a parity check matrix in ldpc.py. The matrix representation used in the implementation you listed is Radford Neal's sparse matrix format mod2sparse -- insertion is inserting ones at given offsets in a sparse, empty binary matrix. mod2sparse internally represents the sparse matrix as a list of offsets that have ones in them. The matrix you've generated has the offsets in it, but you can't directly apply those offsets to the encoding process.

I'm going to just recreate it in an existing sparse matrix library.

bistromath commented 7 years ago

OK, I kind of see what you're doing. I misunderstood. I think that works?

daniestevez commented 7 years ago

Just in case anyone else is reading this, the format to represent the sparse matrix in ldpc.py is as follows. The entries of the matrix are 0's or 1's. A row of this matrix is represented by a python list of the indices of the columns in which this row has a 1. For instance, [2, 31, 50] is a row which has 1's at columns 2, 31 and 50. The whole matrix is represented as a list of all the rows, so you get something as [[2, 31, 50], [0, 1, 17, 34]].

bistromath commented 7 years ago

OK, I finally had some time to look at this again. I've verified your finding that the Python snippet's results match up with the "ldpc_v2.1" library's, although that library takes a slightly different approach by putting the check bits in the left side of the pchk matrix. Same same though.

Trying to puzzle out how that library handles encoding. It's a little opaque -- it's genericized to the point of unreadability, at least to me.

daniestevez commented 7 years ago

LDPC decoding is now implemented. See this pull request. Thus, closing this.