sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Decoding general linear codes with Groebner bases #11413

Open 3c150393-ad85-4f5b-9939-4ebab97b47ee opened 13 years ago

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago

I have implemented a decoding method for general linear codes in Sage. The method decodes up to half the true minimum distance using Groebner bases. This method was introduced by Bulygin and Pellikaan.

I was expecting the method to be faster than syndrome decoding, but it appears to be equally fast. It may be worth having this method around in Sage since Groebner basis computation may become faster. I have attached a report with my findings.
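For readers less familiar with the baseline, syndrome decoding can be sketched in plain Python over GF(2). This is an illustrative toy using the [7,4] Hamming code, not the Sage implementation:

```python
from itertools import product

# Parity-check matrix of the [7,4] Hamming code over GF(2).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
n = 7

def syndrome(word):
    """Syndrome H * word^T over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

# Precompute a table: syndrome -> minimum-weight error pattern.
# The table has q^(n-k) entries, which is why syndrome decoding
# only pays off when n - k is small.
table = {}
for err in product([0, 1], repeat=n):
    s = syndrome(err)
    if s not in table or sum(err) < sum(table[s]):
        table[s] = err

def syndrome_decode(received):
    """Subtract the stored minimum-weight error for the received syndrome."""
    err = table[syndrome(received)]
    return tuple(r ^ e for r, e in zip(received, err))

codeword = (1, 0, 1, 1, 0, 1, 0)   # a codeword: syndrome is (0, 0, 0)
received = (1, 0, 1, 0, 0, 1, 0)   # bit at index 3 flipped
print(syndrome_decode(received))   # -> (1, 0, 1, 1, 0, 1, 0)
```

The expensive step is building the table; each individual decoding afterwards is a single lookup.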

Apply:

CC: @sagetrac-sbulygin @kwankyu

Component: coding theory

Keywords: general decoding groebner basis

Author: Niels Duif

Reviewer: Burcin Erocal

Issue created by migration from https://trac.sagemath.org/ticket/11413

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago

Decoding with Groebner bases

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago
comment:1

Attachment: report.pdf.gz

Here is my implementation. Please review!

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago

Attachment: decoding_GB.patch.gz

burcin commented 13 years ago

Reviewer: Burcin Erocal

burcin commented 13 years ago
comment:2

Thanks for the patch, Niels. A little more work is needed to get this merged. Here are a few observations based on reading your code without any familiarity with the algorithm:

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago
comment:3

Thanks for reviewing my patch, Burcin! Here is how I have addressed your comments:

The changes are in the second patch, so please apply patch 1 first and then patch 2.

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago

Attachment: decoding_GB_2.patch.gz

3c150393-ad85-4f5b-9939-4ebab97b47ee commented 13 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,8 @@
 I have implemented a decoding method for general linear codes in Sage. The method decodes up to half the true minimum distance using Groebner bases. This method was introduced by Bulygin and Pellikaan.

 I was expecting the method to be faster than syndrome decoding, but it appears to be equally fast. It may be worth having this method around in Sage since Groebner basis computation may become faster. I have attached a report with my findings.
+
+Apply:
+* [attachment: trac_11413_decoding_GB.patch](https://github.com/sagemath/sage/files/ticket11413/trac_11413_decoding_GB.patch.gz)
+* [attachment: trac_11413_decoding_GB_2.patch](https://github.com/sagemath/sage/files/ticket11413/trac_11413_decoding_GB_2.patch.gz)
+
rwst commented 10 years ago
comment:9

Patches don't merge. Please rebase.

johanrosenkilde commented 7 years ago
comment:11

I had a look at this patch and the algorithm behind it to see if it would be interesting to polish it off and merge it. My conclusion is that I'm not overly excited about the algorithm, and I report my observations for others.

It's all about speed, since Sage already ships with two "maximum likelihood" decoders: "syndrome" and "nearest neighbor". They have complexity roughly O(n*q^(n-k)) and O(n*q^k), respectively. Syndrome decoding computes a table which can be stored to make subsequent decodings fast, while nearest-neighbor decoding simply runs through the entire code for each decoding.
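The O(n*q^k) behavior of nearest-neighbor decoding is easy to see in a toy sketch. This is plain Python over GF(2) with a made-up [5,2] generator matrix, not Sage's "NearestNeighbor" decoder:

```python
from itertools import product

# Generator matrix of a small [5, 2] binary code (illustrative only).
G = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

def encode(msg):
    """Encode a length-k message as msg * G over GF(2)."""
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2
                 for col in zip(*G))

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor_decode(received):
    """Scan all q^k codewords and return the closest: O(n * q^k) per call."""
    return min((encode(m) for m in product([0, 1], repeat=len(G))),
               key=lambda c: hamming_distance(c, received))

print(nearest_neighbor_decode((1, 0, 1, 0, 0)))  # -> (1, 0, 1, 1, 0)
```

Unlike syndrome decoding there is no precomputation to amortize: the full q^k scan is paid on every single decoding.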

The ticket's decoding algorithm needs to solve a Groebner basis problem for each decoding, so its theoretical complexity is difficult to judge. We are therefore left with comparing how well it does in practice. Both the original article by Bulygin and Pellikaan and Niels, who wrote the code, compare only against syndrome decoding. Moreover, they do so only with codes where k << n-k, since they mention that their decoder performs better for small k. But so does "nearest neighbor" decoding!

Indeed, the selling point of the article is codes with [n,k] = [120, 20] and [n,k] = [120, 30], for which Bulygin and Pellikaan's own implementation runs in several hundred seconds when correcting 12 and 7 errors, respectively. Beyond that they do not report, but it is clear that the complexity increases sharply with the number of errors to correct.

Comparing with Nearest Neighbor, I get the following with Sage 8.0:

sage: C = codes.random_linear_code(GF(2), 120, 20)
sage: V = C.ambient_space()
sage: timeit('C.decode_to_code(V.random_element(), "NearestNeighbor")')
5 loops, best of 3: 2.04 s per loop

sage: C = codes.random_linear_code(GF(2), 120, 22)
sage: V = C.ambient_space()
sage: timeit('C.decode_to_code(V.random_element(), "NearestNeighbor")')
5 loops, best of 3: 8.15 s per loop

We can then guess that correcting in a [120, 30] code with "nearest neighbor" would take roughly 2000 seconds. Note that this is decoding completely random vectors, a much harder task than what Bulygin--Pellikaan can cope with (for e.g. [120, 20] this often corrects 30-35 errors). However, the Bulygin--Pellikaan method does appear to be faster when the number of errors is small.
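The ~2000 second guess follows directly from the O(n*q^k) scaling: over GF(2), each extra dimension doubles the run time, so going from k = 22 to k = 30 multiplies the measured 8.15 s by 2^8:

```python
# Nearest-neighbor decoding over GF(2) scales roughly as 2^k,
# so extrapolate the k = 22 timing to k = 30.
t_k22 = 8.15                      # measured seconds for [120, 22]
t_k30 = t_k22 * 2 ** (30 - 22)    # 8 more dimensions -> factor 256
print(round(t_k30))               # -> 2086, i.e. roughly 2000 seconds
```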

Best, Johan

johanrosenkilde commented 7 years ago
comment:12

Note that after #20138 this decoder should also be measured against information-set decoding, which looks much more like the proposed Gröbner basis decoder than the aforementioned ML decoders do. At the time of writing, the information-set decoder is much, much faster than the proposed decoder:

sage: C = codes.random_linear_code(GF(2), 120, 22) # usually has min-dist >= 32.
sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 15)
sage: timeit('C.decode_to_code(Chan(C.random_element()), "InformationSet", 15)')
125 loops, best of 3: 3.39 ms per loop

sage: C = codes.random_linear_code(GF(2), 120, 30) # usually has min-dist >= 25.
sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 12)
sage: timeit('C.decode_to_code(Chan(C.random_element()), "InformationSet", 12)')
125 loops, best of 3: 6.23 ms per loop
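For reference, the simplest (Prange-style) variant of information-set decoding can be sketched in plain Python over GF(2). This toy uses a made-up [5,2] code and is not Sage's "InformationSet" implementation: guess an error-free information set, re-encode from it, and accept if the result is within distance t of the received word.

```python
import random
from itertools import product

# Generator matrix of a small [5, 2] binary code with minimum distance 3.
G = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
]

def encode(msg, G):
    """Encode msg * G over GF(2)."""
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))

def gf2_solve(A, b):
    """Solve x * A = b over GF(2) for a k x k matrix A (None if singular)."""
    k = len(A)
    # Augmented matrix of the transposed system A^T x^T = b^T.
    M = [[A[r][c] for r in range(k)] + [b[c]] for c in range(k)]
    for col in range(k):
        piv = next((r for r in range(col, k) if M[r][col]), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        for r in range(k):
            if r != col and M[r][col]:
                M[r] = [x ^ y for x, y in zip(M[r], M[col])]
    return tuple(M[r][k] for r in range(k))

def isd_decode(received, G, t, trials=1000):
    """Prange-style information-set decoding: guess an error-free
    information set, re-encode, and check the error weight."""
    k, n = len(G), len(G[0])
    for _ in range(trials):
        I = random.sample(range(n), k)               # candidate information set
        A = [[row[i] for i in I] for row in G]       # k x k submatrix of G
        b = [received[i] for i in I]
        msg = gf2_solve(A, b)
        if msg is None:
            continue                                 # not an information set
        cand = encode(msg, G)
        if sum(c != r for c, r in zip(cand, received)) <= t:
            return cand
    return None

random.seed(0)  # reproducible demo
result = isd_decode((1, 0, 1, 0, 0), G, t=1)
print(result)   # should recover (1, 0, 1, 1, 0)
```

Each trial costs only linear algebra on a k x k matrix, which is why this approach beats the full q^k or q^(n-k) scans so decisively when the error weight is moderate.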