sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.46k stars 484 forks source link

A new structure for Punctured Codes #19422

Closed 1861b8a9-77f0-4f35-8431-8514a75b40d1 closed 8 years ago

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago

This ticket proposes a new implementation for punctured codes.

It contains:

This new structure presents the following advantages:

Depends on #19653

CC: @wdjoyner @ppurka

Component: coding theory

Author: David Lucas

Branch/Commit: 45d4ee1

Reviewer: Julien Lavauzelle

Issue created by migration from https://trac.sagemath.org/ticket/19422

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago

Branch: u/dlucas/punctured_codes

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago
comment:2

Content available, it's now ready for review.


New commits:

37683a1New Punctured Codes object, coming with its own encoder
1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago

Commit: 37683a1

johanrosenkilde commented 9 years ago
comment:3

AbstractLinearCode has a method punctured. Shouldn't this return a PuncturedCode object after this patch?

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago
comment:4

I chose to keep this method as it directly returns a LinearCode object. So if one wants to puncture and immediately get the LinearCode representation of his punctured code, he can thanks to this method. I plan to change it when the mechanism to get another representation of a punctured code (e.g. get a RS code from a punctured RS code) will be ready.

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago

Author: David Lucas

johanrosenkilde commented 9 years ago
comment:6

I think the default should always be to retain as much structure as possible. The C.punctured() method is certainly the "default" a user would call.

Furthermore, having a PuncturedCode should never get in the way for users who are simply interested in the unstructured linear code representation: it advertises exactly the same methods (plus some more). So I don't think there is any deprecation needed.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

7f9d27eAbstractLinearCode's punctured method now returns a PuncturedCode object
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 37683a1 to 7f9d27e

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 9 years ago
comment:8

Ok, I agree with you.

Changes have been made and pushed.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 7f9d27e to ca859cb

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4473a2cUpdated to latest beta and fixed conflicts
74fc44eAdded decoder for punctured codes
ca859cbAdded a mechanism to get the structured representation of any punctured code
1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:10

A few updates on this ticket:

This is still open for review.

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago

Description changed:

--- 
+++ 
@@ -6,10 +6,13 @@

 - a dedicated encoder to compute the generator matrix of a Punctured Code.

+- a dedicate decoder which uses the original code to correct errors
+
 This new structure presents the following advantages:

-- it keeps track of the code it comes from. If one punctures a code A, the punctured code he gets will remember it comes from A.
+- it keeps track of the code it comes from. If one punctures a code A, the punctured code one gets will remember it comes from A.

 - This allows to perform several operations, like encoding or picking a random element without constructing the generator matrix for the punctured code, thus saving time and memory.

-- Later on, it will also allow to decode words using a decoder over the original code of the punctured code.
+- because it keeps track of the original code, it's possible to get a structured representation of the punctured code, e.g. if one punctures a GRS code and asks for the structured representation of the punctured code one got, a GRS code will be returned.
+
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

26aae17Fixed conflict after update to 7.1beta0
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from ca859cb to 26aae17

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:12

I updated this ticket to latest beta, and fixed merge conflict. It's still open for review.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

8b654acRenamed method precompute and changed its doc
b60ca15Rewrote documentation of partial_xgcd and made it a private method
5ed4ac3Fixed error in error-erasure's decoding radius
03c15f3Removed useless backslashes
cc3cb4dRewrote error message in KeyEquationSyndromeDecoder
a7f3ee1Fixed typos
c8bc406Rewrote some examples
7cbcca2Rewrote some sentences
4aebb5aFixed bug related to full codes, plus some minor tweaks
fa355cdEnhanced decoder documentation
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 26aae17 to fa355cd

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:14

I fixed a few typos, updated deprecated imports which were breaking doctests. I also completely rewrote doctests and documentation of the decoder, using GRS decoders introduced in #19653.

This is still open for review.

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago

Dependencies: #19653

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

a074cb8Changed my helper methods into private methods
6ad583fFixed bug in KeyEquation decoder's decode_to_code, added sanity checks, enhanced doctests
fca099eAdded new sanity check on the output of BW and Gao decoder
57dbfbfRewrote random_element method
7953d60Proper sanity checks for output of decode_to_* methods
8673ac5Optimized a bit polynomial division in Gao and BW
e1b6b09Merged now postive reviewed #19666 and fixed conflicts
5093dceUpdate to 7.1beta5
f3b4b11Fixed typo in Guruswami-Sudan doc
5c61b11Merged dependecies and fixed conflicts
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from fa355cd to 5c61b11

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:17

I updated this ticket to the latest beta and fixed conflicts. This is still open for review.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 5c61b11 to 69da167

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f2e8c28Merge branch 'develop' into punctured_code
69da167Fixed broken doctest
1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:19

I updated this ticket to 7.2beta1 and fixed a broken doctest. This is still open for review.

jlavauzelle commented 8 years ago
comment:20

I start the review even though the ticket is not updated to the latest beta. Despite my comments, note that I realize that building this framework must have been a huge work, and that I found it really user-friendly =)

General remarks

index.rst

src/sage/coding/grs.py

start = 0
for i in points:
    punctured_alphas += alphas[start:i]
    punctured_col_mults += col_mults[start:i]
    start = i + 1
    punctured_alphas += alphas[start:n]
    punctured_col_mults += col_mults[start:n]

and replace it by something like (that's only a suggestion):

punctured_alphas = [ alphas[i] for i in range(n) if i not in points ]
punctured_col_mults = [ col_mults[i] for i in range(n) if i not in points ]
if G.rank() != dimension:
    G = G.echelon_form()
    for i in range(dimension):
        if G[i] == 0:
        dimension -= 1

aims at computing the code dimension (stored in dimension). In fact, you already computed it in G.rank().

src/sage/coding/linear_code.py

    k = G.rank()
    return LinearCode(G[:k])
jlavauzelle commented 8 years ago
comment:21

src/sage/coding/punctured_code.py --- function puncture ---

C = codes.RandomLinearCode(11, 5, GF(7))
Cp = codes.PuncturedCode(C, [4,3])
v = vector(GF(7), (2,3,0,2,1,5,1,5,6,5,3))
sage.coding.punctured_code.puncture(v, [4,3], Cp)

--- class PuncturedCode ---

unique_positions = set()
for i in positions:
    unique_positions.add(i)
positions = []
for i in unique_positions:
   positions.append(i)

and it looks to sort the elements as well =) Also remark that you don't have this "multiplicity" issue when using Python sets (instead of lists) for storing puncturing points.

sage: C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
sage: P = codes.PuncturedCode(C, [1,3])
sage: Q = codes.PuncturedCode(P, [1,3])
sage: P.length()
5
sage: Q.length()
3
sage: P.structured_representation()
[5, 4, 2] Generalized Reed-Solomon Code over Finite Field of size 7
sage: Q.structured_representation()
[5, 3, 3] Generalized Reed-Solomon Code over Finite Field of size 7

So, double puncturing works well (Q seems to be the right code), but structured_representation() doesn't build the right Reed-Solomon code.

--- class PuncturedCodePuncturedMatrixEncoder ---

--- class PuncturedCodeOriginalCodeDecoder ---

sage: R = codes.RandomLinearCode(11, 5, GF(7))
sage: P = codes.PuncturedCode(R, [1,3])
sage: P.decoder()
    return (diff - punctured -1) // 2
if self._strategy != 'try-all' and "error-erasure" not in D.decoder_type():
    return D.decoding_radius() - punctured
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 69da167 to 659105c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

050d00dMerge branch 'develop' into punctured_code
31fbb4fpositions is now a set instead of a list
e3f8f6bReinstated huffman in index.rst
564760fRewrote _punctured_form in grs.py
0746385Changed method _punctured_form in linear_code.py
a27ca48Rewrote method puncture in punctured_code
b0decacChanged docstring in structured_representation
d16f13aSimplified generator_matrix computation in the encoder
21b3570Added generic decoders as available decoders for punctured codes
659105cSome tweaking to punctured_code.py
1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:23

Hello Julien,

Thanks for your thorough review!

I followed all your suggestions (good idea to change the set of points to a list!). Some remarks/answers follow:

sage is stuck (infinite loop ?) when I run this:

Oh, yes, it's not an infinite loop! It's because it builds the original decoder of your code, namely a SyndromeDecoder for a linear code over GF(7), length 11, dimension 5... Which takes a while because of the table of syndromes which is heavy to build.

Best,

David

johanrosenkilde commented 8 years ago
comment:24

I'd prefer it if the user could enter either a set or a list of punctured positions, or a single position. The internal representation as a set probably makes sense though. I consider str and repr as just pretty-printing, so formatting as a (sorted) list make sense.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

cef203fReinstated list as an allowed input type for punctured codes. Changed str methods for punctured codes.
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 659105c to cef203f

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:26

Hello,

The user can now pass either an Integer, an int, a list or a set to PuncturedCode. The internal representation is a set. I added a few extra sanitization tests, so if one passes a list/set of anything but ints/Integers, an exception is raised.

I also reinstated the list representation of the punctured positions for str methods.

Remark:

for now, punctured_positions (the getter for positions) always return a set, even if one passed an int/Integer/list at construction time. I think it's fine like that, but if you think it can be confusing/annoying for the user, I can change its behaviour so it returns the exact data one passed at construction time.

This is still open for review.

Best,

David

johanrosenkilde commented 8 years ago
comment:27

Awesome!

for now, punctured_positions (the getter for positions) always return a set, even if one passed an int/Integer/list at construction time. I think it's fine like that, but if you think it can be confusing/annoying for the user, I can change its behaviour so it returns the exact data one passed at construction time.

I think this makes complete sense.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2d13b55Changed punctured method to support a list of vectors
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from cef203f to 2d13b55

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:29

Hello,

I fixed a bug: Punctured Code's decoder was not working if its original decoder was a list decoder. I changed puncture method (which is BTW now called _puncture as it's a helper function): it can now puncture a list of vectors or a vector. This takes care of the list-decoding issue.

Best,

David

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

771f3fedecoder_type works now properly on uninstantiated classes
0d8ed86Merge branch 'develop' into decoder_type_method_for_uninstanciated_classes
c2777ffMerge branch 'decoder_type_method_for_uninstanciated_classes' into talk_secret
021b841Merge branch 'punctured_code' into talk_secret
5bb39feFixed bug related to sets and sorting in decode_to_code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 2d13b55 to 5bb39fe

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago

Changed branch from u/dlucas/punctured_codes to u/dlucas/punctured_code

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago

Changed commit from 5bb39fe to none

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

564760fRewrote _punctured_form in grs.py
0746385Changed method _punctured_form in linear_code.py
a27ca48Rewrote method puncture in punctured_code
b0decacChanged docstring in structured_representation
d16f13aSimplified generator_matrix computation in the encoder
21b3570Added generic decoders as available decoders for punctured codes
659105cSome tweaking to punctured_code.py
cef203fReinstated list as an allowed input type for punctured codes. Changed str methods for punctured codes.
2d13b55Changed punctured method to support a list of vectors
36e71c5Fixed bug related to sets and sorting in decode_to_code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Commit: 36e71c5

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago
comment:33

Sorry, I messed up with my git branches in the push before that. Anyway, my last commit fixes a bug related to decode_to_code: the loops I was using to insert elements in lists were failing because they were expecting to receive the positions to insert as a sorted list of positions. I wrote a helper functions to deal with this, and it now works fine. This is still open for review (and this time, I cannot find any more issues).

Best,

David

jlavauzelle commented 8 years ago
comment:34

Hi David,

A few remarks -- the last ones, I hope:

    C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
    P = codes.PuncturedCode(C, [0,6])
    D = P.structured_representation()
    P2 = codes.PuncturedCode(P, [0,4])
    D2 = P2.structured_representation()

fails at fifth line (so double puncturing actually works, only structured_representation crashes). From the crash log, it seems that you don't build the right set of pts.

    C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
    E = codes.encoders.PuncturedCodePuncturedMatrixEncoder(C)
    G = E.generator_matrix()

The question is: is it the user's mistake, or should the code handle it ? Maybe it's reasonable to throw an error.

    e_list = [ one if i in pts else zero for i in range(Cor.length()) ]

instead of:

    e_list = []
    for i in range(Cor.length()):
        if i in pts:
            e_list.append(one)
        else:
            e_list.append(zero)

I did not perform exhaustive tests of all the combinations of codes/encoders/decoders/strategies, because it is too long. So given the complexity of the feature you're trying to implement, it's still possible there remains some minor errors -- especially with extreme settings -- but maybe a real deep use of this class (by actual users) could make them appear more easily.

Thus, when you're done fixing the previous errors, I give the green light.

Julien

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Changed commit from 36e71c5 to 8a3a084

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 8 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ddedfc3Update to latest beta
f964812Fixed bug in index.rst file
6afe4a9Fixed bug related in structured_representation
890b309Added input check for encoder and decoder
8a3a084Tweaks: fixed typo, removed not Pythonic syntax