Closed 1861b8a9-77f0-4f35-8431-8514a75b40d1 closed 8 years ago
Branch: u/dlucas/punctured_codes
Content available, it's now ready for review.
New commits:
37683a1 | New Punctured Codes object, coming with its own encoder |
Commit: 37683a1
AbstractLinearCode
has a method punctured
. Shouldn't this return a PuncturedCode
object after this patch?
I chose to keep this method as it directly returns a LinearCode
object.
So if one wants to puncture and immediately get the LinearCode
representation of his punctured code, he can thanks to this method. I plan to change it when the mechanism to get another representation of a punctured code (e.g. get a RS code from a punctured RS code) will be ready.
Author: David Lucas
I think the default should always be to retain as much structure as possible. The C.punctured()
method is certainly the "default" a user would call.
Furthermore, having a PuncturedCode
should never get in the way for users who are simply interested in the unstructured linear code representation: it advertises exactly the same methods (plus some more). So I don't think there is any deprecation needed.
Branch pushed to git repo; I updated commit sha1. New commits:
7f9d27e | AbstractLinearCode's punctured method now returns a PuncturedCode object |
Ok, I agree with you.
Changes have been made and pushed.
A few updates on this ticket:
This is still open for review.
Description changed:
---
+++
@@ -6,10 +6,13 @@
- a dedicated encoder to compute the generator matrix of a Punctured Code.
+- a dedicate decoder which uses the original code to correct errors
+
This new structure presents the following advantages:
-- it keeps track of the code it comes from. If one punctures a code A, the punctured code he gets will remember it comes from A.
+- it keeps track of the code it comes from. If one punctures a code A, the punctured code one gets will remember it comes from A.
- This allows to perform several operations, like encoding or picking a random element without constructing the generator matrix for the punctured code, thus saving time and memory.
-- Later on, it will also allow to decode words using a decoder over the original code of the punctured code.
+- because it keeps track of the original code, it's possible to get a structured representation of the punctured code, e.g. if one punctures a GRS code and asks for the structured representation of the punctured code one got, a GRS code will be returned.
+
Branch pushed to git repo; I updated commit sha1. New commits:
26aae17 | Fixed conflict after update to 7.1beta0 |
I updated this ticket to latest beta, and fixed merge conflict. It's still open for review.
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
8b654ac | Renamed method precompute and changed its doc |
b60ca15 | Rewrote documentation of partial_xgcd and made it a private method |
5ed4ac3 | Fixed error in error-erasure's decoding radius |
03c15f3 | Removed useless backslashes |
cc3cb4d | Rewrote error message in KeyEquationSyndromeDecoder |
a7f3ee1 | Fixed typos |
c8bc406 | Rewrote some examples |
7cbcca2 | Rewrote some sentences |
4aebb5a | Fixed bug related to full codes, plus some minor tweaks |
fa355cd | Enhanced decoder documentation |
I fixed a few typos, updated deprecated imports which were breaking doctests. I also completely rewrote doctests and documentation of the decoder, using GRS decoders introduced in #19653.
This is still open for review.
Dependencies: #19653
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
a074cb8 | Changed my helper methods into private methods |
6ad583f | Fixed bug in KeyEquation decoder's decode_to_code, added sanity checks, enhanced doctests |
fca099e | Added new sanity check on the output of BW and Gao decoder |
57dbfbf | Rewrote random_element method |
7953d60 | Proper sanity checks for output of decode_to_* methods |
8673ac5 | Optimized a bit polynomial division in Gao and BW |
e1b6b09 | Merged now postive reviewed #19666 and fixed conflicts |
5093dce | Update to 7.1beta5 |
f3b4b11 | Fixed typo in Guruswami-Sudan doc |
5c61b11 | Merged dependecies and fixed conflicts |
I updated this ticket to the latest beta and fixed conflicts. This is still open for review.
I updated this ticket to 7.2beta1 and fixed a broken doctest. This is still open for review.
I start the review even though the ticket is not updated to the latest beta. Despite my comments, note that I realize that building this framework must have been a huge work, and that I found it really user-friendly =)
General remarks
PuncturedCodeOriginalCodeDecoder
), I asked myself the following question: if I understand well your naming convention, decoders are named Code
+DecoderName
. Is it necessary to have the Code
suffix ? My point is that it could be sufficient to only keep DecoderName
, because if a user wants to know what kind of code the decoder is applied to, he just needs to call Dec.code()
.index.rst
sage/coding/source_coding/huffman
?src/sage/coding/grs.py
_punctured_form(self, points)
doesn't work if points
is not sorted (e.g. C_grs._punctured_form([4,3])
). That's due to your strange way to select the evaluation points and multipliers. I think you should remove the block:start = 0
for i in points:
punctured_alphas += alphas[start:i]
punctured_col_mults += col_mults[start:i]
start = i + 1
punctured_alphas += alphas[start:n]
punctured_col_mults += col_mults[start:n]
and replace it by something like (that's only a suggestion):
punctured_alphas = [ alphas[i] for i in range(n) if i not in points ]
punctured_col_mults = [ col_mults[i] for i in range(n) if i not in points ]
if G.rank() != dimension:
G = G.echelon_form()
for i in range(dimension):
if G[i] == 0:
dimension -= 1
aims at computing the code dimension (stored in dimension
). In fact, you already computed it in G.rank()
.
src/sage/coding/linear_code.py
_punctured_form
method, G.echelon_form()
puts all the zero lines at the bottom of the new matrix. Thus, if you want to get the non-zero part of the matrix, I think you could simply write k = G.rank()
return LinearCode(G[:k])
src/sage/coding/punctured_code.py
--- function puncture
---
C = codes.RandomLinearCode(11, 5, GF(7))
Cp = codes.PuncturedCode(C, [4,3])
v = vector(GF(7), (2,3,0,2,1,5,1,5,6,5,3))
sage.coding.punctured_code.puncture(v, [4,3], Cp)
--- class PuncturedCode
---
positions = list(set(positions))
is a short (maybe dirty) way to do:unique_positions = set()
for i in positions:
unique_positions.add(i)
positions = []
for i in unique_positions:
positions.append(i)
and it looks to sort the elements as well =) Also remark that you don't have this "multiplicity" issue when using Python sets (instead of lists) for storing puncturing points.
self.punctured_positions
in return puncture(c, self.punctured_positions, self)
sage: C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
sage: P = codes.PuncturedCode(C, [1,3])
sage: Q = codes.PuncturedCode(P, [1,3])
sage: P.length()
5
sage: Q.length()
3
sage: P.structured_representation()
[5, 4, 2] Generalized Reed-Solomon Code over Finite Field of size 7
sage: Q.structured_representation()
[5, 3, 3] Generalized Reed-Solomon Code over Finite Field of size 7
So, double puncturing works well (Q
seems to be the right code), but structured_representation()
doesn't build the right Reed-Solomon code.
--- class PuncturedCodePuncturedMatrixEncoder
---
GeneratorMatrix
: same comment as in linear_code.py, I think echelon_form()do the job and you just need to keep the k
first rows.--- class PuncturedCodeOriginalCodeDecoder
---
sage: R = codes.RandomLinearCode(11, 5, GF(7))
sage: P = codes.PuncturedCode(R, [1,3])
sage: P.decoder()
'random_values'
instead of 'random-values'
error_erasure
to which you assign 0
and 1
. Maybe True
and False
are more explicit.decode_to_code
: l.600, I think you forgot to increment shift
(otherwise I misunderstood the use of shift
)elif self._strategy == 'try-all'
at line 643.ValueError("The number of erasures exceed decoding capability")
--> exceeds return (diff - punctured -1) // 2
if self._strategy != 'try-all' and "error-erasure" not in D.decoder_type():
return D.decoding_radius() - punctured
return D.decoding_radius()
, line 675. To which case does it apply ?Branch pushed to git repo; I updated commit sha1. New commits:
050d00d | Merge branch 'develop' into punctured_code |
31fbb4f | positions is now a set instead of a list |
e3f8f6b | Reinstated huffman in index.rst |
564760f | Rewrote _punctured_form in grs.py |
0746385 | Changed method _punctured_form in linear_code.py |
a27ca48 | Rewrote method puncture in punctured_code |
b0decac | Changed docstring in structured_representation |
d16f13a | Simplified generator_matrix computation in the encoder |
21b3570 | Added generic decoders as available decoders for punctured codes |
659105c | Some tweaking to punctured_code.py |
Hello Julien,
Thanks for your thorough review!
I followed all your suggestions (good idea to change the set of points to a list!). Some remarks/answers follow:
positions
is now a set, the test of duplicate positions in the constructor is no longer needed.positions
to a set, the str
methods return the ugly str format of a set: set([3,4])
for {3, 4}
. Which means the string representation of a punctured code now states PuncturedCode coming from ... punctured in position(s) set([pos])
which is quite ugly... So, do you think I should keep that, or, only for these str
methods return a list while it might be a bit confusing for the user because he/she entered a set a got a list?sage is stuck (infinite loop ?) when I run this:
Oh, yes, it's not an infinite loop! It's because it builds the original decoder of your code, namely a SyndromeDecoder
for a linear code over GF(7), length 11, dimension 5... Which takes a while because of the table of syndromes which is heavy to build.
Best,
David
I'd prefer it if the user could enter either a set or a list of punctured positions, or a single position. The internal representation as a set probably makes sense though. I consider str and repr as just pretty-printing, so formatting as a (sorted) list make sense.
Branch pushed to git repo; I updated commit sha1. New commits:
cef203f | Reinstated list as an allowed input type for punctured codes. Changed str methods for punctured codes. |
Hello,
The user can now pass either an Integer, an int, a list or a set to PuncturedCode
. The internal representation is a set. I added a few extra sanitization tests, so if one passes a list/set of anything but ints/Integers, an exception is raised.
I also reinstated the list representation of the punctured positions for str methods.
Remark:
for now, punctured_positions
(the getter for positions
) always return a set, even if one passed an int/Integer/list at construction time. I think it's fine like that, but if you think it can be confusing/annoying for the user, I can change its behaviour so it returns the exact data one passed at construction time.
This is still open for review.
Best,
David
Awesome!
for now,
punctured_positions
(the getter forpositions
) always return a set, even if one passed an int/Integer/list at construction time. I think it's fine like that, but if you think it can be confusing/annoying for the user, I can change its behaviour so it returns the exact data one passed at construction time.
I think this makes complete sense.
Branch pushed to git repo; I updated commit sha1. New commits:
2d13b55 | Changed punctured method to support a list of vectors |
Hello,
I fixed a bug: Punctured Code's decoder was not working if its original decoder was a list decoder.
I changed puncture
method (which is BTW now called _puncture
as it's a helper function): it can now puncture a list of vectors or a vector. This takes care of the list-decoding issue.
Best,
David
Branch pushed to git repo; I updated commit sha1. New commits:
771f3fe | decoder_type works now properly on uninstantiated classes |
0d8ed86 | Merge branch 'develop' into decoder_type_method_for_uninstanciated_classes |
c2777ff | Merge branch 'decoder_type_method_for_uninstanciated_classes' into talk_secret |
021b841 | Merge branch 'punctured_code' into talk_secret |
5bb39fe | Fixed bug related to sets and sorting in decode_to_code |
Changed branch from u/dlucas/punctured_codes to u/dlucas/punctured_code
Changed commit from 5bb39fe
to none
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
564760f | Rewrote _punctured_form in grs.py |
0746385 | Changed method _punctured_form in linear_code.py |
a27ca48 | Rewrote method puncture in punctured_code |
b0decac | Changed docstring in structured_representation |
d16f13a | Simplified generator_matrix computation in the encoder |
21b3570 | Added generic decoders as available decoders for punctured codes |
659105c | Some tweaking to punctured_code.py |
cef203f | Reinstated list as an allowed input type for punctured codes. Changed str methods for punctured codes. |
2d13b55 | Changed punctured method to support a list of vectors |
36e71c5 | Fixed bug related to sets and sorting in decode_to_code |
Commit: 36e71c5
Sorry, I messed up with my git branches in the push before that.
Anyway, my last commit fixes a bug related to decode_to_code
: the loops I was using to insert elements in lists were failing because they were expecting to receive the positions to insert as a sorted list of positions.
I wrote a helper functions to deal with this, and it now works fine.
This is still open for review (and this time, I cannot find any more issues).
Best,
David
Hi David,
A few remarks -- the last ones, I hope:
structured_representation
method: it doesn't work when one concatenates puncturings. For example, the following script C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
P = codes.PuncturedCode(C, [0,6])
D = P.structured_representation()
P2 = codes.PuncturedCode(P, [0,4])
D2 = P2.structured_representation()
fails at fifth line (so double puncturing actually works, only structured_representation
crashes). From the crash log, it seems that you don't build the right set of pts
.
PuncturedCodePuncturedMatrixEncoder
, one can construct this encoder while passing as argument a code which is not a punctured one. As a direct consequence, this scripts fails: C = codes.GeneralizedReedSolomonCode(GF(7).list(), 4)
E = codes.encoders.PuncturedCodePuncturedMatrixEncoder(C)
G = E.generator_matrix()
The question is: is it the user's mistake, or should the code handle it ? Maybe it's reasonable to throw an error.
same remark for your decoder: one can pass a non-punctured code as parameter.
in __init__
function of PuncturedCodeOriginalCodeDecoder
, maybe it is better to use True/False
boolean values instead of error_erasure = 0
and error_erasure = 1
.
I also prefer the compact notation:
e_list = [ one if i in pts else zero for i in range(Cor.length()) ]
instead of:
e_list = []
for i in range(Cor.length()):
if i in pts:
e_list.append(one)
else:
e_list.append(zero)
I did not perform exhaustive tests of all the combinations of codes/encoders/decoders/strategies, because it is too long. So given the complexity of the feature you're trying to implement, it's still possible there remains some minor errors -- especially with extreme settings -- but maybe a real deep use of this class (by actual users) could make them appear more easily.
Thus, when you're done fixing the previous errors, I give the green light.
Julien
This ticket proposes a new implementation for punctured codes.
It contains:
a new code class,
PuncturedCode
a dedicated encoder to compute the generator matrix of a Punctured Code.
a dedicate decoder which uses the original code to correct errors
This new structure presents the following advantages:
it keeps track of the code it comes from. If one punctures a code A, the punctured code one gets will remember it comes from A.
This allows to perform several operations, like encoding or picking a random element without constructing the generator matrix for the punctured code, thus saving time and memory.
because it keeps track of the original code, it's possible to get a structured representation of the punctured code, e.g. if one punctures a GRS code and asks for the structured representation of the punctured code one got, a GRS code will be returned.
Depends on #19653
CC: @wdjoyner @ppurka
Component: coding theory
Author: David Lucas
Branch/Commit:
45d4ee1
Reviewer: Julien Lavauzelle
Issue created by migration from https://trac.sagemath.org/ticket/19422