sagemath / sage

Main repository of SageMath. Now open for Issues and Pull Requests.
https://www.sagemath.org

Information set decoding for linear codes #20138

Closed 1861b8a9-77f0-4f35-8431-8514a75b40d1 closed 6 years ago

1861b8a9-77f0-4f35-8431-8514a75b40d1 commented 8 years ago

This ticket proposes an implementation of the Lee-Brickell refinement of information-set decoding.
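As a rough illustration (not the code on this branch; the names are made up and a Sage-like matrix/vector API is assumed), one Lee-Brickell iteration looks like:

from random import sample
from itertools import combinations, product

def lee_brickell_iteration(C, y, tau, p):
    # One iteration: returns a codeword within Hamming distance tau of y,
    # or None (the caller retries with a fresh random column choice).
    G = C.generator_matrix()
    k, F = C.dimension(), C.base_ring()
    I = sample(range(C.length()), k)            # candidate information set
    GI = G.matrix_from_columns(I)
    if not GI.is_invertible():                  # not an information set
        return None
    Gt = GI.inverse() * G                       # rows now systematic on I
    y2 = y - vector(F, [y[i] for i in I]) * Gt  # y2 vanishes on the positions I
    g = Gt.rows()
    Fstar = [a for a in F if a != 0]
    for pi in range(p + 1):                     # error weight on I: 0, ..., p
        for A in combinations(range(k), pi):
            for m in product(Fstar, repeat=pi):
                e = y2 - sum((mi * g[i] for mi, i in zip(m, A)), 0 * y2)
                if e.hamming_weight() <= tau:
                    return y - e                # y - e is a codeword
    return None

The decoder repeats such iterations with fresh random column choices until one succeeds; the search_size parameter discussed later in the thread corresponds to the p here.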

It depends on #19653 for the reimplementation of C.random_element.

Depends on #19653

CC: @jlavauzelle

Component: coding theory

Author: David Lucas, Johan Rosenkilde, Yann Laigle-Chapuy

Branch/Commit: eb2efb3

Reviewer: Yann Laigle-Chapuy

Issue created by migration from https://trac.sagemath.org/ticket/20138

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

0e6ea11  Let calibration return time estimates for easier debugging
fa72402  fixed search_size init bug
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 476369c to fa72402

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:39

Replying to @johanrosenkilde:

I updated the timings after a longer run with more iterations.

Replying to @sagetrac-ylchapuy:

Great, I'll take a closer look later. Here are some preliminary remarks:

  • you shouldn't worry about rho; count as an iteration only those with a valid information set. That's already what you measure with time_information_set_steps, though I would take the mean and not the median here.

I disagree: it changes the balance between the inversion step (always done) and the search step (done a rho fraction of the time). Perhaps it's not paramount, but I've already implemented it, and it makes debugging nice since the number of iterations matches extremely well.

I disagree as well! What time_information_set_steps measures is the time to get a valid information set. The while True is included in it. As is, your value T is off by a factor of rho.

johanrosenkilde commented 7 years ago
comment:40

Replying to @sagetrac-ylchapuy:

I disagree as well! What time_information_set_steps measures is the time to get a valid information set. The while True is included in it. As is, your value T is off by a factor of rho.

Oh yes, now I see, you're right. And after changing the medians to means, it will probably be precise enough. The alternative is to let time_information_set_steps measure only the time once an information set is found. If e.g. q=7, then the probability of an information set (on a random code) is roughly 5/6, so there is a fair chance that each of the 5 runs of time_information_set_steps hits an information set on the first guess - and then we will have underestimated the average time by a factor 6/5. Which likely doesn't matter at all.
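For reference, the 5/6 comes from the standard estimate (textbook background, not computed on this ticket) that a uniformly random $k \times k$ matrix over $\mathrm{GF}(q)$ is invertible with probability

$$\prod_{i=1}^{k}\left(1 - q^{-i}\right),$$

which for $q = 7$ is about $0.84 \approx 5/6$ already for small $k$.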

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from fa72402 to ea5c36c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ea5c36c  rho is already included. Mean instead of median
johanrosenkilde commented 7 years ago
comment:42

I removed the rho factor and updated the documentation.

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:43

This looks nicer. Another thing that bothers me is the search_size method.

This parameter is (sort of) specific to the Lee-Brickell algorithm, not to ISD in general. In preparation for making other variants available (e.g. Stern-Dumer), I would prefer to hide it from the user.

I would either hide it behind an "underscored" method, or provide a more general one, e.g.

    def parameters(self):
        r"""
        Return the algorithm and parameters used.
        """
        return ("LeeBrickell", self._search_size)

or

    def parameters(self):
        r"""
        Return the parameters selected for each available algorithm.
        """
        return { "LeeBrickell": self._search_size }

or anything better you can think of.

johanrosenkilde commented 7 years ago
comment:44

This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ? I had proposed to defer this question until it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all into the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

Your suggestion would allow for the necessary flexibility, but is essentially an indirection put in place to avoid subclassing. I don't find it particularly elegant, and it would be surprising to users I think. Granted, a typical user wouldn't care about the parameters.

Consider instead having a class for each, taking and naming their parameters as natural to each, all subclassing an AbstractInformationSetDecoder. Then we make a function:

    def LinearCodeInformationSetDecoder(code, number_errors, algorithm=None, *args, **kwargs):
        if algorithm is None:
            algorithm = "LeeBrickell" if code.dimension() < 10 else "SternDumer"
        if algorithm == "LeeBrickell":
            return LinearCodeISLeeBrickellDecoder(code, number_errors, *args, **kwargs)
        ...

Preferably improved with a try ... except which gives a nicer error message in case args or kwargs are not proper. Probably, supplying args or kwargs should also be disallowed if algorithm is not explicitly set, to avoid user code silently breaking if we someday decide to change the automatic selection of algorithm.
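For instance (sketching, with the hypothetical class names from the snippet above; the error messages are made up):

    def LinearCodeInformationSetDecoder(code, number_errors, algorithm=None, *args, **kwargs):
        if algorithm is None:
            if args or kwargs:
                raise ValueError("extra arguments require an explicitly chosen algorithm")
            algorithm = "LeeBrickell" if code.dimension() < 10 else "SternDumer"
        if algorithm == "LeeBrickell":
            try:
                return LinearCodeISLeeBrickellDecoder(code, number_errors, *args, **kwargs)
            except TypeError:
                raise ValueError("invalid arguments for the Lee-Brickell algorithm")
        raise ValueError("unknown algorithm: {}".format(algorithm))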

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:45

Replying to @johanrosenkilde:

This is essentially the issue that Julien brought up earlier: at what level, and how, should the different ISD implementations differ? I had proposed to defer this question until it came up, but considering #23244 we might just as well decide something now. Julien's suggestion is to have entirely different classes. I suggested that shoving them all into the same class could be sufficient. But I'm not an expert in the other ISD algorithms, and as you point out, they take different parameters.

I agree it's not elegant. My aim was to make this ticket as easy to close as possible. I don't want to make everything perfect; I just don't want to provide user functions we would have to deprecate in the future. As long as we have only underscored methods, it's not a problem for me, and we can decide later what the best way to deal with this is.

If you want to decide this now, I also think a general factory LinearCodeInformationSetDecoder, choosing the best implementation at hand for the given parameters, plus specialized classes for users who know what they are doing, seems the way to go.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from ea5c36c to 5463b1d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

87bd34d  Restructure ISD with class hierarchy and factory method
59319bd  Docs for new classes and full doctest coverage
e0a6677  Move all ISD code to new file
5463b1d  Fix code and doctests after code migration
johanrosenkilde commented 7 years ago
comment:47

I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to get lengthy, I've moved it to a new file.

Doc-tests pass, but I'm having trouble with the documentation. I haven't inspected the compiled HTML doc yet, because I got stuck attempting to make a proper hyperlink for sage.coding.isd_decoding.LinearCodeInformationSetDecoder in the module doc of sage.coding.decoder_catalog. I'll look at it again soon.

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:48

Replying to @johanrosenkilde:

I've implemented a class hierarchy and a factory function as described earlier. Since the ISD code appears to get lengthy, I've moved it to a new file.

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

I look forward to reviewing this when it's ready. Thanks for working on this!

johanrosenkilde commented 7 years ago
comment:49

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

True. But isd is a very bland abbreviation, and I suspect it will be hard for most people to look at the file name and get an idea of its contents. And is_decoding is weird. Do you have a suggestion?

Similarly, I had great difficulty coming up with a proper name for the class LinearCodeISD_LeeBrickell. All other decoder classes end with Decoder. But LinearCodeInformationSetLeeBrickellDecoder is crazy long. And LinearCodeISLeeBrickellDecoder reads weirdly, since IS doesn't evoke ISD. And I wanted all ISD decoders to tab-complete similarly, so LinearCodeLeeBrickellISDecoder was out.

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:50

Replying to @johanrosenkilde:

Great, I agree it's a nice thing to put it in a separate file. Though the d in isd is for decoding, so isd_decoding is a bit strange.

True. But isd is a very bland abbreviation, and I suspect it will be hard for most people to look at the file name and get an idea of its contents. And is_decoding is weird. Do you have a suggestion?

Why not information_set_decoding? We have tab completion.

Similarly, I had great difficulty coming up with a proper name for the class LinearCodeISD_LeeBrickell. All other decoder classes end with Decoder. But LinearCodeInformationSetLeeBrickellDecoder is crazy long. And LinearCodeISLeeBrickellDecoder reads weirdly, since IS doesn't evoke ISD. And I wanted all ISD decoders to tab-complete similarly, so LinearCodeLeeBrickellISDecoder was out.

Hum, I see. What would you think about only one full-fledged decoder, LinearCodeInformationSetDecoder, and lightweight classes for the algorithms? E.g. (just sketching a solution):

class DecodingAlgorithm:
    """ Just providing an interface.
    I can use this bland name because it would in fact be
    sage.coding.information_set_decoding.DecodingAlgorithm
    """
    @staticmethod
    def decode(G, y, **kargs):
        """ kargs depends on the actual algorithm """
        raise NotImplementedError

    @staticmethod
    def cost_estimate(C, **kargs):
        """ Return ETA in seconds for the given code and parameters """
        raise NotImplementedError

    @staticmethod
    def calibrate_for(C):
        """ Return the best parameters found for this code, as a dict.
        You can thus do:
        A.decode(C.generator_matrix(), y, **A.calibrate_for(C))
        """
        raise NotImplementedError

class AlgorithmLeeBrickell(DecodingAlgorithm):
    pass  # Todo

class AlgorithmSternDumer(DecodingAlgorithm):
    pass  # Todo

# etc

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 5463b1d to f68ea69

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ed70a79  Rename
fa06073  Fixes after rename
f8b9c5d  Switched to ISD Algorithm classes and only one real decoder
f68ea69  Fixes to docstrings
johanrosenkilde commented 7 years ago
comment:52

Finally got back to this again. I had originally considered lightweight algorithm classes but rejected the idea, since the surrounding ISD class would be almost trivial. However, on reconsideration I think it actually makes for a slightly nicer design, user-wise: we avoid a decoder factory (which won't behave like a decoder class, and might surprise the user), it sidesteps the clumsy naming, and it becomes more transparent to a user how to add new ISD algorithms.

So I've implemented your suggestion. I went for real, instantiated algorithm objects. This makes the algorithms more accessible to the user, and making things static would also pose problems for calibrate and time_estimate, which would either both have to actually run the calibration, or write to static class variables, or something similar.
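The gist (a sketch with made-up names; _do_calibration stands in for the real timing code): an instance can cache its calibration outcome on itself, so time_estimate() need not redo any work:

class InformationSetAlgorithm(object):
    def calibrate(self):
        # run the timing experiments once; remember the outcome on the instance
        self._parameters, self._time_estimate = self._do_calibration()

    def time_estimate(self):
        if not hasattr(self, "_time_estimate"):
            self.calibrate()
        return self._time_estimate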

As usual this took 30 minutes, and then 1.5 hours to redo the doc-strings.

johanrosenkilde commented 7 years ago
comment:53

Needs review. I've added Yann as coauthor - he definitely deserves that!

johanrosenkilde commented 7 years ago

Changed author from David Lucas, Johan Rosenkilde to David Lucas, Johan Rosenkilde, Yann Laigle-Chapuy

2c00b582-cfe9-43b9-8124-fec470060704 commented 7 years ago
comment:54

First review pass.

To begin with:

  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element); it's more future-proof if things ever move around.

  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

For the algorithm itself:

  • if you ask for time_estimate with user provided parameters, it currently does the calibration and changes the parameters. Is this expected? In my mind, calibrate calls time_estimate with different parameters to choose the best one.

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p

  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

  • you could use (but this doesn't change much)

for gA in itertools.combinations(g, pi):
  for m in itertools.product(Fstar, repeat=pi):
    e = y - sum(mi*gAi for mi,gAi in zip(m,gA))

  • prefer ... is not None to not ... is None

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them

  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm))

  • in known_algorithms I would make d a class variable _known_algorithms

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and a weight-6 error, or whatever you like, but I find the example more convincing when this is the only available option (you can't reasonably compute the coset leaders, for example).
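Such a doctest could look roughly like this (a sketch only: the decoder name "InformationSet" and the exact invocation are assumptions, and decoding a random code can fail with small probability):

sage: C = codes.random_linear_code(GF(9), 40, 20)
sage: D = C.decoder("InformationSet", 6)
sage: Chan = channels.StaticErrorRateChannel(C.ambient_space(), 6)
sage: c = C.random_element()
sage: D.decode_to_code(Chan(c)) == c
True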

As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor of 2).

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from f68ea69 to c6ddf43

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ee4d586  Merge branch 'u/jsrn/information_set_decoding' of git://trac.sagemath.org/sage into 20138_information_set_decoding
ed4bfd7  copyright and import statements
a4e7f49  time_estimate() doesn't poop on selected parameters
339c383  Fixed bugs and added doctests pointed out by reviewer.
0ea73d6  Make _known_algorithms a class field
08a9e50  Added a big example/doctest. Polished some existing ones.
c6ddf43  Added a test for the precision of calibration
johanrosenkilde commented 6 years ago
comment:56

Thanks for the speedy review. Here is my less speedy reply.

Replying to @sagetrac-ylchapuy:

I hadn't looked up those rules, to be honest. I added our three names. I don't know your email address, though.

  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element); it's more future-proof if things ever move around.

Didn't know that was possible. It's not used in many places in Sage either. I don't see any drawbacks though, so I've made the change.

  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.

There was a discussion on sage-devel about this recently. There was no clear consensus, but many seemed to favour the "of this code" style.

I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

  • if you ask for time_estimate with user provided parameters, it currently does the calibration and changes the parameters. Is this expected? In my mind, calibrate calls time_estimate with different parameters to choose the best one.

Good catch: time_estimate of course shouldn't overwrite selected parameters. I've fixed that.

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p

No: we let pi go from 0 to p, both inclusive. What differs is that, for brevity, I just say we let the coefficients roam through F^p, while we actually go through F*^pi and let pi go from 0 to p. This avoids repeating the elements with multiple 0s of the first version.
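Spelled out, the enumeration pattern is (illustrative):

for pi in range(p + 1):                      # sizes 0, 1, ..., p, both ends included
    for A in combinations(range(k), pi):     # support of size exactly pi
        for m in product(Fstar, repeat=pi):  # nonzero coefficients only
            ...                              # so each pattern occurs exactly once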

  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

This is done in the iteration pi = 0.

  • you could use (but this doesn't change much)
for gA in itertools.combinations(g, pi):
  for m in itertools.product(Fstar, repeat=pi):
    e = y - sum(mi*gAi for mi,gAi in zip(m,gA))

Yes, I could, but I don't consider it more elegant. If I were writing in OCaml I'd do it that way, obviously, but not in Python.

  • prefer ... is not None to not ... is None

OK

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them

I think all __init__s and class docs now have the appropriate doctests.

  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm)

Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that InformationSetAlgorithm didn't hash, which meant that AbstractLinearCode.encoder crashed since it tries to hash any arguments. This is potentially a problem with AbstractLinearCode.encoder which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for InformationSetAlgorithm.
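Something minimal suffices here; a sketch, assuming the algorithm knows its code and decoding interval (the hash actually on the branch may differ):

def __hash__(self):
    # decoders hash their constructor arguments, so hashability is all that
    # is needed; a coarse hash is fine because __eq__ does the real comparison
    return hash((self.code(), self.decoding_interval()))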

The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply number_errors to InformationSetDecoder, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply both number_errors and algorithm, and number_errors must match algorithm.decoding_interval(). My argument is that this use case is very rare and for special developer-users, but allowing the shortcut of supplying only algorithm would confuse the documentation and mess up the constructor.

  • in known_algorithms I would make d a class variable _known_algorithms

No problem.

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and a weight-6 error, or whatever you like, but I find the example more convincing when this is the only available option (you can't reasonably compute the coset leaders, for example).

I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the [59, 30, 17] QR code over GF(3), for which syndrome decoding or nearest-neighbor decoding is infeasible. The current information-set decoding, on the other hand, is extremely fast. I've flagged it as long, though it takes less than 0.3 s.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor of 2).

A random doc-test should only fail with extremely low probability, otherwise it will cause headaches for other Sage devs and the build bot. The decoding time for information-set decoding is roughly exponentially distributed and has a huge variance. Thus we must average a large number of decoding trials and test this against the calibrated value.
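Concretely: if a single decoding time is roughly exponential with mean $\mu$, its standard deviation is also $\mu$, so the average $\bar{T}$ of $N$ independent trials satisfies

$$\frac{\sigma_{\bar{T}}}{\mu} = \frac{1}{\sqrt{N}},$$

i.e. about 10% relative deviation for $N = 100$; a single timed decoding would be useless as a test, but the mean of 100 runs is reasonably stable.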

I've implemented such a test for a medium-sized code, where we do 100 iterations, and I test that the elapsed time is within a factor of 5 of the calibrated value. I did a 1-hour trial on my own computer with only light background processes; a factor of 2 off the calibrated time occurred after 400 trials, while a factor of 3 off occurred after 5000 trials:

from sage.coding.information_set_decoder import LeeBrickellISDAlgorithm
C = codes.QuadraticResidueCode(37, GF(3))
Chan = channels.StaticErrorRateChannel(C.ambient_space(), 4)
A = LeeBrickellISDAlgorithm(C, (0,4))
A.calibrate()
def time_100():
    A.calibrate()
    import time
    zero = C.ambient_space().zero()
    before = time.clock()
    for i in range(100): A.decode(Chan(zero))
    return time.clock() - before

smallest = 1
biggest = 1
i = 0
while True:
    i += 1
    elapsed = time_100()
    frac = elapsed/A.time_estimate()/100
    if frac < smallest:
        smallest = frac
    if frac > biggest:
        biggest = frac
    print "%d: %s, %s" % (i, smallest , biggest)

However, my theoretical musings on these probabilities indicated that, if the calibration is sound, this should occur far less frequently.

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should use at least a factor of 5 in the field. This is also because unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

Any thoughts?

(The ticket now needs review again)

2c00b582-cfe9-43b9-8124-fec470060704 commented 6 years ago
comment:57

Replying to @johanrosenkilde:

Thanks for the speedy review. Here is my less speedy reply.

  • you should import as much as possible from sage.all instead of hardcoding the whole path (e.g. sage.modules.free_module_element); it's more future-proof if things ever move around.

Didn't know that was possible. It's not used in many places in Sage either. I don't see any drawbacks though, so I've made the change.

I thought at some point it was the way to go, but as you say it's not used much. I'm not so sure now, but I like it that way :)

  • I find most of the "using this ISD decoding algorithm" in the documentation redundant (just like you don't write "using self")

That's a matter of style. I would always write "Return the minimum distance of this code" rather than "Return the minimum distance" or "Return the minimum distance of self". Similarly, decoding is not canonical: the result depends on the method used.

There was a discussion on sage-devel about this recently. There was no clear consensus, but many seemed to favour the "of this code" style.

I'll keep it like this for now, but feel free to give explicit examples where you find it jarring.

No, keep it like that; I realized afterwards that there was no consensus on this.

  • the description of the Lee-Brickell differs slightly from the implementation. In step 3, we consider every subset of size less than p

No: we let pi go from 0 to p, both inclusive. What differs is that, for brevity, I just say we let the coefficients roam through F^p, while we actually go through F*^pi and let pi go from 0 to p. This avoids repeating the elements with multiple 0s of the first version.

OK, that's the problem with a speedy review; I read too fast, sorry.

  • in step 2, you should check if y is low weight. It actually has a good chance to be! (and you take it into account in your probability computation)

This is done in the iteration pi = 0.

agreed

  • you could use (but this doesn't change much)
for gA in itertools.combinations(g, pi):
  for m in itertools.product(Fstar, repeat=pi):
    e = y - sum(mi*gAi for mi,gAi in zip(m,gA))

Yes, I could, but I don't consider it more elegant. If I were writing in OCaml I'd do it that way, obviously, but not in Python.

It also seems a bit faster in my tests, but you can keep it your way if you don't like this.

For the decoder:

  • in __init__ you doctest only some exceptions and not all of them

I think all __init__s and class docs now have the appropriate doctests.

Not all; e.g. raise ValueError("All elements of number_errors have to be" ... and raise ValueError("The provided number of errors should be at"... are missing, I think.

  • you also don't doctest a user defined algorithm (as allowed by your test isinstance(algorithm, InformationSetAlgorithm)

Thank you for insisting on a doctest of this: it revealed a bug and a design dilemma. The bug was that InformationSetAlgorithm didn't hash, which meant that AbstractLinearCode.encoder crashed since it tries to hash any arguments. This is potentially a problem with AbstractLinearCode.encoder which might need to be fixed at some point, but for this case I don't see a reason not to just supply a simple hash function for InformationSetAlgorithm.

The design dilemma is that when supplying an ISD algorithm, it is superfluous to also supply number_errors to InformationSetDecoder, since the algorithm knows its decoding interval. I thought about this and decided to leave it this way, i.e. the user must supply both number_errors and algorithm, and number_errors must match algorithm.decoding_interval(). My argument is that this use case is very rare and for special developer-users, but allowing the shortcut of supplying only algorithm would confuse the documentation and mess up the constructor.

That's one of the reasons my Algorithm classes were more lightweight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.

I think it's a better design, but I won't fight for this.

  • in known_algorithms I would make d a class variable _known_algorithms

No problem.

And finally, I would test this on bigger examples, e.g. codes.random_linear_code(GF(9), 40, 20) and a weight-6 error, or whatever you like, but I find the example more convincing when this is the only available option (you can't reasonably compute the coset leaders, for example).

I don't like to use a random code, since it might fail to decode (granted, with very low probability). I've added an example with the [59, 30, 17] QR code over GF(3), for which syndrome decoding or nearest-neighbor decoding is infeasible. The current information-set decoding, on the other hand, is extremely fast. I've flagged it as long, though it takes less than 0.3 s.

You could choose a random seed such that everything goes well. But I'm fine with your QR code.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor of 2).

[...]

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should use at least a factor of 5 in the field. This is also because unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

If it's not clear how to make it right, maybe it's better left for another ticket?

As for improving the calibration, I think I can improve things in both precision and clarity! Since we want to get the timing empirically, we don't need to separate time_information_set_steps and time_search_loop. You could just add some options:

def decode(self, r, max_iter=None, _calibrate=False):
    ...
    i = 0
    while i != max_iter:
        i += 1
        ...
                if not _calibrate: return r - e   # when we calibrate, never return

This would end up with less code duplication and a more precise estimate. Once again, feel free to leave this to another ticket if you want; it wouldn't break anything to change this later.

Any thoughts?

If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword.

I would add we don't even guarantee to return a codeword!

(The ticket now needs review again)

I still have to check the rendering of the documentation, but this gets close to a positive review from me.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1557ce9  *args should pass to decoder in `AbstractLinearCode.decode_to_code/message`
395ce38  Tests for all raised exceptions
5924a39  Improved calibration doc test
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from c6ddf43 to 5924a39

johanrosenkilde commented 6 years ago
comment:59

Replying to @sagetrac-ylchapuy:

I think all __init__s and class docs now have the appropriate doctests.

Not all; e.g. raise ValueError("All elements of number_errors have to be" ... and raise ValueError("The provided number of errors should be at"... are missing, I think.

You're right, sorry. I went over the entire file more carefully now.

That's one of the reasons my Algorithm classes were more lightweight than yours. They knew nothing and were pure functions accepting arguments. The decoder was the one knowing the parameters used, avoiding the duplication.

I think it's a better design, but I won't fight for this.

OK, let's not debate this further. If you feel strongly about it when you implement Stern-Dumer, you can implement the change there.

As a long doctest, we could check that on such a code the time estimate is not too far from reality (let's say within a factor of 2).

[...]

Barring an improvement on the calibration (which I'm not motivated to do), if we observe factors 2 and 3 quite often, we should use at least a factor of 5 in the field. This is also because unfortunate heavy background load could impact the doctest, further increasing the probability of false negatives.

If it's not clear how to make it right, maybe it's better left for another ticket?

Possibly. I generally don't see how tests of calibration and threshold selection algorithms fit into Sage's doctesting ecology.

I slightly improved my solution from before (running multiple calibrations). I feel pretty confident that the test as it stands is better than nothing and should not really cause false negatives. If you don't feel comfortable with it, we can also remove it again.

For an improvement in the calibration, I think I can improve things both in precision and clarity!

I see your idea. I originally wanted to avoid this for two reasons: 1) it pollutes the code of the real algorithm; and 2) it adds unnecessary logic to a busy inner loop. Probably 2) is totally negligible, though, and 1) is true, but the alternative is code duplication.

Let's leave this for another ticket. I'm waiting to see how Stern-Dumer performs in practice before spending lots of time optimising Lee-Brickell (I have other improvements in mind as well).

Keep in mind for your solution, though, that you also need some mechanism to avoid large values of p, where a single iteration would take very long.

If one chooses anything above d/2, the algorithm does not guarantee to return a nearest codeword.

I would add we don't even guarantee to return a codeword!

What do you mean? The algorithm might not return, but if it does, it surely returns a codeword. That it might not return should be covered by the red warning box below.
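(For context, the textbook fact behind the quoted sentence: if $y = c + e$ with $\operatorname{wt}(e) \le \lfloor (d-1)/2 \rfloor$, then $c$ is the unique closest codeword to $y$; above that radius several codewords can be equally close, so a returned codeword need not be a nearest one.)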

I still have to check the rendering of the documentation, but this gets close to a positive review from me.

Great.


New commits:

1557ce9  *args should pass to decoder in `AbstractLinearCode.decode_to_code/message`
395ce38  Tests for all raised exceptions
5924a39  Improved calibration doc test
2c00b582-cfe9-43b9-8124-fec470060704 commented 6 years ago
comment:60

Two low-hanging fruits: the bot complains about

Missing doctests in coding/information_set_decoder.py: 23 / 24 = 95%

it's the __hash__.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 5924a39 to 50018f6

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

50018f6  Fix doctest failure in doc. Add docstring and test to __hash__
johanrosenkilde commented 6 years ago
comment:63

Indeed. These omissions are fixed now.

2c00b582-cfe9-43b9-8124-fec470060704 commented 6 years ago
comment:64

OK, I finally took the time to build the doc and check it out. It builds without a fuss, and LGTM.

Your long doctest with codes.QuadraticResidueCode(59, GF(3)) takes around 150 ms on my machine; I don't know if I would call that long, but it's not a problem.

The bot is also green, so let's move forward: positive review :)

johanrosenkilde commented 6 years ago
comment:65

Great! Thanks for the cooperation - I think this has become a really nice addition to Sage. I'm looking forward to seeing your implementation in #23244.

vbraun commented 6 years ago
comment:66
sage -t --long src/sage/coding/information_set_decoder.py
**********************************************************************
File "src/sage/coding/information_set_decoder.py", line 553, in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
Failed example:
    A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
Expected:
    True
Got:
    False
**********************************************************************
1 item had failures:
   1 of  15 in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
    [170 tests, 1 failure, 5.71 s]
kiwifb commented 6 years ago
comment:67

Replying to @vbraun:

sage -t --long src/sage/coding/information_set_decoder.py
**********************************************************************
File "src/sage/coding/information_set_decoder.py", line 553, in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
Failed example:
    A.time_estimate()/5 < avg and avg < A.time_estimate() * 5 # long time
Expected:
    True
Got:
    False
**********************************************************************
1 item had failures:
   1 of  15 in sage.coding.information_set_decoder.LeeBrickellISDAlgorithm.calibrate
    [170 tests, 1 failure, 5.71 s]

I couldn't get that on my setup while repeating the test 1000 times.

johanrosenkilde commented 6 years ago
comment:68

Hmm, we were also in doubt about that test, as you can see from the discussion above. I had believed it was fairly stable now, but evidently timing behaviour on the build bots is not very dependable.

So perhaps we should simply remove the test - what do you think? Any alternative suggestions for how we could test the calibration code?

vbraun commented 6 years ago
comment:69

  • Don't test that the timings have particular values
  • Do test that timings can be collected
  • Do test that your code makes the right decisions when you specify timings

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

eb2efb3  Replace brittle calibration test with a test for proper selection
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 50018f6 to eb2efb3

johanrosenkilde commented 6 years ago
comment:71

Replying to @vbraun:

  • Don't test that the timings have particular values
  • Do test that timings can be collected
  • Do test that your code makes the right decisions when you specify timings

OK, I've pushed a patch removing the offending test and instead making a small refactor so that we can test the last item on your list.

It seems a pity, however, that there is no good way to test that the calibrated timings have anything to do with reality.
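The decision step is now essentially a pure function, sketched here with a made-up name (the real method on the branch may be named differently), which a doctest can feed fabricated timings:

def _select_search_size(estimates):
    # estimates: dict mapping search size p -> estimated seconds per decoding;
    # pure decision logic, so a doctest can pass in fake numbers
    return min(estimates, key=estimates.get)

# e.g. _select_search_size({0: 2.0, 1: 0.5, 2: 1.3}) == 1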

2c00b582-cfe9-43b9-8124-fec470060704 commented 6 years ago
comment:73

OK, let's do it that way (and the failure in the docbuild is unrelated).

LGTM

vbraun commented 6 years ago

Changed branch from u/jsrn/information_set_decoding to eb2efb3