sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.41k stars 475 forks source link

adding Delsarte bound for codes #12418

Closed dimpase closed 11 years ago

dimpase commented 12 years ago

Delsarte bound for codes, aka Linear Programming bound, is easy to implement in Sage.

To work well in big dimensions, one needs an arbitrary precision LP solver. We use an LP backend to PPL, which is available in Sage since #12533.

Apply

Depends on #12533 Depends on #13650

CC: @sagetrac-jpang @kini @wdjoyner @nathanncohen @ppurka @ptrrsn

Component: coding theory

Author: Dmitrii Pasechnik

Reviewer: Frédéric Chapoton, Punarbasu Purkayastha

Merged: sage-5.12.beta1

Issue created by migration from https://trac.sagemath.org/ticket/12418

dimpase commented 12 years ago

Description changed:

--- 
+++ 
@@ -63,4 +63,6 @@
 weight enumerator of our code), then the output is the corr. dimension, i.e. 
 `floor(log(bound, q_base))`.

+One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available.  We are planning to add an [PPL](http://www.sagemath.org/doc/reference/sage/libs/ppl.html) backend (PPL is a polyhedral library with a Cython interface and arbitrary precision rational arithmetic, already a standard package in Sage). Once a ticked on this is opened, we'll cite it here. 

+
6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 12 years ago
comment:5

(GLPK can solve non-integer rational LP. It is not exposed, but may not be too hard either)

ppurka commented 12 years ago
comment:7

The function named delsarte_bound should be renamed to something like delsarte_bound_hamming_space. This is so that in future other functions like delsarte_bound_johnson_space, delsarte_bound_permutation_space, etc can be added easily, without having inconsistencies in naming.

dimpase commented 12 years ago

Description changed:

--- 
+++ 
@@ -63,6 +63,6 @@
 weight enumerator of our code), then the output is the corr. dimension, i.e. 
 `floor(log(bound, q_base))`.

-One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available.  We are planning to add an [PPL](http://www.sagemath.org/doc/reference/sage/libs/ppl.html) backend (PPL is a polyhedral library with a Cython interface and arbitrary precision rational arithmetic, already a standard package in Sage). Once a ticked on this is opened, we'll cite it here. 
+One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is (almost - i.e. the corresponding ticket is still not 100% ready, as reviewers think) taken care of by #12533, which should be made a dependence for this ticket.
dimpase commented 12 years ago

Dependencies: #12533

dimpase commented 12 years ago

Description changed:

--- 
+++ 
@@ -1,68 +1,7 @@
 Delsarte bound for codes, aka Linear Programming bound, is easy to implement in Sage.
-Here is a quick and dirty code that does it; so the problem would be to integrate this properly.

-```
-def Kra(n,q,l,i): # K^{n,q}_l(i), i.e Krawtchouk polynomial
-   return sum([((-1)**j)*((q-1)**(l-j))*binomial(i,j)*binomial(n-i,l-j) 
-                              for j in range(l+1)])
+See the attached prototype implementation for details.

-def roundres(x): # this is a quick and unsafe way to round the result...
-   import math
-   tol = 0.0001
-   if math.ceil(x)-x < tol:
-      return int(math.ceil(x))
-   if x-math.floor(x) < tol:
-      return int(math.floor(x))
-   return x
-
-# @cached_function
-def delsarte_bound(n, q, d, d_star=1, q_base=0, return_log=True,\
-                    isinteger=False, return_data=False):
-   p = MixedIntegerLinearProgram(maximization=True)
-   A = p.new_variable(integer=isinteger) # A>=0 is assumed
-   p.set_objective(sum([A[r] for r in range(n+1)])) 
-   p.add_constraint(A[0]==1)
-   for i in range(1,d):
-      p.add_constraint(A[i]==0)
-   for j in range(1,n+1): 
-      rhs = sum([Kra(n,q,j,r)*A[r] for r in range(n+1)])
-      if j >= d_star: 
-        p.add_constraint(0*A[0] <= rhs) 
-      else: # rhs is proportional to j-th weight of the dual code
-        p.add_constraint(0*A[0] == rhs) 
-   try:
-      bd=p.solve()
-   except sage.numerical.mip.MIPSolverException, exc:
-      print "Solver exception: ", exc, exc.args
-      if return_data:
-         return A,p,False
-      return False
-   if q_base > 0: # rounding the bound down to the nearest power of q_base,
-                  # for q=q_base^m
-#      qb = factor(q).radical()
-#      if len(qb) == 1:
-#         base = qb.expand()
-#         bd = base^int(log(bd, base=base))
-         if return_log: 
-#            bd = int(log(roundres(bd), base=q_base)) # unsafe: 
-                                                      # loss of precision
-            bd = roundres(log(bd, base=q_base))
-         else:
-#            bd = q_base^int(log(roundres(bd), base=q_base))
-            bd = q_base^roundres(log(bd, base=q_base))
-   if return_data:
-      return A,p,bd
-   return bd
-```
-
-d_star is the minimal distance  of the dual code (if it exists at all) 
-If q_base is 0, just compute the upper bound.
-If q_base is >0, it is assumed to be a prime power, and the code assumed
-to be additive over this field (i.e. the dual code exists,and its weight enumerator is
-obtained by applying `MacWilliams transform` --- the matrix A of the LP times the
-weight enumerator of our code), then the output is the corr. dimension, i.e. 
-`floor(log(bound, q_base))`.
-
-One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is (almost - i.e. the corresponding ticket is still not 100% ready, as reviewers think) taken care of by #12533, which should be made a dependence for this ticket.
+One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is (almost - i.e. the corresponding ticket is still not 100% ready, as reviewers think) taken care of by #12533, which is a dependence for this ticket.
dimpase commented 12 years ago

a prototype implementation

dimpase commented 12 years ago
comment:11

Attachment: delsarte_bound.py.gz

dimpase commented 12 years ago

Description changed:

--- 
+++ 
@@ -2,6 +2,6 @@

 See the attached prototype implementation for details.

-One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is (almost - i.e. the corresponding ticket is still not 100% ready, as reviewers think) taken care of by #12533, which is a dependence for this ticket.
+One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is  taken care of by #12533, which is a dependence for this ticket.

-
+Apply [this patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
ppurka commented 11 years ago
comment:12

I think the Krawtchouk polynomial could be computed explicitly by not making repeated calls to binomial. This should speed it up. I have something like this in mind:

def Krawtchouk2(n,q,l,i):
    # Use the expression in equation (55) of MacWilliams & Sloane, pg 151
    # We write jth term = some_factor * (j-1)th term
    kraw = jth_term = (q-1)**l * binomial(n, l) # j=0
    for j in range(1,l+1):
        jth_term *= -q*(l-j+1)*(i-j+1)/((q-1)*j*(n-j+1))
        kraw += jth_term
    return kraw

n,q,l,i = 10,8,7,5
timeit('Krawtchouk2(n,q,l,i)')
timeit('Krawtchouk (n,q,l,i)')
print Krawtchouk2(n,q,l,i) == Krawtchouk(n,q,l,i)

625 loops, best of 3: 53.3 µs per loop
625 loops, best of 3: 205 µs per loop
True

I noticed that sage handles nonintegral components in the binomial, so the expression for the Krawtchouk already works with nonintegral n and x.

n,q,l,i = 10.6,8,7,5.4
#timeit('Krawtchouk3(n,q,l,i)')
timeit('Krawtchouk2(n,q,l,i)')
timeit('Krawtchouk (n,q,l,i)')
print Krawtchouk2(n,q,l,i) == Krawtchouk(n,q,l,i)
print Krawtchouk2(n,q,l,i), Krawtchouk(n,q,l,i)

625 loops, best of 3: 382 µs per loop
125 loops, best of 3: 4.74 ms per loop
False
93582.0160001147 93582.0159999999
ppurka commented 11 years ago
comment:13

Can you mention when it guarantees a weight spectrum? Would doing an ILP make it a proper weight spectrum?

   - ``return_data`` -- if ``True``, return a weights vector, which actually need not 
     be a proper weight enumerator, or even have integer entries, and the LP. 

Also, I think the term weight enumerator refers to the weight enumerator polynomial. Perhaps using weight distribution or distance distribution is more appropriate here.

dimpase commented 11 years ago
comment:14

Replying to @ppurka:

I think the Krawtchouk polynomial could be computed explicitly by not making repeated calls to binomial. This should speed it up.

It's probably even faster to compute by using recurrence relations, but I don't think it's important here: LP solving timing clearly dominates the rest.

By the way, would it be interesting to include an option to compute bounds on codes with a prescribed forbidden set of distances, rather than just [1..d] ? It's a trivial add-on. I did this in a prototype code for Johnson schemes, here.

Any other interesting schemes to include? (Johnson scheme takes care of constant weight binary codes, as you know.)

dimpase commented 11 years ago

Description changed:

--- 
+++ 
@@ -4,4 +4,4 @@

 One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is  taken care of by #12533, which is a dependence for this ticket.

-Apply [this patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
+Apply [attachment: 12418_delsart_bounds.patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
dimpase commented 11 years ago

Changed dependencies from #12533 to #12533, #13650

ppurka commented 11 years ago
comment:16

Replying to @dimpase:

Replying to @ppurka:

I think the Krawtchouk polynomial could be computed explicitly by not making repeated calls to binomial. This should speed it up.

It's probably even faster to compute by using recurrence relations, but I don't think it's important here: LP solving timing clearly dominates the rest.

The point is that someone might try to use these polynomials more generally in a separate context. They are not defined anywhere else in Sage, so anyone who tries to use them will use this one.

By the way, would it be interesting to include an option to compute bounds on codes with a prescribed forbidden set of distances, rather than just [1..d] ? It's a trivial add-on. I did this in a prototype code for Johnson schemes, here.

Wow! You have the Johnson scheme too?! Sure, add them all in!! Do you use the polynomials used by Aaltonen?

Any other interesting schemes to include? (Johnson scheme takes care of constant weight binary codes, as you know.)

LP for permutation codes would be interesting. There are not too many good results known there. IIRC, it uses Charlier polynomials.

EDIT: FWIW, it is Charlier polynomials.

ppurka commented 11 years ago
comment:17

Oh, I forgot to add. Forbidden distances will be nice as well. I think only some special cases achieve the closed form solutions. In general, still not much is known. It looks like you only need to drop distances d_1,...,d_m instead of 1,...,d, right?

How about introducing an extra keyword called forbidden_distances or exclude_distances, which defaults to 1,...,d?

dimpase commented 11 years ago
comment:18

Replying to @ppurka:

Replying to @dimpase:

Replying to @ppurka:

I think the Krawtchouk polynomial could be computed explicitly by not making repeated calls to binomial. This should speed it up.

It's probably even faster to compute by using recurrence relations, but I don't think it's important here: LP solving timing clearly dominates the rest.

The point is that someone might try to use these polynomials more generally in a separate context. They are not defined anywhere else in Sage, so anyone who tries to use them will use this one.

Actually, I have most discrete orthogonal polynomials arising in the classical P- and Q- polynomial schemes implemented, although it's neither polished nor optimized.

[

By the way, would it be interesting to include an option to compute bounds on codes with a prescribed forbidden set of distances, rather than just [1..d] ? It's a trivial add-on. I did this in a prototype code for Johnson schemes, here.

Wow! You have the Johnson scheme too?! Sure, add them all in!! Do you use the polynomials used by Aaltonen?

I use the descriptions in the book "Algebraic Combinatorics I" by E.Bannai and T.Ito.
Something known as Eberlein polynomials.

Any other interesting schemes to include? (Johnson scheme takes care of constant weight binary codes, as you know.)

LP for permutation codes would be interesting. There are not too many good results known there. IIRC, it uses Chebychev polynomials(?).

yes, this should be perfectly doable.

ppurka commented 11 years ago
comment:19

Replying to @dimpase:

I use the descriptions in the book "Algebraic Combinatorics I" by E.Bannai and T.Ito.
Something known as Eberlein polynomials.

That's for the binary case. For the q-ary case it is a product of Krawtchouk and Hahn, if I recall properly. Let me fish out the paper; I will send it to you.

dimpase commented 11 years ago
comment:20

Replying to @ppurka:

I think the Krawtchouk polynomial could be computed explicitly by not making repeated calls to binomial. This should speed it up. I have something like this in mind:

This can be further optimized by using Horner's rule. I'll do this, and leave the rest (other schemes) for another ticket, OK?

ppurka commented 11 years ago
comment:21

Replying to @dimpase:

This can be further optimized by using Horner's rule. I'll do this, and leave the rest (other schemes) for another ticket, OK?

Yes, yes. One space/polynomial at a time. Just Hamming space in this ticket is OK.

dimpase commented 11 years ago

Attachment: update.diff.gz

update of the patch - for reviewing only

dimpase commented 11 years ago
comment:22

Please review. I added a Kravchouck speedup, and cleaned up docstrings as requested.

dimpase commented 11 years ago

Description changed:

--- 
+++ 
@@ -1,7 +1,5 @@
 Delsarte bound for codes, aka Linear Programming bound, is easy to implement in Sage.

-See the attached prototype implementation for details.
-
-One obstacle for this to work well in big dimensions is a lack of arbitrary precision LP solver backend available in Sage. This is  taken care of by #12533, which is a dependence for this ticket.
+To work well in big dimensions, one needs an arbitrary precision LP solver. We use an LP backend to PPL, which is available in Sage since #12533.

 Apply [attachment: 12418_delsart_bounds.patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
dimpase commented 11 years ago
comment:23

improved docstrings for return_data

ppurka commented 11 years ago
comment:24

Thanks for the update. I have some general comments. Will look into this patch in more detail too.

  1. There are lot of trailing whitespaces. The patchbot will complain. :)
  2. What is the point of this portion of the code? Can't it be replaced by kk = ZZ(log(q, q_base))?
   kk = 0
   while q_base**kk < q:
      kk += 1
  1. There is another bit further down:
      m = -1
      while q_base**(m+1) < bd:
        m += 1
      if q_base**(m+1) == bd:
        m += 1
  1. Also, I don't think this deprecation is necessary any more. The ticket you cited is over 2 years old.
-def dimension_upper_bound(n,d,q):
+@rename_keyword(deprecation=6094, method="algorithm")
+def dimension_upper_bound(n,d,q,algorithm=None):

Edit: Sorry. It seems ZZ doesn't work but int(log(..)) does work.

dimpase commented 11 years ago
comment:25

Replying to @ppurka:

Thanks for the update. I have some general comments. Will look into this patch in more detail too.

  1. There are lot of trailing whitespaces. The patchbot will complain. :)

I've just uploaded an update with all the trailing spaces removed.

  1. What is the point of this portion of the code? Can't it be replaced by kk = ZZ(log(q, q_base))?
   kk = 0
   while q_base**kk < q:
      kk += 1
  1. There is another bit further down:
      m = -1
      while q_base**(m+1) < bd:
        m += 1
      if q_base**(m+1) == bd:
        m += 1

this came from an older piece of plain Python. Then I struggled with log() quite a bit, and finally gave up on it and rolled my own.

  1. Also, I don't think this deprecation is necessary any more. The ticket you cited is over 2 years old.
-def dimension_upper_bound(n,d,q):
+@rename_keyword(deprecation=6094, method="algorithm")
+def dimension_upper_bound(n,d,q,algorithm=None):

I blindly copied from sage/coding/code_bounds.py Should the whole file be cleaned out of these? By the way, plural methods slipped through this decorator...

Edit: Sorry. It seems ZZ doesn't work but int(log(..)) does work.

dimpase commented 11 years ago
comment:26

Replying to @ppurka:

Thanks for the update. I have some general comments. Will look into this patch in more detail too.

  1. What is the point of this portion of the code? Can't it be replaced by kk = ZZ(log(q, q_base))?

yes, it is basically what it does; this is also needed to do a Gomory-style cut which might be available due to the corresponding rounding.

ppurka commented 11 years ago
comment:27

Ok. I am finally getting some time to look into this again. Here are some comments.

  1. Krawtchouk is missing a doctest.
  2. There is no need of \ in def delsarte_bound_...
  3. There are still many trailing whitespaces. If you use vim then you can try this command :%s/[ ]\+$//
  4. - ``q`` -- the length of the alphabet -- this should be "the size of the alphabet" Also I think the the options q and d can be swapped to be in the order in which they appear in the function definition.
  5. - ``solver`` -- the LP/ILP solved --- this should be solver. What other solvers are present? They should be listed as options in this variable.
  6. The bound on the size of the F_2-codes --- this should be `F_2`
  7. - ``return_data`` --- As it is currently written in the description, it is unclear what this is returning. It looks like the first component is an MIP variable (and not a vector), and the second component is the MILP itself. At a first glance in your doctests, it looks like we need to understand how the MILP works in order to get the values of the weight distribution. Should we just return the weight distribution itself? A weight distribution vector can be returned by doing p.get_values(a).values().
  8. The backend should automatically handle "isinteger=True" at least so that it is functional. Currently I get a very generic Exception (why is this only "Exception"?). Maybe it is automatically handled in a later version of sage? I will have to check against 5.6.beta1 and higher:
Exception: This backend does not handle integer variables ! Read the doc !
  1. Why do we need the second function? We already discussed this off-ticket - I will wait for your generic function.
  2. (**this option is currenly disabled, cf. trac #13667**). --- it should be :trac:`13667` since it will be in the documentation.
  3. Parameter "algorithm" has the same meaning as in codesize_upper_bound() --- This should be :func:`codesize_upper_bound`.
  4. print "Wrong q_base=", q_base, " for q=", q, kk --- This can be formatted python3 style as
      print "Wrong q_base={} for q={} {}".format(q_base, q, kk)
  1. According to the patchbot this needs rebasing against some higher version of 5.6. I have only beta0 here; I will check it against rc1 after I have compiled that.
  2. EDIT: I forgot.. the deprecation notice should go. It has been two years. If code_bounds needs to be cleaned then I can do that.

EDIT: Update backticks in 6., 10., 11.

KPanComputes commented 11 years ago
comment:28

This patch needs to address referee's comments. I am changing this to needs_work in the meanwhile.

dimpase commented 11 years ago

Attachment: 12418_delsart_bounds.2.patch.gz

rebased for Sage 5.10 and fixed some outstanding issues

dimpase commented 11 years ago
comment:29

Replying to @ppurka:

rebased to Sage 5.10 and fixed the following:

  1. Krawtchouk is missing a doctest.
  2. There is no need of \ in def delsarte_bound_...
  3. There are still many trailing whitespaces. If you use vim then you can try this command :%s/[ ]\+$//
  4. - ``q`` -- the length of the alphabet -- this should be "the size of the alphabet" Also I think the the options q and d can be swapped to be in the order in which they appear in the function definition.

I swapped the docstrings instead.

  1. - ``solver`` -- the LP/ILP solved --- this should be solver. What other solvers are present? They should be listed as options in this variable.
  2. The bound on the size of the F_2-codes --- this should be `F_2`

(and other F_)

  1. The backend should automatically handle "isinteger=True" at least so that it is functional. Currently I get a very generic Exception (why is this only "Exception"?). Maybe it is automatically handled in a later version of sage? I will have to check against 5.6.beta1 and higher:

The PPL backend does not handle ILP. One might want to improve the way it reports this error, but not on this ticket.

Exception: This backend does not handle integer variables ! Read the doc !
  1. Parameter "algorithm" has the same meaning as in codesize_upper_bound() --- This should be :func:`codesize_upper_bound`.
  2. According to the patchbot this needs rebasing against some higher version of 5.6. I have only beta0 here; I will check it against rc1 after I have compiled that.

Rebased.

dimpase commented 11 years ago

Work Issues: code refactoring

ppurka commented 11 years ago
comment:31

Replying to @dimpase:

Work issues set to code refactoring

Oh good. I was wondering if you didn't want to do that in this patch.

dimpase commented 11 years ago

rebased for Sage 5.10 and fixed some outstanding issues

dimpase commented 11 years ago

Attachment: 12418_delsart_bounds.patch.gz

Attachment: 12418_refactored.diff.gz

code refactoring - for review only

dimpase commented 11 years ago

Changed work issues from code refactoring to none

fchapoton commented 11 years ago
comment:33

instructions for the bot:

apply 12418_delsart_bounds.patch

fchapoton commented 11 years ago
comment:34
dimpase commented 11 years ago
comment:35

Replying to @fchapoton:

  • doctest covering is not 100%

hmm, what function do you mean? there is an internal function which is not exported; I don't think it needs a doctest, does it?

  • you can use the wikipedia role for example :wikipedia:`Togo`

where?

  • it would be better to lazy_import the new functions, maybe ?

I have no idea. Is there a stated policy on this? Having said this, perhaps it's better to re-lazy_import the whole sage/coding, something for another ticket?

fchapoton commented 11 years ago
comment:36

you can write :wikipedia:`Kravchuk_polynomials`

dimpase commented 11 years ago

Attachment: 12418_update.patch.gz

update to fix the remaining issues

dimpase commented 11 years ago

Description changed:

--- 
+++ 
@@ -2,4 +2,7 @@

 To work well in big dimensions, one needs an arbitrary precision LP solver. We use an LP backend to PPL, which is available in Sage since #12533.

-Apply [attachment: 12418_delsart_bounds.patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
+Apply
+
+* [attachment: 12418_delsart_bounds.patch](https://github.com/sagemath/sage-prod/files/10654716/12418_delsart_bounds.patch.gz)
+* [attachment: 12418_update.patch](https://github.com/sagemath/sage-prod/files/10654718/12418_update.patch.gz)
ppurka commented 11 years ago
comment:38

The patch looks OK to me now. Though I would have really liked the Q-matrix to be passed to the linear program (for instance to the delsarte LP building function), this can be done in a future patch when LP for other spaces are introduced.

ppurka commented 11 years ago

Author: Dmitrii Pasechnik

ppurka commented 11 years ago

Reviewer: Frédéric Chapoton, Punarbasu Purkayastha

jdemeyer commented 11 years ago

Merged: sage-5.12.beta1