is_difference_matrix - Githubissues

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Spent a lifetime on that because I had not noticed that I was first enumerating elements using a list L, then enumerating the same elements using {x:i for i,x in enumerate(L)}. And of course that was not the same ordering T_T

Nathann

CC: @videlec

Component: combinatorial designs

Author: Nathann Cohen

Branch/Commit: dee4275

Reviewer: Vincent Delecroix

Issue created by migration from https://trac.sagemath.org/ticket/16797

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Branch: u/ncohen/16797

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Commit: e76b5ab

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e76b5ab trac #16797: is_difference_matrix

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from e76b5ab to 8f10d94

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8f10d94 trac #16797: is_difference_matrix

videlec commented 10 years ago

comment:4

Hi Nathann,

Why do you use this strange convention about rows/cols? From the Handbook chapter 17, there are k rows and lambda |G| columns...

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:5

Why do you use this strange convention about rows/cols? From the Handbook chapter 17, there are k rows and lambda |G| columns...

Because in the OA it is the columns that must be orthogonal ...

Nathann

videlec commented 10 years ago

comment:6

Hum... from the documentation of OA_from_quasi_difference_matrix

    Let `G` be a group of order `g`. A *difference matrix* `M` is a `k \times g`
    matrix with entries from `G` such that for any `1\leq i < j < k` the set
    `\{d_{il}-d_{jl}:1\leq l \leq g\}` is equal to `G`.

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:7

I think that it will be easier to work with k columns. Look at the OA code, we spend our lifetime creating rows for every set of a PBD, or developping rows according to some groups... For instance look at the doc you just wrote for the helper function : you have one row in your matrix, and you let this H hyperplane act on it to generate new rows.

If you want to add columns instead it becomes a mess, you have to extend each column instead of just adding a row to the matrix.

By the way, many cals to OA_from_quasi_difference_matrix are already preceded by a M=zip(*M) # transpose M. Actually, I am looking at database.py and really a LOT of code would be simplified by changing that: look how often you have code like that to create the matrices:

for <something>:
    M[0].append(e)
    M[1].append(f)
    ....

If all that was the transpose instead, we would just have to write:

for <something>:
    M.append([e,f,...])

What do you think ?

Nathann

videlec commented 10 years ago

comment:8

I am not against having the difference matrix the other way around (actually I do not care). But then:

rewrite the documentation appropriately and warn the reader that the convention is different from the Handbook,
modify all functions that use difference matrix

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:10

Okay, then I will do this after the current patches have been reviewed. Otherwise we will have to deal with countless conflicts again. I write this to follow the current convention in Sage and will change it later, along with the implemented code.

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

aa097f6 trac #16797: Fit with the current definition until we can change everything at once

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 8f10d94 to aa097f6

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:13

I added a small check before M = zip(*M) because this operation takes the min of the length of the lists in M:

sage: zip([1],[2],[3,4,5,6])
[(1, 2, 3)]

Nathann

videlec commented 10 years ago

comment:14

Hello,

1) Why do you transpose the matrix to check the definition? You would avoid many multiplications in x_minus_y[M_c[ii*K+i]][M_c[ii*K+j]].

2) I do not like the order of the argument. For orthogonal arrays it is OA, k, n, but here it is G, k, M... I would prefer G,M,k.

3) We can win a lot of time with a C array for x_minus_y and seen_values. It is at u/vdelecroix/16797

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:15

Yo !

1) Why do you transpose the matrix to check the definition?

When we will change the definition to have k columns instead of k rows we will just have to remove the block of lines before the transposition.

You would avoid many multiplications in x_minus_y[M_c[ii*K+i]][M_c[ii*K+j]].

I am not sure that we pay these multiplications. Because of your experience with is_sum_of_square and my useless optimization in is_orthogonal_array

2) I do not like the order of the argument. For orthogonal arrays it is OA, k, n, but here it is G, k, M... I would prefer G,M,k.

If you prefer ...

3) We can win a lot of time with a C array for x_minus_y and seen_values. It is at u/vdelecroix/16797

I was not sure that it would make a difference, thus I kept the Cython version. Do you have timings ?

Nathann

videlec commented 10 years ago

comment:16

old

sage: from sage.combinat.designs.designs_pyx import is_difference_matrix
sage: q = 5**3
sage: F = GF(q,'x')
sage: M = [[x*y for y in F] for x in F]
sage: timeit("is_difference_matrix(F,q,M)")
5 loops, best of 3: 92 ms per loop

new

sage: from sage.combinat.designs.designs_pyx import is_difference_matrix
sage: q = 5**3
sage: F = GF(q,'x')
sage: M = [[x*y for y in F] for x in F]
sage: timeit("is_difference_matrix(F,q,M)")
25 loops, best of 3: 11.9 ms per loop

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:17

Hmmmmmmmmm.. I wonder how much come from the C version of x_minus_y instead of Cython. I'll do a test.

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:18

HMmmmmm O_o

Well, with the Cython version of x_minus_y (at public/16797) your example takes 15ms, but what I get when I try with your branch is a segfault

sage: sage: from sage.combinat.designs.designs_pyx import is_difference_matrix
sage: sage: q = 5**3
sage: sage: F = GF(q,'x')
sage: sage: M = [[x*y for y in F] for x in F]
sage: sage: timeit("is_difference_matrix(F,q,M)")
------------------------------------------------------------------------
/home/ncohen/.Sage/local/lib/libcsage.so(print_backtrace+0x31)[0x7fca2ebe7af1]
/home/ncohen/.Sage/local/lib/libcsage.so(sigdie+0x1e)[0x7fca2ebe7c85]
/home/ncohen/.Sage/local/lib/libcsage.so(sage_signal_handler+0x1cb)[0x7fca2ebe7545]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fca33d788d0]
...

Nathann

videlec commented 10 years ago

comment:19

Stupid error in the malloc. I guess that on your computer sizeof(int) and sizeof(int *) are different.

videlec commented 10 years ago

comment:20

(updated on my branch)

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:21

Ahaah Okay, 9.12ms : you win.

God it's hard to know when to optimize :-p

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:22

Hmmm... The way you manage memory, you don't free any of the x_minus_y if there is a problem during the allocation of G_seen. By the way you could spare yourself a loop of alloc/free if you do not allow a vector of length G_card for each entry of x_minus_y but rather alloc a big table of size G_card**2 and then let int ** x_minus_y point toward parts of it.

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:23

Oh, and could you also give more info when an element does not appear exactly lambda times ? The way you wrote the code you can say which element it is and how many times it appeared.

Nathann

videlec commented 10 years ago

comment:24

I am on it!!

Vincent

videlec commented 10 years ago

comment:25

done at u/vdelecroix/16797.

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:26

You still don't deallocate x_minus_y when there is a problem with other mallocs.

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:27

Okay I'll do it.

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:28

Hey I just re-read your comment above: if we want to copy is_orthogonal_array's input we should make M be the first parameter, shouldn't we ?

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from aa097f6 to b78c347

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`5d21b8c`	`trac #16797: int * and ** instead of Python list`
`8f29bd1`	`trac #16797: change int to int * in a malloc`
`811e3e9`	`trac #16797: better malloc + better error msg`
`1b8c2e3`	`trac #16797: More compact mallocs`
`b78c347`	`trac #16797: Reorder the arguments to copy is_orthogonal_array`

videlec commented 10 years ago

comment:31

Replying to @nathanncohen:

Hey I just re-read your comment above: if we want to copy is_orthogonal_array's input we should make M be the first parameter, shouldn't we ?

I don't know. For difference family it is D,G and not G,D.

I like the rest... do what you want with the input and set to positive review.

(I am going for a bike run.... back in 3 hours)

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:32

Yoooooo !

I like the rest... do what you want with the input and set to positive review.

(I am going for a bike run.... back in 3 hours)

Okayyyy, have fun ! And thanks for this patch !

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Reviewer: Vincent Delecroix

videlec commented 10 years ago

comment:33

I have a problem

sage: B
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 2, 3, 4, 2, 3, 4, 0, 1],
 [0, 2, 4, 1, 3, 3, 0, 2, 4, 1],
 [0, 3, 1, 4, 2, 3, 1, 4, 2, 0],
 [0, 4, 3, 2, 1, 2, 1, 0, 4, 3],
 [0, 1, 4, 4, 1, 0, 2, 3, 3, 2],
 [1, 0, 1, 4, 4, 2, 0, 2, 3, 3],
 [4, 1, 0, 1, 4, 3, 2, 0, 2, 3]]
sage: sage: is_difference_matrix(B,GF(5),8,2,verbose=True)
The matrix has 8 columns and lambda.|G|=2.5=10
False

There are 8 rows and 10 columns...

Vincent

videlec commented 10 years ago

comment:34

All right, it is because M_nrows is initialized to soon. I will add the example in the doc.

Vincent

videlec commented 10 years ago

comment:35

See at u/vdelecroix/16797.

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:37

Thaaanks ! Then it can get in.

By the way, I love "the future". Back in the old phone modem days, I wouldn't have thought that I'd be reviewing a trac ticket from a laptop in a car on the highway using a phone as a modem :-PPP

Nathann

vbraun commented 10 years ago

comment:39

Merge conflict, please fix

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from b78c347 to dee4275

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

dee4275 trac #16797: Merged with 6.4.beta0

vbraun commented 10 years ago

Changed branch from u/ncohen/16797 to dee4275

sagemath / sage

is_difference_matrix #16797