sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.41k stars 475 forks source link

Faster is_orthogonal_array #16295

Closed 6bdad4c1-1e26-4f2f-a442-a01a2292c181 closed 10 years ago

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

As the ticket claims this ticket implements a Cython version of is_orthogonal_array, which is then called by is_transversal_design and are_mutually_orthogonal_latin_squares.

Also sets a check=False in the code which slowed things down. As a resultthe tests are checked faster in the designs/ folder !

Nathann

Depends on #16236

CC: @videlec @KPanComputes @dimpase

Component: combinatorial designs

Author: Nathann Cohen

Branch/Commit: 02f1247

Reviewer: Vincent Delecroix, Brett Stevens

Issue created by migration from https://trac.sagemath.org/ticket/16295

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Branch: u/ncohen/16295

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Last 10 new commits:

31a53f2trac #16235: update the MOLS table
1a13ff8trac #16241: New MOLS shared by Ian Wanless
2af2acftrac #16241: missing links and tests
7e77f90trac #16241: Broken doctests
83b0d2ctrac #16241: Merged with updated #16235
c212bf9trac #16241: Broken doctests
e655b36trac #16236: Five mutually orthogonal latin squares of order 12
b516266trac #16236: Merged with updated #16241
f5339fatrac #16236: Broken doctests
0d5c222trac #16295 : Faster is_orthogonal_array
6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

Commit: 0d5c222

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:4

Nathann,

The code looks good and I have tested on an example different from your doc test.

The following run fine:

make, doc-test, docbuild html, code coverage is 100%

I had trouble making the pdf documentation and then went back to devel branch to check it worked Ok there. It did and then when I moved back to u/ncohen/16295 it worked. I am not sure what happened

I think this needs two things:

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:5

Hellllooooo !!

I had trouble making the pdf documentation and then went back to devel branch to check it worked Ok there. It did and then when I moved back to u/ncohen/16295 it worked. I am not sure what happened

Yeah, we had the same problems just one hour ago when Volker tried to merge #16277 or #16235... And it does not happen again because Sphinx is totall unreliable. I expect it to ocome from the dependencies, not from this ticket

I think this needs two things:

  • it needs to be as consistent as possible with the purely python version inside orthogonal_arrays.py. That is, it will react the same way on the same input. This change should be done and then a doc-test written to verify it.

Once this patch is applied there is no python version inside of orthogonal_arrays.py, as this patch removes it. We can add any doctest you want, but what could it be ? Remember that this function is already tested a LOT by all constructions of OA/TD/MOLS as it is run on those instances every time one is built.

  • it needs to have a doctest that demonstrates the speed improvement that cycthon gives over the purely python version inside orthogonal_arrays.py.

Once more, this function removes the purely python one.

Nathann

videlec commented 10 years ago
comment:6

Hi Nathann,

You should avoid using any in Cython. This is not properly cythonized. The following is better by 15%:

for i in range(len(OA)):
    if len(OA[i]) != k:
        if verbose:
            print {"OA"   : "Some row does not have length "+str(k),
                   "MOLS" : "The number of matrices is not "+str(k)}[terminology]

If you do so, you need to move up the cdef int i,j,l.

Vincent

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:7

Yoooooooo !!

You should avoid using any in Cython. This is not properly cythonized. The following is better by 15%:

It is good to know, but I really do not think that this is the critical part here :-D

Nathann

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:8

Sorry, I had changed branches and then opened orthogonal_array.py in an editor and so thought is_orthogonal_array was still there.

Here is my timing data

with Cython:

sage: time is_orthogonal_array(OA,8,9)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 27.2 µs
False
sage: time is_orthogonal_array(OA,12,11)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 239 µs
True

with Python:

sage: time is_orthogonal_array(OA,8,9,2)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 29.1 µs
False
sage: time is_orthogonal_array(OA,12,11,2)
CPU times: user 12 ms, sys: 4 ms, total: 16 ms
Wall time: 13.1 ms
True

this shows a significant improvement in speed.

However I think this needs to be fixed:

sage: is_orthogonal_array(OA,12,11,3)
True

If it is passed t=3 it should at least alert the user that it is going to override this with t=2

Here is what I think should be done:

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:9

p.s. where is sage_free documented?

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:10

Hello !

  • fix t>2 bug

Done.

  • remove "any" as Vincent suggested

Guys, it is very important that you understand that this is what is called "taste", and that it makes absolutely no difference in the runtimes. If you want to make changes like that, it would be cool to implement it yourself instead of asking me to do it for you.

I just did it, because I need the patch to get in quick.

  • implement a doc test for
    • all your bad input/format tests

Done.

  • memory test

I cannot test that.

  • t>2 test

Done.

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. Last 10 new commits:

411a759trac #16286: Merged with updated #16277
11eff2ctrac #16235: Merged with 6.2
5a0e3fetrac #16235: Merged with #16231
c0b13c4trac #16235: Merged with updated #16286
9fb0f62trac #16241: useless if condition in MOLS constructor
67ab2d2trac #16241: Merged with updated #16235
2d7da8atrac #16236: Merged with updated #16241
973b926trac #16236: Merged with #16272
37681e2trac #16295 : Faster is_orthogonal_array
839acc5trac: Doctests and review
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 0d5c222 to 839acc5

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 839acc5 to 994324e

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

994324etrac #16295: Doctests and review
6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:13

Brett : the performance improvements are particularly nice for LARGE designs ! I implemented this because I wanted to check that we could indeed produce all the OA promised in our MOLS table, but some (in particularly prime powers, for k gets very large) took VERY long. With this, it is really faster and checking that the OA is an OA is not the bottleneck anymore when playing with designs.

Nathann

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:14

A bugfix. u=1 was disregarded by mistake in wilson_construction. Julian R. Abel spotted this from only looking at the MOLS table. That's scary O_O;;;;

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 994324e to b9f8b03

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

fc6de73trac #16295: Merged with 6.3.beta1
b9f8b03trac #16295: bugfix in wilson's construction
65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:16

Vincent,

What does it mean that "any" is not properly cythonized. Does this just mean it is not efficient or does it mean there could be problems with future versions of cython?

brett

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:17

To look at the new code for this patch, I updated master and develop. Then I tried

git checkout u/ncohen/16295
git pull --ff-only trac u/ncohen/16295

and recieved

From trac.sagemath.org:sage
 * branch            u/ncohen/16295 -> FETCH_HEAD
fatal: Not possible to fast-forward, aborting.

how do I update this branch?

thanks brett

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:18

As suggestedin the developers guide, I just ran

$git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1992 commits.
$ git reset --hard trac/master
fatal: ambiguous argument 'trac/master': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions

I am not sure what to do, the developers guide does not help here.

brett

dimpase commented 10 years ago
comment:19

Replying to @brettpim:

As suggestedin the developers guide, I just ran

$git checkout master

Why do you need this? This brings you to the stable Sage release (6.2). Of course you are miles ahead of it, being around 6.3.beta1 or even later.

dimpase commented 10 years ago
comment:20

Replying to @brettpim:

To look at the new code for this patch, I updated master and develop. Then I tried

git checkout u/ncohen/16295
git pull --ff-only trac u/ncohen/16295

how about doing this without --ff-only ?

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:21

Replying to @dimpase:

Replying to @brettpim:

As suggestedin the developers guide, I just ran

$git checkout master

Why do you need this? This brings you to the stable Sage release (6.2). Of course you are miles ahead of it, being around 6.3.beta1 or even later.

I keep a current stable release branch for reference and stability if I need it. I think I have done something to it.

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:22

Replying to @dimpase:

Replying to @brettpim:

To look at the new code for this patch, I updated master and develop. Then I tried

git checkout u/ncohen/16295
git pull --ff-only trac u/ncohen/16295

how about doing this without --ff-only ?

Ok I tried

$ git checkout u/ncohen/16295 
Switched to branch 'u/ncohen/16295'
$ git pull trac u/ncohen/16295
From trac.sagemath.org:sage
 * branch            u/ncohen/16295 -> FETCH_HEAD
Auto-merging src/sage/schemes/hyperelliptic_curves/monsky_washnitzer.py
Auto-merging src/sage/combinat/designs/orthogonal_arrays.py
CONFLICT (content): Merge conflict in src/sage/combinat/designs/orthogonal_arrays.py
Auto-merging src/sage/combinat/designs/latin_squares.py
Auto-merging src/sage/combinat/designs/designs_pyx.pyx
CONFLICT (add/add): Merge conflict in src/sage/combinat/designs/designs_pyx.pyx
Auto-merging src/sage/combinat/designs/block_design.py
Auto-merging src/module_list.py
Automatic merge failed; fix conflicts and then commit the result.

Have I somehow changed the files? I certainly did not intend to.

brett

dimpase commented 10 years ago
comment:23

Replying to @brettpim:

as you see switching to the branch master rather than already on branch master, and you're ahead of master by a zillion commits, you are most probably on develop, or on some other branch you got onto while reviewing a ticket.

dimpase commented 10 years ago
comment:24

Replying to @brettpim:

Replying to @dimpase:

Replying to @brettpim:

To look at the new code for this patch, I updated master and develop. Then I tried

git checkout u/ncohen/16295
git pull --ff-only trac u/ncohen/16295

how about doing this without --ff-only ?

Ok I tried

$ git checkout u/ncohen/16295 
Switched to branch 'u/ncohen/16295'
$ git pull trac u/ncohen/16295
From trac.sagemath.org:sage
 * branch            u/ncohen/16295 -> FETCH_HEAD
Auto-merging src/sage/schemes/hyperelliptic_curves/monsky_washnitzer.py
Auto-merging src/sage/combinat/designs/orthogonal_arrays.py
CONFLICT (content): Merge conflict in src/sage/combinat/designs/orthogonal_arrays.py
Auto-merging src/sage/combinat/designs/latin_squares.py
Auto-merging src/sage/combinat/designs/designs_pyx.pyx
CONFLICT (add/add): Merge conflict in src/sage/combinat/designs/designs_pyx.pyx
Auto-merging src/sage/combinat/designs/block_design.py
Auto-merging src/module_list.py
Automatic merge failed; fix conflicts and then commit the result.

Have I somehow changed the files? I certainly did not intend to.

are you updating from develop? Or master? On the other hand I just tried git trac pull 12345 (I am using git-trac-command, which I would recommend!) it is worked.

Perhaps you should try checking out develop and then trying again...

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:25

Guys guys, all this is happening because I rewrote history here. The only thing to do brett is to checkout my branch, or to checkout "develop" and then to pull my branch.

The copy of my branch that you have on your computer has been abandonned. If you do a checkout on it THEN pull my branch then you are trying to apply the same modifications twice, and it fails.

If you can delete the copy of my branch that you have on my computer, do it.

It is a bit hard to debug without having both hands on your keyboard, though :-/

Nathann

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:26

Everything is good except I am still having trouble building the documentation

$sage -docbuild reference html
.
.
.
c_functions                                    writing output... [ 99%] tableaux
                                               writing output... [100%] words
[combinat ] None:38: WARNING: citation not found: Schiffmann
Error building the documentation.
Traceback (most recent call last):
  File "/home/brett/Projects/SAGE/sage/src/doc/common/builder.py", line 1477, in <module>
    getattr(get_builder(name), type)()
  File "/home/brett/Projects/SAGE/sage/src/doc/common/builder.py", line 487, in _wrapper
    x.get(99999)
  File "/home/brett/Projects/SAGE/sage/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
OSError: [combinat ] None:38: WARNING: citation not found: Schiffmann

Do you get this error too Nathann?

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:27

Helloooooo Brett !!

No I don' get this error and apparently it does not come from the designs code but somewhere else.. I think you should run "make doc-clean && make" which will re-build the doc. Takes some time, but not tooo long either. Sorry, I think you hit a lot of walls on this tickets ^^;

Nathann

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:28

Replying to @nathanncohen:

Helloooooo Brett !!

No I don' get this error and apparently it does not come from the designs code but somewhere else.. I think you should run "make doc-clean && make" which will re-build the doc. Takes some time, but not tooo long either. Sorry, I think you hit a lot of walls on this tickets ^^;

Nathann

It is not a problem. The coding part of this patch is clear and easy so this is a good ticket to hit the other kinds of problems that might show up when reviewing.

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:29

make doc-clean && make worked fine. Does this build both html and pdf documentation?

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:30

I think it only build the html one. "make doc-pdf" builds the pdf version :-)

Nathann

83660e46-0051-498b-a8c1-f7a7bd232b5a commented 10 years ago
comment:31
    # Filling times_n
    for i in range(n+1):
        times_n[i] = i*n

Ahem, you care for speed? ;-)

(And there's no times_n2[] either.)

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:32

Ahem, you care for speed? ;-)

Yeah. And it turns out that I am totally incompetent.

When caching the multiplication

sage: OA=designs.orthogonal_array(None,2**7,check=False)
sage: %timeit is_orthogonal_array(OA,2**7+1,2**7,2)                    
1 loops, best of 3: 618 ms per loop

When you don't

sage: OA=designs.orthogonal_array(None,2**7,check=False)
sage: %timeit is_orthogonal_array(OA,2**7+1,2**7,2)                    
1 loops, best of 3: 375 ms per loop

.......................................................

(And there's no times_n2[] either.)

times_n2 ? What for ?

Anyway, this caching was the only thing keeping me from implementing this for all t, so now I will be able to implement it generally now. Cool -_-

Nathann

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:33

Does this data show that it is faster to simply do the multiplications?

83660e46-0051-498b-a8c1-f7a7bd232b5a commented 10 years ago
comment:34

Replying to @brettpim:

Does this data show that it is faster to simply do the multiplications?

In general depends on your CPU and your (C) compiler. (And in conjunction with Cython, the kind of code the latter generates, i.e., whether the C compiler can "guess" what's meant or whether the code is to obfuscated.)

83660e46-0051-498b-a8c1-f7a7bd232b5a commented 10 years ago
comment:35

Replying to @nathanncohen:

times_n2 ? What for ?

It's used twice, in the outer loop and the first inner loop:

    for i in range(k): # For any column C1
        C1 = OAc+i*n2
        for j in range(i+1,k): # For any column C2 > C1
            C2 = OAc+j*n2
            ...

One would usually not look up multiples of n^2 (or n in the earlier case), but do

    C1 = OAc
    i = 0
    while i < k:

        ... # similar for C2, j

        C1 += n2
        i++

In C, you'd presumably drop j (and perhaps also i) altogether, and just compare the pointers to the address of the (current sub-)array end.

(And inlining the bitset operations would of course make it even faster. Afterwards special-case for n=2m ... :-) )

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:36

It's used twice, in the outer loop and the first inner loop:

Come on... This is not the bottleneck :-P

(And inlining the bitset operations would of course make it even faster. Afterwards special-case for n=2m ... :-) )

They may be inlined already ! I didn't check though.

Anyway. Let's remove this useless caching, and then I will go sleep a bit. I couldn't do this earlier because my laptop is totally breaking and my screen refused to turn on T_T

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2414b81trac #16295: Undoing useless """optimizations"""
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from b9f8b03 to 2414b81

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:39

Would anybody have some time for this ticket ? :-P

Nathann

65d8ac20-a56f-4d80-91b0-a0e64981c5c9 commented 10 years ago
comment:40

Nathann

I am in Finland visiting a colleague. My plan is to produce a patch on top of this that implements Vincent's code for "any" and does some small work on English and once I commit it and if you are OK with my changes we can finish with this patch. I am sorry I am working in spurts on these sage patches, but I am trying to make the most of my time with my colleague here. It will be soon.

regards brett

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:41

Hello !

I am in Finland visiting a colleague.

Okayyyyyyyyy.... !

My plan is to produce a patch on top of this that implements Vincent's code for "any"

This is already done, I updated the code last time and there is no "any" anymore.

and does some small work on English

Okay.

and once I commit it and if you are OK with my changes we can finish with this patch. I am sorry I am working in spurts on these sage patches, but I am trying to make the most of my time with my colleague here. It will be soon.

Got it ! Have fun there.

Nathann

83660e46-0051-498b-a8c1-f7a7bd232b5a commented 10 years ago
comment:42

Replying to @brettpim:

My plan is to produce a patch on top of this that [...] does some small work on English and once I commit it and if you are OK with my changes we can finish with this patch.

Yes, there are some typos, and the first sentence of the docstring refers to M while the parameter's name is OA.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f1c7fa1trac #16295: Typo
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 2414b81 to f1c7fa1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

451d359trac #16295: Merged with 6.3.beta2
31e84b8trac #16295: todo note
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from f1c7fa1 to 31e84b8

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago
comment:45

Okay guys... Well it has been two weeks, so if the only problem of this patch is a typo perhaps it can be dealt with ?... There are several branches depending on this one...

Nathann