seblabbe closed this 9 years ago
Note that the patchbot was not able to build the doc.
Branch pushed to git repo; I updated commit sha1. New commits:
ca84fcf | Trac #18987: Parallel computation of the nb of solutions for tilings |
3d4402b | Trac #18987: Tilings of polyominos modulo 180 degrees rotations |
65f6512 | Trac #18987: Add a transparent gray box to QuantuminoState.show3d |
555f0a9 | Trac #18987: doc + optim in dancing_links |
481d96e | Trac #18987: rename orthogonal_transformation function |
5f66e1d | Trac #18987: Improve explanation of the modpi argument |
d359115 | Trac #18987: Using cosets for when modpi=True |
1e01ba2 | Trac #18987: Change pentaminos to there canonical form |
There was a `TESTS::` with text following on the same line. Fixed that (and squashed it into the last commit). Also added the prefix `Trac #18987:`
to all commit messages.
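For reference, this is the doctest formatting rule involved: a `TESTS::` marker must be followed by a blank line and then indented doctest lines, never by text on the same line. A schematic sketch (placeholder names, not the actual docstring from the ticket):

```
TESTS::

    sage: some_function(some_input)   # placeholder
    expected_output
```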
There is one failing doctest on the patchbot that is not failing on my machine...
Branch pushed to git repo; I updated commit sha1. New commits:
2133ecd | Trac #18987: fixing intermittent doctest failure |
Are you sure `ncube_isometry_group` is worth a `@cached_function`?
In `ncube_isometry_group_modpi`, why are you building a `MatrixGroup`?
In
P_coset = set(frozenset((m.matrix() * self).canonical() for m in coset) for coset in L)
return set(next(iter(s)) for s in P_coset)
you are building many images `matrix * polyomino` only to use one of them at the end. Why not
return set((L[0].matrix() * self).canonical() for coset in L)
And if you were not using matrix groups you could even do
return set((L[0] * self).canonical() for coset in L)
You are using `L` instead of `coset`, I think. Could you edit your previous comment so I am sure I understand what you mean?
Replying to @videlec:
Are you sure `ncube_isometry_group` is worth a `@cached_function`?
Calling this function takes 2.8s on my machine. And it is called once for each polyomino. That is 16 times for the Quantumino puzzle. With the cache, I gain about 40s to construct the rows to give to the dlx solver.
In `ncube_isometry_group_modpi`, why are you building a `MatrixGroup`?
Because otherwise, this
G = ncube_isometry_group(n, orientation_preserving)
H = [h for h in G if all(i==j for (i,j) in h.nonzero_positions())]
left_cosets = set(tuple(sorted(h*g for h in H)) for g in G)
throws a `TypeError: mutable matrices are unhashable`, and I find it more readable this way than calling `.set_immutable()` on every matrix.
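The coset computation above can be mimicked without Sage. Here is a plain-Python sketch (toy size n = 2, hypothetical helper `matmul`) in which matrices are nested tuples, hence hashable, so the frozenset-of-cosets trick works with no `set_immutable()` calls:

```python
from itertools import permutations, product

def matmul(a, b):
    """Multiply two square matrices given as tuples of tuples."""
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

n = 2
# G: all signed permutation matrices (the cube isometries in dimension n)
perm_mats = [tuple(tuple(1 if j == p[i] else 0 for j in range(n))
                   for i in range(n)) for p in permutations(range(n))]
sign_mats = [tuple(tuple(s[i] if i == j else 0 for j in range(n))
                   for i in range(n)) for s in product([1, -1], repeat=n)]
G = [matmul(d, p) for d in sign_mats for p in perm_mats]
# H: the elements of G whose nonzero entries all sit on the diagonal
H = [g for g in G if all(g[i][j] == 0
                         for i in range(n) for j in range(n) if i != j)]
# Cosets H*g, collected as frozensets just as in the ticket's code
left_cosets = set(frozenset(matmul(h, g) for h in H) for g in G)

assert len(G) == 8            # 2^2 * 2! signed permutation matrices
assert len(H) == 4
assert len(left_cosets) == 2  # the index |G| / |H|
```

Since tuples of tuples hash and compare structurally, no immutability bookkeeping is needed, which is exactly the convenience the comment above describes.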
You are building many images `matrix * polyomino` only to use one of them at the end. Why not ...
I know but I need all of them. That is what I understood yesterday. Let me try to explain. In general it is okay to take only one matrix in the coset. The problem comes when the polyomino is invariant under some of the 24 orientation preserving isometries of the cube. For example, consider:
sage: from sage.games.quantumino import pentaminos
sage: from sage.combinat.tiling import ncube_isometry_group
sage: p = pentaminos[7]
sage: m = ncube_isometry_group(3)[-3]
sage: p
Polyomino: [(0, 0, 0), (0, 1, 0), (0, 2, 0), (0, 2, 1), (1, 0, 0)], Color: orange
sage: m
[ 0 0 1]
[ 0 -1 0]
[ 1 0 0]
sage: (m*p).canonical() == p
True
The polyomino `p` has 12 distinct rotation images instead of 24. And among the 12 ways of placing that polyomino into a box, there are 3 distinct ways up to rotations that leave the box invariant. To compute this, we need to consider the whole coset:
sage: cosets = ncube_isometry_group_modpi(3)
sage: P_coset = set(frozenset((m.matrix() * p).canonical() for m in coset) for coset in cosets)
sage: len(P_coset)
3
sage: len(set(next(iter(s)) for s in P_coset))
3
Otherwise, you obtain too many polyominos (see below):
sage: set((coset[0].matrix() * p).canonical() for coset in cosets)
{Polyomino: [(0, 0, 1), (1, 0, 1), (1, 1, 1), (1, 2, 0), (1, 2, 1)], Color: orange,
Polyomino: [(0, 1, 2), (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 1, 2)], Color: orange,
Polyomino: [(0, 0, 0), (0, 1, 0), (1, 1, 0), (2, 1, 0), (2, 1, 1)], Color: orange,
Polyomino: [(0, 1, 0), (0, 1, 1), (1, 1, 1), (2, 0, 1), (2, 1, 1)], Color: orange,
Polyomino: [(0, 2, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 2, 0)], Color: orange}
Replying to @seblabbe:
Replying to @videlec:
Are you sure `ncube_isometry_group` is worth a `@cached_function`?

Calling this function takes 2.8s on my machine. And it is called once for each polyomino, that is, 16 times for the Quantumino puzzle. With the cache, I gain about 40s when constructing the rows to give to the dlx solver.
Perhaps the way it is implemented is wrong. For example:
sage: %timeit P = [s.matrix() for s in SymmetricGroup(4)]
1 loops, best of 3: 1.27 ms per loop
and with the signs
sage: from itertools import product
sage: %timeit M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
1 loops, best of 3: 2.9 ms per loop
sage: M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
sage: %timeit S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
100 loops, best of 3: 18.3 ms per loop
So it is likely to be ~20ms.
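For context, the group being built here is the group of signed permutation matrices, of order 2^n · n!. A plain-Python sketch (no Sage required) of the same `diagonal_matrix * permutation_matrix` construction, showing that 384 distinct matrices come out for n = 4:

```python
from itertools import permutations, product
from math import factorial

def signed_permutation_matrices(n):
    """Yield all n x n signed permutation matrices as tuples of tuples,
    i.e. a diagonal matrix of signs times a permutation matrix."""
    for p in permutations(range(n)):
        for signs in product([1, -1], repeat=n):
            yield tuple(tuple(signs[i] if j == p[i] else 0
                              for j in range(n)) for i in range(n))

mats = list(signed_permutation_matrices(4))
assert len(mats) == 2**4 * factorial(4)  # 384 matrices
assert len(set(mats)) == len(mats)       # and they are pairwise distinct
```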
Branch pushed to git repo; I updated commit sha1. New commits:
0d68ece | Trac #18987: removed a cached_function |
Replying to @videlec:
Are you sure `ncube_isometry_group` is worth a `@cached_function`?
In fact, you made me realize that there is already caching involved in `WeylGroup`, so caching that function is not really necessary. Without caching, I get:
sage: from sage.combinat.tiling import ncube_isometry_group
sage: time L = ncube_isometry_group(4)
CPU times: user 1.14 s, sys: 19.7 ms, total: 1.16 s
Wall time: 1.3 s
sage: time L = ncube_isometry_group(4)
CPU times: user 358 ms, sys: 4.01 ms, total: 362 ms
Wall time: 448 ms
sage: from itertools import product
sage: %timeit M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
1 loops, best of 3: 2.9 ms per loop
sage: M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
sage: %timeit S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
100 loops, best of 3: 18.3 ms per loop
So it is likely to be ~20ms.
Note that you can't use timeit above since there is caching involved in SymmetricGroup.
For comparison, your solution on my machine gives the following timings:
sage: from itertools import product
sage: # first call
sage: time M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
CPU times: user 11.4 ms, sys: 92 µs, total: 11.5 ms
Wall time: 11.9 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 191 ms, sys: 25.2 ms, total: 216 ms
Wall time: 667 ms
sage: # second call
sage: time M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
CPU times: user 11.7 ms, sys: 1.34 ms, total: 13 ms
Wall time: 16.6 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 108 ms, sys: 1.92 ms, total: 110 ms
Wall time: 114 ms
So it seems using `WeylGroup(['B',4])` is about 4 times slower than your solution. But I don't know if I will change it; that improvement could be done directly in the `WeylGroup` code...
Replying to @seblabbe:
sage: from itertools import product
sage: %timeit M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
1 loops, best of 3: 2.9 ms per loop
sage: M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
sage: %timeit S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
100 loops, best of 3: 18.3 ms per loop

So it is likely to be ~20ms. Note that you can't use timeit above since there is caching involved in SymmetricGroup.
Where?! Beyond the construction of the group (`SymmetricGroup(4) is SymmetricGroup(4)` gives `True`) nothing is cached.
Where?! Beyond the construction of the group (`SymmetricGroup(4) is SymmetricGroup(4)` gives `True`) nothing is cached.
Don't you get that the first execution of
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
is slower than the second? And that the second takes about the same time as the third, the fourth, etc.? Here is what I get:
sage: from itertools import product
sage: M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 184 ms, sys: 18 ms, total: 202 ms
Wall time: 710 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 109 ms, sys: 2.05 ms, total: 111 ms
Wall time: 166 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 109 ms, sys: 2.6 ms, total: 111 ms
Wall time: 113 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 110 ms, sys: 2.12 ms, total: 112 ms
Wall time: 169 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 108 ms, sys: 2.44 ms, total: 111 ms
Wall time: 114 ms
Yes for the timing. But this does not imply that something is cached. Actually, the reason is that gap is launched (I do not know why):
sage: S = SymmetricGroup(4)
sage: time gap(3)
CPU times: user 16 ms, sys: 12 ms, total: 28 ms
Wall time: 295 ms
3
sage: from itertools import product
sage: M = [diagonal_matrix(p) for p in product([1,-1], repeat=4)]
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 56 ms, sys: 0 ns, total: 56 ms
Wall time: 56.9 ms
sage: time S = [m*s.matrix() for m in M for s in SymmetricGroup(4)]
CPU times: user 56 ms, sys: 0 ns, total: 56 ms
Wall time: 53 ms
Caching `ncube_isometry_group` gives:
sage: from sage.games.quantumino import QuantuminoSolver
sage: q = QuantuminoSolver(0)
sage: t = q.tiling_solver()
sage: time rows = t.rows()
CPU times: user 18.8 s, sys: 168 ms, total: 19 s
Wall time: 19 s
Not caching `ncube_isometry_group` gives:
sage: from sage.games.quantumino import QuantuminoSolver
sage: q = QuantuminoSolver(0)
sage: t = q.tiling_solver()
sage: time rows = t.rows()
CPU times: user 19.4 s, sys: 288 ms, total: 19.7 s
Wall time: 20 s
So I confirm that I don't lose much time by not caching.
Replying to @videlec:
Perhaps the way it is implemented is wrong. For example
sage: %timeit P = [s.matrix() for s in SymmetricGroup(4)]
1 loops, best of 3: 1.27 ms per loop
I removed the `cached_method`. Do you want me to replace the code based on `WeylGroup(['B',n])` by the `SymmetricGroup` + diagonal matrix of signs code you propose? I think that this improvement should be done in `WeylGroup` in another ticket.
Also, regarding timing improvements in the file `tiling.py`, there is another, more important thing to spend time on: a lot of time is spent creating vectors from tuples (because coordinates are stored as tuples). This should be done in another ticket.
Needs review!
Replying to @seblabbe:
Replying to @videlec:
Perhaps the way it is implemented is wrong. For example
sage: %timeit P = [s.matrix() for s in SymmetricGroup(4)]
1 loops, best of 3: 1.27 ms per loop
I removed the `cached_method`. Do you want me to replace the code based on `WeylGroup(['B',n])` by the `SymmetricGroup` + diagonal matrix of signs code you propose? I think that this improvement should be done in `WeylGroup` in another ticket.
Nope. I also think that the `WeylGroup` code has to be improved.
Also, regarding timing improvements in the file `tiling.py`, there is another, more important thing to spend time on: a lot of time is spent creating vectors from tuples (because coordinates are stored as tuples). This should be done in another ticket.
Did you notice that

sage: t = map(vector, t)

is much slower than

sage: V = FreeModule(ZZ, 12)
sage: t = map(V, t)

(I measure a factor of x8 in `__sub__` and `__add__`, for example.)
Moreover, everything would be faster if you stored integer vectors instead of tuples (and an attribute `self._free_module`). Why aren't you doing that?
I still do not understand why you are parallelizing the polyomino solver code and not the DLX one. Isn't your strategy exactly equivalent to this one: pick a column, and for each row that has a 1 in this column, remove all the columns occupied by the piece and launch an independent process?
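The splitting strategy described here can be sketched in plain Python with a brute-force exact-cover counter (a toy instance with made-up rows; Sage's dancing links implementation is of course far more efficient):

```python
from itertools import combinations

def exact_covers(rows, columns):
    """Brute force: all tuples of row indices whose rows are pairwise
    disjoint and together cover `columns` exactly."""
    columns = frozenset(columns)
    sols = []
    for r in range(len(rows) + 1):
        for combo in combinations(range(len(rows)), r):
            cells = [c for i in combo for c in rows[i]]
            if len(cells) == len(set(cells)) and set(cells) == columns:
                sols.append(combo)
    return sols

rows = [[0, 1], [1, 2], [2, 3], [0, 3], [0], [3]]
all_cols = {0, 1, 2, 3}
total = len(exact_covers(rows, all_cols))

# Split on column 0: every solution uses exactly one row containing 0, so
# fixing that row yields independent subproblems whose counts sum to the
# total -- the idea behind parallelizing the dancing links count.
split = 0
for i, fixed in enumerate(rows):
    if 0 in fixed:
        sub = [r for j, r in enumerate(rows)
               if j != i and not set(r) & set(fixed)]
        split += len(exact_covers(sub, all_cols - set(fixed)))

assert split == total == 3
```

Each subproblem is a smaller exact-cover instance over the remaining columns, so the subproblems can run in separate processes and their counts simply be added.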
Moreover, everything would be faster if you stored integer vectors instead of tuples (and an attribute `self._free_module`). Why aren't you doing that?
Because I wanted to use the most basic hashable container (tuple) for the job. I realized only recently that this is slow, because many matrix operations and additions are involved. So indeed, storing vectors would be better.
Because it is not the purpose of this ticket.
Replying to @videlec:
I still do not understand why you are parallelizing the polyomino solver code and not the DLX one. Isn't your strategy exactly equivalent to this one: pick a column, and for each row that has a 1 in this column, remove all the columns occupied by the piece and launch an independent process?
Okay, I will work on a new commit for that.
Replying to @seblabbe:
Moreover, everything would be faster if you stored integer vectors instead of tuples (and an attribute `self._free_module`). Why aren't you doing that?
Because I wanted to use the most basic hashable container (tuple) for the job. I realized only recently that this is slow, because many matrix operations and additions are involved. So indeed, storing vectors would be better.
Because it is not the purpose of this ticket.
I opened #19036
Branch pushed to git repo; I updated commit sha1. New commits:
df65587 | Trac #18987: Moved the paralel computation of nb of sol in dancing links module |
(I will not have access to a machine with Sage for the next 7 days).
Ok, Vincent. I am now back and available to do the follow up on this ticket.
With the new proposal you made, this ticket now consists of two independent things. Do you prefer me to split the code into two tickets (easier to review), or is it ok like this?
Sébastien
I finally decided to split this ticket into two to ease the review. The second part is now available at #19107.
Description changed:
---
+++
@@ -2,7 +2,7 @@
 sage: from sage.games.quantumino import QuantuminoSolver
-sage: QuantuminoSolver(0).number_of_solutions() # long time (about 30 days)
+sage: QuantuminoSolver(0).number_of_solutions() # long time (several days)
but we can make it faster by doing the computation in parallel... This ticket does this.
Description changed:
---
+++
@@ -5,6 +5,4 @@
sage: QuantuminoSolver(0).number_of_solutions() # long time (several days)
-It also simplify the computation by avoiding to compute 4 times the same solution (obtained by rotation of angle pi of the 5x8x2 box).
+but we can make it faster by doing the computation in parallel... This ticket does this (directly in the dancing links code).
(solving) independant (cases) -> independent
If possible, a good choice of column gives a partition of solutions where each part has about the same number of solutions.
Use an `OUTPUT` section instead of `dict, row number -> list of rows`.
After the split each subproblem has the same number of columns and rows and the same solutions as above::
it is the union of the solutions of the subproblems that forms the solutions of the initial one.
- I do not see the point of having `number_of_solutions_iterator` and `number_of_solutions`. You should just get rid of the first one and allow parallelization in the second.

Branch pushed to git repo; I updated commit sha1. New commits:
29ff986 | Trac #18987: Fixed reviewer comment 65 |
Thanks for your review! I fixed the first comments.
- I do not see the point of having `number_of_solutions_iterator` and `number_of_solutions`. You should just get rid of the first one and allow parallelization in the second.
- `number_of_solutions_iterator` is now `_number_of_solutions_iterator`
- `number_of_solutions` now allows parallel computation

I kept `_number_of_solutions_iterator` because, for the problem I am currently looking at, `number_of_solutions` takes days while `_number_of_solutions_iterator` yields something every hour or so. It allows me to follow the computation, make sure it is not stuck, and estimate the time left. I prefer this to removing the method `_number_of_solutions_iterator` and adding some verbose option in `number_of_solutions`.
- Why not also make parallelization available for getting the list of solutions?
Because I don't know how or if it is possible to use the parallel decorator to merge iterators.
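The count-merging case is easier because partial counts just add up. A portable sketch of that pattern (a thread pool and a stand-in counting function; Sage's `@parallel` decorator is fork-based and works differently):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def count_solutions(subproblem):
    # stand-in for counting the solutions of one dancing links subproblem
    return sum(subproblem)

subproblems = [[1, 2], [3, 4], [5]]
total = 0
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(count_solutions, s) for s in subproblems]
    # partial counts arrive as workers finish, so a long computation can be
    # monitored well before the final total is known
    for fut in as_completed(futures):
        total += fut.result()

assert total == 15
```

Merging *iterators* of solutions, rather than counts, would require streaming results back from each worker, which is indeed harder to express with a simple parallel-map decorator.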
Salut Sebastien,
Thanks for your patience. I am sure that the choice of splitting is suboptimal but at least the design is much cleaner than in the first version.
Vincent
Changed branch from public/18987 to 29ff986
The following computation takes a lot of time:
but we can make it faster by doing the computation in parallel... This ticket does this (directly in the dancing links code).
Component: combinatorics
Author: Sébastien Labbé
Branch/Commit: 29ff986
Reviewer: Vincent Delecroix
Issue created by migration from https://trac.sagemath.org/ticket/18987