sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.32k stars 453 forks source link

Restructure IntegerListLex code #18109

Closed videlec closed 8 years ago

videlec commented 9 years ago

Split up IntegerListsLex in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.

This ticket will not change anything to the public interface of IntegerListsLex.

Depends on #15525

CC: @anneschilling @jdemeyer @nthiery @nathanncohen @bgillesp

Component: combinatorics

Author: Jeroen Demeyer

Branch/Commit: 65025ab

Reviewer: Anne Schilling, Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/18109

nthiery commented 9 years ago
comment:1
6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago
comment:2

We want to keep the parent to model the set itself, ask questions like cardinality or building the polyhedron, do constructions on top of it (e.g. use it as indexing set for a vector space), etc.

The 'Lex' there seems a bit too much for the mathematical object that you want to represent. You describe things that could be a method of an 'IntegerLists' object (or more specifically methods of 'Compositions' or 'Partitions').

  • Being able to specify an element constructor is a useful feature as well. What we need to discuss here is whether we want to switch to using lists (or tuples!) by default.

There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in IntegerListsLex an __iter__ that wraps every element returned by that iterator with .

Nathann

jdemeyer commented 9 years ago
comment:3

Replying to @nathanncohen:

There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in IntegerListsLex an __iter__ that wraps every element returned by that iterator with .

That's also what I have in mind: a low-level class designed to be clean and fast implemented in Cython without overhead. And then a class on top of that which can implement whatever extra Python features that you want.

jdemeyer commented 9 years ago

Branch: u/jdemeyer/ticket/18109

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Commit: 8d5ca55

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8d5ca55Restructure IntegerListsLex code
jdemeyer commented 9 years ago

Author: Jeroen Demeyer

jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,4 +1,3 @@
-There are several useless features in `IntegerListLex` as already mentioned in [ticket #17979:comment 21](https://github.com/sagemath/sage/issues/17979#comment:291):
-- it should not inherit from `Parent` and it should generate lists and not `ClonableArray`
-- `global_options` is a useless attribute
-- the `__clascall__` is here for nothing
+Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
+
+Attached branch is very much work in progress (although comments on the *design* are welcome).
jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,5 @@
 Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.

+This ticket will not change anything to the public interface of `IntegerListsLex`.
+
 Attached branch is very much work in progress (although comments on the *design* are welcome).
videlec commented 9 years ago
comment:9

Hi Jeroen,

As a matter of fact, if you isolate the commit that just move a file, the diff looks much nicer. I am not able to get something reasonable with the options -B, -C, -M or -D. Do you know what to do?

Vincent

jdemeyer commented 9 years ago
comment:10

Replying to @videlec:

Hi Jeroen,

As a matter of fact, if you isolate the commit that just move a file, the diff looks much nicer. I am not able to get something reasonable with the options -B, -C, -M or -D. Do you know what to do?

Sorry no. I don't know how to nicely show diffs which split up a file in two (interestingly, hg has better support for this!)

jdemeyer commented 9 years ago

Dependencies: #18181

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

f648443Move IntegerListsLex._Iter out of IntegerListsLex
d8fd440Restructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 8d5ca55 to d8fd440

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from d8fd440 to 9fbbd3c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

9fbbd3cRestructure IntegerListsLex code
videlec commented 9 years ago
comment:14

A comment on the design... Should we really support +Infinity in the iterator? I would go for unsigned int variables and UINT_MAX as a synonmyous for Infinity. Of course it makes sense for the higher classes (e.g. to give a proper answer to .cardinality()).

jdemeyer commented 9 years ago
comment:15

Replying to @videlec:

A comment on the design... Should we really support +Infinity in the iterator?

Yes.

I would go for unsigned int variables and UINT_MAX as a synonmyous for Infinity.

And not support IntegerLists(10^100)?

videlec commented 9 years ago
comment:16

Replying to @jdemeyer:

Replying to @videlec:

A comment on the design... Should we really support +Infinity in the iterator?

Yes.

I would go for unsigned int variables and UINT_MAX as a synonmyous for Infinity.

And not support IntegerLists(10^100)?

I said for the iterator. Not for the main class. I would not bother if iter(IntegerLists(10^100)) just failed. It should be very fast for small entries. I guess that one option would be to use Nathann strategy in #18137 with fused Cython type (here unsigned int and mpz_t). But I remember that it was nearly impossible to make it work as attributes of an extension class.

jdemeyer commented 9 years ago
comment:17

Replying to @videlec:

But I remember that it was nearly impossible to make it work as attributes of an extension class.

For attributes of an extension class, no. I guess you could have two classes (one for some C type and one for mpz_t) on top of a common base class. But I have never done this.

jdemeyer commented 9 years ago
comment:18

Replying to @videlec:

Replying to @jdemeyer:

Replying to @videlec:

A comment on the design... Should we really support +Infinity in the iterator?

Yes.

I would go for unsigned int variables and UINT_MAX as a synonmyous for Infinity.

And not support IntegerLists(10^100)?

I said for the iterator. Not for the main class. I would not bother if iter(IntegerLists(10^100)) just failed. It should be very fast for small entries. I guess that one option would be to use Nathann strategy in #18137 with fused Cython type (here unsigned int and mpz_t).

I think that we really should support list(IntegerLists(10^100, length=1)) because in Sage, we always support large integers if possible.

In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.

jdemeyer commented 9 years ago

Changed dependencies from #18181 to #18181, #18184

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e39566aRestructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 9fbbd3c to e39566a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

cda4b75Restructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from e39566a to cda4b75

nthiery commented 9 years ago
comment:22

Replying to @jdemeyer:

I think that we really should support list(IntegerLists(10^100, length=1)) because in Sage, we always support large integers if possible.

That's part of why I am thinking of C++; then we can just have a templated iterator, and depending on the input we can choose one instantiation or the other.

In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.

Sounds reasonable indeed.


New commits:

cda4b75Restructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e03140aCombinatorialObject constructor should copy input
e7bc348Merge commit 'e03140a761586bf64f01aceba324abd1eece813b' into HEAD
7f8c40aRestructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from cda4b75 to 7f8c40a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

2b88b64Restructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 7f8c40a to 2b88b64

jdemeyer commented 9 years ago
comment:25

Replying to @videlec:

I am not able to get something reasonable with the options -B, -C, -M or -D. Do you know what to do?

Actually, the following will show everything as copied from integer_list.py:

git show --patience -D -B -C01
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

e67e71cRestructure IntegerListsLex code
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 2b88b64 to e67e71c

jdemeyer commented 9 years ago
comment:27

This now passes all doctests except for the pickle jar.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from e67e71c to 8c587d3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

8c587d3Restructure IntegerListsLex code
jdemeyer commented 9 years ago
comment:29

Passes all doctests and documentation builds.

jdemeyer commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,5 +1,3 @@
 Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.

 This ticket will not change anything to the public interface of `IntegerListsLex`.
-
-Attached branch is very much work in progress (although comments on the *design* are welcome).
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 8c587d3 to 2be6ad0

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

2be6ad0Restructure IntegerListsLex code
anneschilling commented 9 years ago
comment:32

Hi Jeroen!

Since you seem to have cythonized the code already, could you add some timings compared to the old code?

Anne

PS: I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.

jdemeyer commented 9 years ago
comment:33

Replying to @anneschilling:

Since you seem to have cythonized the code already, could you add some timings compared to the old code?

My goal certainly was not to gain speed, but I could check...

I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.

I have no idea really. A lot of the code I write for Sage is Cython and it hasn't bothered me.

On the other hand, I on purpose did not Cythonize IntegerListsLexIter (it's in a Cython source file, but doesn't use any Cython features). So if it makes your life easier, just move that one class to a Python file.

jdemeyer commented 9 years ago
comment:34

Replying to @anneschilling:

you seem to have cythonized the code already

Depends what you mean. I moved the files to a Cython source file and I created extension types instead of Python classes. But I didn't optimize the loops for example.

jdemeyer commented 9 years ago
comment:35

There is a small but significant speed-up (again: this is without really trying to speed up the code).


Before:

sage: timeit('list(IntegerListsLex(14, min_part=1))')
5 loops, best of 3: 1.18 s per loop

After:

sage: timeit('list(IntegerListsLex(14, min_part=1))')
5 loops, best of 3: 986 ms per loop

Before:

sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))')
5 loops, best of 3: 1.11 s per loop

After:

sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))')
5 loops, best of 3: 852 ms per loop
anneschilling commented 9 years ago
comment:36

Nice that there is a slight speedup without even trying hard!

FYI, I rebased this branch on top of sage-6.7.beta0 and everything looks clean.

I noticed that in nn.y in sage.combinat.integer_lists it currently says

    .. WARNING:: this function is likely to disappear in :trac:`17927`.

Briefly looking at 17927 this seems no longer the case.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 2be6ad0 to 2188082

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

2188082Remove references to #17927
anneschilling commented 9 years ago
comment:38

In principle this branch looks good to me; I guess it needs to be rebased to the latest development version, however.

Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?

jdemeyer commented 9 years ago
comment:39

Replying to @anneschilling:

Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?

Sure. The only risk is that errors in Envelope will make both implementations wrong.