Closed: videlec closed this 8 years ago
I believe we should rephrase this ticket as "extract the `IntegerListsLex` iterator as a standalone tool that depends on nothing but Python/Cython". In fact this could go as far as making it a standalone library in e.g. C++.
We want to keep the parent to model the set itself, ask questions like cardinality or building the polyhedron, do constructions on top of it (e.g. use it as indexing set for a vector space), etc.
Being able to specify an element constructor is a useful feature as well. What we need to discuss here is whether we want to switch to using lists (or tuples!) by default.
To remove `__classcall__` we need to wait until the end of the deprecation period. To remove `global_options` we need to wait for the subclasses using it to be refactored to not impose this burden on `IntegerListsLex`.
We want to keep the parent to model the set itself, ask questions like cardinality or building the polyhedron, do constructions on top of it (e.g. use it as indexing set for a vector space), etc.
The 'Lex' there seems a bit too much for the mathematical object that you want to represent. You describe things that could be a method of an 'IntegerLists' object (or more specifically methods of 'Compositions' or 'Partitions').
- Being able to specify an element constructor is a useful feature as well. What we need to discuss here is whether we want to switch to using lists (or tuples!) by default.
There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in `IntegerListsLex` an `__iter__` that wraps every element returned by that iterator with
Nathann
Replying to @nathanncohen:
There should be a way to enumerate these objects without paying this cost, however. A way to have both is to implement the iterator to return a copy of the current list, or a tuple (or even the current list itself, with big 'read only' warnings), and then implement in `IntegerListsLex` an `__iter__` that wraps every element returned by that iterator with.
That's also what I have in mind: a low-level class, designed to be clean and fast, implemented in Cython without overhead. And then a class on top of that which can implement whatever extra Python features you want.
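The two-layer design discussed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code on the branch: `raw_integer_lists` and `IntegerListsFrontEnd` are hypothetical names, the back-end handles only fixed-length lists of non-negative integers with a given sum, and tuples are used as the raw output so that callers cannot mutate iterator state.

```python
import math


def raw_integer_lists(n, length):
    """Hypothetical low-level back-end: yield tuples of `length`
    non-negative integers summing to `n`, largest first in
    lexicographic order. No element wrapping happens here."""
    if length == 0:
        if n == 0:
            yield ()
        return
    for first in range(n, -1, -1):
        for rest in raw_integer_lists(n - first, length - 1):
            yield (first,) + rest


class IntegerListsFrontEnd:
    """Hypothetical front-end: models the set itself and wraps each
    raw tuple with a user-supplied element constructor."""

    def __init__(self, n, length, element_constructor=tuple):
        self.n = n
        self.length = length
        self.element_constructor = element_constructor

    def __iter__(self):
        # The wrapping cost is paid only here, not in the back-end.
        for raw in raw_integer_lists(self.n, self.length):
            yield self.element_constructor(raw)

    def cardinality(self):
        # Number of weak compositions of n into `length` parts.
        if self.length == 0:
            return 1 if self.n == 0 else 0
        return math.comb(self.n + self.length - 1, self.length - 1)
```

Code that only needs fast enumeration can call `raw_integer_lists` directly, while code that needs a richer element type iterates over the front-end, for example `IntegerListsFrontEnd(2, 2, element_constructor=list)`.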
Branch: u/jdemeyer/ticket/18109
Commit: 8d5ca55
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
8d5ca55 | Restructure IntegerListsLex code |
Author: Jeroen Demeyer
Description changed:
---
+++
@@ -1,4 +1,3 @@
-There are several useless features in `IntegerListLex` as already mentioned in [ticket #17979:comment 21](https://github.com/sagemath/sage/issues/17979#comment:291):
-- it should not inherit from `Parent` and it should generate lists and not `ClonableArray`
-- `global_options` is a useless attribute
-- the `__clascall__` is here for nothing
+Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
+
+Attached branch is very much work in progress (although comments on the *design* are welcome).
Description changed:
---
+++
@@ -1,3 +1,5 @@
Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
+This ticket will not change anything to the public interface of `IntegerListsLex`.
+
Attached branch is very much work in progress (although comments on the *design* are welcome).
Hi Jeroen,
As a matter of fact, if you isolate the commit that just moves a file, the diff looks much nicer. I am not able to get something reasonable with the options `-B`, `-C`, `-M` or `-D`. Do you know what to do?
Vincent
Replying to @videlec:
Hi Jeroen,
As a matter of fact, if you isolate the commit that just moves a file, the diff looks much nicer. I am not able to get something reasonable with the options `-B`, `-C`, `-M` or `-D`. Do you know what to do?
Sorry, no. I don't know how to nicely show diffs which split up a file in two (interestingly, `hg` has better support for this!)
Dependencies: #18181
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
9fbbd3c | Restructure IntegerListsLex code |
A comment on the design... Should we really support +Infinity in the iterator? I would go for `unsigned int` variables and `UINT_MAX` as a synonym for Infinity. Of course it makes sense for the higher classes (e.g. to give a proper answer to `.cardinality()`).
Replying to @videlec:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for `unsigned int` variables and `UINT_MAX` as a synonym for Infinity.
And not support `IntegerLists(10^100)`?
Replying to @jdemeyer:
Replying to @videlec:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for `unsigned int` variables and `UINT_MAX` as a synonym for Infinity.
And not support `IntegerLists(10^100)`?
I said for the iterator. Not for the main class. I would not bother if `iter(IntegerLists(10^100))` just failed. It should be very fast for small entries. I guess that one option would be to use Nathann's strategy in #18137 with a fused Cython type (here `unsigned int` and `mpz_t`). But I remember that it was nearly impossible to make it work as attributes of an extension class.
Replying to @videlec:
But I remember that it was nearly impossible to make it work as attributes of an extension class.
For attributes of an extension class, no. I guess you could have two classes (one for some C type and one for `mpz_t`) on top of a common base class. But I have never done this.
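The "two classes on top of a common base class" idea can be sketched in plain Python to show the shape of the dispatch. Everything here is an assumption made for illustration: the class names, the `choose_backend` factory, and the 32-bit value of `UINT_MAX` are hypothetical, and a real implementation would be Cython extension types holding `unsigned int` or `mpz_t` attributes rather than Python objects.

```python
import math

# Assumption: a 32-bit unsigned int, so UINT_MAX doubles as "infinity".
UINT_MAX = 2**32 - 1


class BaseBackend:
    """Common interface: a target sum n and an upper bound on parts."""

    def __init__(self, n, max_part=math.inf):
        self.n = n
        self.max_part = self.encode_bound(max_part)

    def encode_bound(self, bound):
        raise NotImplementedError

    def largest_part(self):
        # Largest value a single part can take.
        return min(self.n, self.max_part)


class MachineBackend(BaseBackend):
    """Stands in for the fast variant using a fixed-width C type."""

    def encode_bound(self, bound):
        # Any bound that does not fit is stored as the sentinel.
        if bound == math.inf or bound >= UINT_MAX:
            return UINT_MAX
        return int(bound)


class BigBackend(BaseBackend):
    """Stands in for the arbitrary-precision (mpz_t-like) variant."""

    def encode_bound(self, bound):
        return bound  # either math.inf or a Python int


def choose_backend(n, max_part=math.inf):
    """Pick the fast variant when n fits below the sentinel."""
    if 0 <= n < UINT_MAX:
        return MachineBackend(n, max_part)
    return BigBackend(n, max_part)
```

With this split, small inputs take the fixed-width path, and an input such as `10**100` falls back to the slower variant instead of failing.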
Replying to @videlec:
Replying to @jdemeyer:
Replying to @videlec:
A comment on the design... Should we really support +Infinity in the iterator?
Yes.
I would go for `unsigned int` variables and `UINT_MAX` as a synonym for Infinity.
And not support `IntegerLists(10^100)`?
I said for the iterator. Not for the main class. I would not bother if `iter(IntegerLists(10^100))` just failed. It should be very fast for small entries. I guess that one option would be to use Nathann's strategy in #18137 with a fused Cython type (here `unsigned int` and `mpz_t`).
I think that we really should support `list(IntegerLists(10^100, length=1))` because in Sage, we always support large integers if possible.
In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.
Changed dependencies from #18181 to #18181, #18184
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e39566a | Restructure IntegerListsLex code |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
cda4b75 | Restructure IntegerListsLex code |
Replying to @jdemeyer:
I think that we really should support `list(IntegerLists(10^100, length=1))` because in Sage, we always support large integers if possible.
That's part of why I am thinking of C++; then we can just have a templated iterator, and depending on the input we can choose one instantiation or the other.
In any case, changing this is certainly outside the scope of this ticket (it could be done in #18055 or #18056). Here, I just want to reorganize the code without changing the implementation.
Sounds reasonable indeed.
New commits:
cda4b75 | Restructure IntegerListsLex code |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
2b88b64 | Restructure IntegerListsLex code |
Replying to @videlec:
I am not able to get something reasonable with the options `-B`, `-C`, `-M` or `-D`. Do you know what to do?
Actually, the following will show everything as copied from `integer_list.py`:
git show --patience -D -B -C01
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
e67e71c | Restructure IntegerListsLex code |
This now passes all doctests except for the pickle jar.
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
8c587d3 | Restructure IntegerListsLex code |
Passes all doctests and documentation builds.
Description changed:
---
+++
@@ -1,5 +1,3 @@
Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
This ticket will not change anything to the public interface of `IntegerListsLex`.
-
-Attached branch is very much work in progress (although comments on the *design* are welcome).
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
2be6ad0 | Restructure IntegerListsLex code |
Hi Jeroen!
Since you seem to have cythonized the code already, could you add some timings compared to the old code?
Anne
PS: I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.
Replying to @anneschilling:
Since you seem to have cythonized the code already, could you add some timings compared to the old code?
My goal certainly was not to gain speed, but I could check...
I was under the impression that it is harder to debug code in cython, which might make life a little harder for the new features in #18055.
I have no idea really. A lot of the code I write for Sage is Cython and it hasn't bothered me.
On the other hand, I purposely did not Cythonize `IntegerListsLexIter` (it's in a Cython source file, but doesn't use any Cython features). So if it makes your life easier, just move that one class to a Python file.
Replying to @anneschilling:
you seem to have cythonized the code already
Depends what you mean. I moved the files to a Cython source file and I created extension types instead of Python classes. But I didn't optimize the loops for example.
There is a small but significant speed-up (again: this is without really trying to speed up the code).
Before:
sage: timeit('list(IntegerListsLex(14, min_part=1))')
5 loops, best of 3: 1.18 s per loop
After:
sage: timeit('list(IntegerListsLex(14, min_part=1))')
5 loops, best of 3: 986 ms per loop
Before:
sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))')
5 loops, best of 3: 1.11 s per loop
After:
sage: timeit('list(IntegerListsLex(28, length=8, floor=lambda i:i, check=False))')
5 loops, best of 3: 852 ms per loop
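To put numbers on the speed-up, the four timings above can be turned into ratios. This is plain arithmetic on the figures quoted in this comment; the helper names are made up for the sketch.

```python
# (before, after) timings in seconds, copied from the timeit runs above.
timings = {
    "IntegerListsLex(14, min_part=1)": (1.18, 0.986),
    "IntegerListsLex(28, length=8, floor=lambda i:i, check=False)": (1.11, 0.852),
}


def speedup(before, after):
    """How many times faster the new code is."""
    return before / after


def reduction_percent(before, after):
    """Percentage of the running time that was saved."""
    return 100 * (before - after) / before


for call, (before, after) in timings.items():
    print(f"{call}: {speedup(before, after):.2f}x faster, "
          f"{reduction_percent(before, after):.1f}% less time")
```

This works out to roughly 1.2x and 1.3x, i.e. about 16% and 23% less running time for the two examples.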
Nice that there is a slight speedup without even trying hard!
FYI, I rebased this branch on top of sage-6.7.beta0 and everything looks clean.
I noticed that in `nn.py` in `sage.combinat.integer_lists` it currently says
.. WARNING:: this function is likely to disappear in :trac:`17927`.
Briefly looking at #17927, this no longer seems to be the case.
Branch pushed to git repo; I updated commit sha1. New commits:
2188082 | Remove references to #17927 |
In principle this branch looks good to me; I guess it needs to be rebased to the latest development version, however.
Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?
Replying to @anneschilling:
Just to check: are you planning to reuse the Envelope class also for #17920? The planned changes there involving backward smoothing etc will be ok, right?
Sure. The only risk is that errors in `Envelope` will make both implementations wrong.
Split up `IntegerListsLex` in a fast Cython back-end and a (slower) Python front-end. Restructure with multiple implementations (like #17920) in mind.
This ticket will not change anything to the public interface of `IntegerListsLex`.
Depends on #15525
CC: @anneschilling @jdemeyer @nthiery @nathanncohen @bgillesp
Component: combinatorics
Author: Jeroen Demeyer
Branch/Commit: 65025ab
Reviewer: Anne Schilling, Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/18109