Closed hivert closed 12 years ago
Before
sage: %timeit ZZ in Rings()
625 loops, best of 3: 1.13 µs per loop
After the preliminary patch
sage: %timeit ZZ in Rings()
625 loops, best of 3: 448 ns per loops
I'm finishing the patch, I've added a large optimization for the other branch
where there is no __classcall__
Here are the results in the basis case (no __classcall__
)::
sage: class Rien(object):
... pass
sage: from sage.misc.classcall_metaclass import ClasscallMetaclass
sage: class NOCALL(object):
... __metaclass__ = ClasscallMetaclass
... pass
Standard Python class:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.7 ms per loop
Currently:
sage: %timeit [NOCALL() for i in range(10000)]
25 loops, best of 3: 15.9 ms per loop
Cython version:
sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 3.59 ms per loop
Cython optimized version:
sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 1.76 ms per loop
So I'm 5% only slower than normal class.
Here are the results in the basis case (no __classcall__
)::
sage: class CALL(object):
... __metaclass__ = ClasscallMetaclass
... @staticmethod
... def __classcall_private__(cls, arg):
... arg = arg + arg
... return arg
Currently:
sage: %timeit [CALL(i) for i in range(10000)]
25 loops, best of 3: 8.7 ms per loop
Cython version:
sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.33 ms per loop
Cython optimized version (no mesurable difference from the preceding):
sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.34 ms per loop
Here I'm twice as fast as before.
I'm cleanup the patch (doctests) an will post it shortly.
Sorry, I can not do any serious work (reviewing), as I am in the middle of holidays. But it looks promising!
Hi Florent!
I know the patch is preliminary, but here are two comments:
In the new file classcall_metaclass_cy.pyx, it is not explained what __classcall__
and __classcall_private__
do, and when to implement which. Actually, I have never used __classcall_private__
myself.
In the old python version of classcall_metaclass, you move __call__
to __call__disable
. Why do you not completely remove the slow Python call method?
Also, why not going all the way and remove the python version completely? Or at least: Why not cythonizing (and caching?) the __contains__
method?
Or ideally, one could try to provide a generalised metaclass framework in Sage. Namely, one could think of having different small metaclasses, each providing a particular feature. For example:
NestedMetaclass
.__classcall__
(perhaps resulting in a unique parent structure or in a dynamic class).And if we have different "small" metaclasses then it would be useful to combine them. But that's a problem for Python: If A, B and C have different metaclasses, you can not simply have a class definition like
class D(A,B,C): pass
This is (I guess) why ClasscallMetaclass
inherits from NestedMetaclass
. But Ticket #11794 provides examples for a different and more skalable solution: Make metaclasses dynamic!
So, what I mean by a generalised metaclass framework in Sage is:
SageMetaclass
, from which all metaclasses (such as NestedMetaclass
and ClasscallMetaclass
and FastHashMetaclass
) are derived.SageMetaclass
) can be used as base classes for a new class. In that way, the features provided by metaclasses could be freely combined.But I guess that's a topic for a different ticket...
PS: If that works, one could also get rid of the DynamicClasscallMetaclass
, which seems like a hack that became necessary since one can not freely combine base classes with different metaclasses.
Hi Simon,
Thanks for all those remarks and sorry for not having finished this one as fast as I wanted.
Replying to @simon-king-jena:
Also, why not going all the way and remove the python version completely? Or at least: Why not cythonizing (and caching?) the
__contains__
method?
Obviously. Please see my patch on sage combinat's queue which already adresses all those remarks. I'm currently doing intensive timing to be sure that we have the fastest way.
Florent
Replying to @simon-king-jena:
This is (I guess) why
ClasscallMetaclass
inherits fromNestedMetaclass
. But Ticket #11794 provides examples for a different and more skalable solution: Make metaclasses dynamic!
Yes it is. We discussed last week with Nicolas of switching the inheritance
order here. That is using only NestedMetaclass
for Categories. I still have
to check that this doesn't break pickling of Parent with a nested Element.
So, what I mean by a generalised metaclass framework in Sage is:
- Implement a base class
SageMetaclass
, from which all metaclasses (such asNestedMetaclass
andClasscallMetaclass
andFastHashMetaclass
) are derived.- Make it so that if A, B, C are classes with different metaclasses (all derived from
SageMetaclass
) can be used as base classes for a new class. In that way, the features provided by metaclasses could be freely combined.But I guess that's a topic for a different ticket...
+1 for this and for keeping it for a different ticket.
Hi Simon,
I just uploaded an updated patch. I still have to lauch the tests (but I've to run for a train now) and to check the documentation. The code together with the timings should hopefully be the final version here. I'll put the needs review after the tests and doc rereading.
Thanks for your remarks.
Florent
Could it be that the patch was created by something like "hg remove sage/misc/classcall_metaclass.py" followed by "hg add sage/misc/classcall_metaclass.pyx"?
I am no mercurial expert, but I was told that one should better use "hg rename sage/misc/classcall_metaclass.py sage/misc/classcall_metaclass.pyx".
While we are at it: Shouldn't we also try whether changing unique_representation.py into unique_representation.pyx yields a speedup (but not changing "class" into "cdef class", I guess that won't work)?
I have created an alternative patch (using hg rename). It is smaller than the original patch, but should result in identical code. So, either of the patches could be used, but I guess the smaller patch is easier to read (because one sees more easily the additional differences between classcall_metaclass.py and classcall_metaclass.pyx)
Apply trac_12808-classcall_speedup2.patch
Description changed:
---
+++
@@ -1 +1,5 @@
When a class `C` is an instance of `ClasscallMetaclass`, then for any constructor call, the system currently looks for `__classcall__` and `__classcall_private__` in `C`. This adds quite an overhead and this could be cached, assuming no one modifies `C` (which seems reasonable). Improving this speeds up, in particular, all call to the constructor of a subclass of `UniqueRepresentation`, eg. many parents, all categories...
+
+__Apply__
+
+[attachment: trac_12808-classcall_speedup2.patch](https://github.com/sagemath/sage/files/ticket12808/trac_12808-classcall_speedup2.patch.gz)
Good news: All doctests pass.
I am astonished that the C-level way of calling type.__call__
is so much faster. That suggests to cythonise sage.structure.dynamic_class as well, so that dynamic classes are created more quickly! Shall this be done on a different ticket?
Probably the dynamic class cythonisation should be on a different ticket. However, it seems that the change to cython will not be so difficult: I simply renamed dynamic_class.py into dynamic_class.pyx (and added two lines to module_list.py), and the test suite seems to pass (most of it is already finished). And then, one can apply your trick with type.__call__
.
I see that you added _included_private_doc_
. Could you elaborate on it? I just searched the sources for that name, but it does not occur anywhere except in your patch. So, what does it do?
Replying to @simon-king-jena:
Probably the dynamic class cythonisation should be on a different ticket. However, it seems that the change to cython will not be so difficult: I simply renamed dynamic_class.py into dynamic_class.pyx (and added two lines to module_list.py), and the test suite seems to pass (most of it is already finished). And then, one can apply your trick with
type.__call__
.
Thanks for trying this out! It's cool that it works.
Now, do you mind postponing to a later ticket after #11935? I am doing a couple small changes to dynamic_class in my upcoming reviewer's patch (hopefuly tomorrow).
Note that there is a conflict with #12215, so, either of the patches needs to be rebased. But I guess we first see what patch will be reviewed first...
Some further questions (in addition to my question about the rôle of _included_private_doc_
):
Why is __all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall']
added? Isn't importing also possible without that? Or is it needed to make from sage.misc.classcall_metaclass import *
work?
Why is there a two-step cythonisation? I mean, why is there a cdef class ClasscallType
implementing the special methods, and then ClasscallMetaclass
defined by double inheritance from ClasscallType
and NestedClassMetaclass
? Wouldn't it be better/easier/faster to cdef NestedClassMetaclass
as well, and then cdef ClasscallMetaclass
directly, without having the ClasscallType
?
I guess the typecall function will be useful in other modules (e.g., when cythonising dynamic_class). But it only is def. Shouldn't it rather be cdef inline
?
Concerning the timing tools provided in the module: Is there still no "central" location for all aspects of timing? I thought there were occasional discussions on sage-devel about a benchmark/timing framework.
Hi Simon,
Thanks for your in depth review.
Replying to @simon-king-jena:
Some further questions (in addition to my question about the rôle of
_included_private_doc_
):
This is a tentative inclusion of the doc of the special methods (see this thread on sage devel.
Why is
__all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall']
added? Isn't importing also possible without that? Or is it needed to makefrom sage.misc.classcall_metaclass import *
work?
I just want to hide the few class I wrote here for timing (CRef, C2, C3, C2C). With this line, they are neither imported with import *, nor documented.
Why is there a two-step cythonisation? I mean, why is there a cdef class
ClasscallType
implementing the special methods, and thenClasscallMetaclass
defined by double inheritance fromClasscallType
andNestedClassMetaclass
? Wouldn't it be better/easier/faster to cdefNestedClassMetaclass
as well, and then cdefClasscallMetaclass
directly, without having theClasscallType
?
There is this discussion about cleaning up the way metaclass are defined and used in Sage. I just wanted to keep features separate. Should we decide to merge or rewrite the metaclass architecture, I'd rather to keep this for a different ticket.
I guess the typecall function will be useful in other modules (e.g., when cythonising dynamic_class). But it only is def. Shouldn't it rather be
cdef inline
?
You mean cpdef inline
(we may want to call it from Python) ? I had the
impression from C that this inline doesn't do anything if the function is
defined in a different module. More precisely, in C/C++ inline only work if
the function is defined in .h
instead of .c
. Does Cython do
anything for that ?
Concerning the timing tools provided in the module: Is there still no "central" location for all aspects of timing? I thought there were occasional discussions on sage-devel about a benchmark/timing framework.
Not that I know of.
Florent
I forgot: thanks for creating a new patch. I didn't use rename because at some point of my experiment, I was having the two files *.py
and *_c.pyx
with the *.py
importing the other. Then I copied the documentation from *.py
to *_c.pyx
until the *.py
becomes nearly empty. At the end, I renamed the *_c.pyx
into the *.pyx
and removed the *.py
.
How did you manage to get a patch using rename at the end ?
Florent
Replying to @hivert:
Why is
__all__ = ['ClasscallType', 'ClasscallMetaclass', 'typecall', 'timeCall']
added? Isn't importing also possible without that? Or is it needed to makefrom sage.misc.classcall_metaclass import *
work?I just want to hide the few class I wrote here for timing (CRef, C2, C3, C2C). With this line, they are neither imported with import *, nor documented.
So, it is not for making something importable, but for excluding something from automatic import? Cool!
Why is there a two-step cythonisation?
There is this discussion about cleaning up the way metaclass are defined and used in Sage.
You mean the sage-combinat-devel thread I started?
I just wanted to keep features separate.
Well, I actually think cythoning NestedClassMetaclass
is less intrusive than your patch.
Compare: You have to add a new cdef class ClasscallType
and change the inheritance of ClasscallMetaclass
. I suggest to change NestedClassMetaclass
into a cdef class, keep the inheritance of ClasscallMetaclass
, and avoid the addition of ClasscallType
.
I think I will test it...
You mean
cpdef inline
(we may want to call it from Python) ?
Yes, I forgot the letter "p".
I had the impression from C that this inline doesn't do anything if the function is defined in a different module. More precisely, in C/C++ inline only work if the function is defined in
.h
instead of.c
. Does Cython do anything for that ?
I expected it to be inlined, if it is defined cpdef inline in the pxd file, and then cimported into another cython file. But I don't know if that is really done, actually.
Replying to @hivert:
How did you manage to get a patch using rename at the end ?
Manually. Namely: Apply your patch, copy the touched/new files to a temporary directory, remove your patch, hg qnew
, then hg rename
, then move the copy of your file over to the file that has just been created by hg rename
, and then hg qrefresh
.
Replying to @simon-king-jena:
[...]
So, it is not for making something importable, but for excluding something from automatic import? Cool!
Yep ! see importing * from a package
Why is there a two-step cythonisation?
There is this discussion about cleaning up the way metaclass are defined and used in Sage.
You mean the sage-combinat-devel thread I started?
Yes ! An some face to face discussion we had with Nicolas.
I just wanted to keep features separate.
Well, I actually think cythoning
NestedClassMetaclass
is less intrusive than your patch.Compare: You have to add a new cdef class
ClasscallType
and change the inheritance ofClasscallMetaclass
. I suggest to changeNestedClassMetaclass
into a cdef class, keep the inheritance ofClasscallMetaclass
, and avoid the addition ofClasscallType
.I think I will test it...
Except that at some point Nicolas suggested to keep NestedClassMetaclass for Categories... The truth is that I've currently no idea on the good way metaclass should be organized in Sage. So I tried to avoid any interface changes. I was even surprised that the Cythonizing work so well without breaking anything in Sage. Now If you think there is a better/faster way. Please give it a try.
I had the impression from C that this inline doesn't do anything if the function is defined in a different module. More precisely, in C/C++ inline only work if the function is defined in
.h
instead of.c
. Does Cython do anything for that ?I expected it to be inlined, if it is defined cpdef inline in the pxd file, and then cimported into another cython file. But I don't know if that is really done, actually.
I'll look at the compiled result.
Replying to @hivert:
Except that at some point Nicolas suggested to keep NestedClassMetaclass for Categories...
I suggest to keep it as well. I just want to cdef it.
Attachment: trac_12808-classcall_speedup-fh.patch.gz
Tentative patch
Description changed:
---
+++
@@ -2,4 +2,4 @@
__Apply__
-[attachment: trac_12808-classcall_speedup2.patch](https://github.com/sagemath/sage/files/ticket12808/trac_12808-classcall_speedup2.patch.gz)
+[attachment: trac_12808-classcall_speedup-fh.patch](https://github.com/sagemath/sage-prod/files/10655299/trac_12808-classcall_speedup-fh.patch.gz)
How has the patch changed?
Replying to @simon-king-jena:
How has the patch changed?
Oups ! Sorry I forgot to click "submit changes". Simon, your patch is broken. It is missing classcall_metaclass.pxd
and as a consequence make Sage segfault as startup. I just added the file compared to your patch.
Florent
Replying to @hivert:
Oups ! Sorry I forgot to click "submit changes". Simon, your patch is broken. It is missing
classcall_metaclass.pxd
and as a consequence make Sage segfault as startup. I just added the file compared to your patch.
Aha! So, I forgot hg add
at some point.
It is relatively easy to change nested_class.py into nested_class.pyx (though not totally trivial). But I was not able to change NestedClassMetaclass
into a cdef class.
Apparently, the problem is that during initialisation of a cdef class (derived from type, at least) the name and the module of that class are not known yet. Hence, a crash in nested_pickle, which is called during initialisation. However, if one determines name and module in a different way and passes it directly to nested_pickle, then there is a different crash, and here I don't understand what is wrong, yet.
I notice that the crash seems to be related with importing the nested class examples in the nested_class module. Perhaps one needs to avoid any application of nested classes in the nested_class module, and move the examples to a different module?
Replying to @simon-king-jena:
Perhaps one needs to avoid any application of nested classes in the nested_class module, and move the examples to a different module?
Nope, doesn't work.
So, currently, it seems (easily) possible to change nested_class.py into nested_class.pyx (even so that all tests pass), but impossible to change class NestedClassMetaclass
into cdef class NestedClassMetaclass
. Hence, probably your approach is the way to go: Introduce a new cdef class, implement fast cython methods there, and then let ClasscallMetaclass
double inherit from the new cdef class and from NestedClassMetaclass
.
I still think that a meta-metaclass could do the same job automatically (namely: We implement various cdef classes, and the meta-metaclass combines them and turns them into a metaclass). But that is clearly a different topic.
It seems that indeed it is impossible to use a cdef class as metaclass - see the example that I gave at cython-users.
Hence, I guess your indirect solution (namely to define fast methods in a cdef class ClasscallType
and then derive from it a usual Python class ClasscallMetaclass
) is probably the only feasible.
Replying to @simon-king-jena:
It seems that indeed it is impossible to use a cdef class as metaclass - see the example that I gave at cython-users.
For the record, It is possible ! I answered on the cython-users thread.
Florent
Replying to @hivert:
For the record, It is possible ! I answered on the cython-users thread.
Indeed! It seems that changing NestedClassMetaclass
into a cdef class with the help of your trick works fine.
By the way, the __init__
of nested classes does nothing more than calling nested_pickle
, but this does nothing more than to call modify_for_nested_pickle
. That should thus be simplified, and perhaps even cpdef'd!
Also, when nested_pickle
calls modify_for_nested_pickle
, it always looks up sys.modules[...]
- thus, why not store sys.modules
in a cdef dict
variable?!
I only touched the nested classes, not ClasscallMetaclass
, and the test suite isn't completed yet, but most seems to pass.
What next? As I have pointed out, it seems more straight forward to cdef both NestedClassMetaclass
and its subclass ClasscallMetaclass
directly, not via an additional ClasscallType
. How should it be organised?
Variant 1: I produce a patch that does the changes in nested_class, and includes most of the changes from your patch (but not ClasscallType
). Then I post it here, and we cross-review.
Variant 2: I review your patch from here, and do my changes (which would revert your changes in the bases of ClasscallMetaclass
) on a different ticket.
What do you prefer?
Hi Simon,
Also, when
nested_pickle
callsmodify_for_nested_pickle
, it always looks upsys.modules[...]
- thus, why not storesys.modules
in acdef dict
variable?!
Does this give a large speed-up ? I would guess that the dict search is the bottleneck, but without serious profiling one cannot be sure.
I only touched the nested classes, not
ClasscallMetaclass
, and the test suite isn't completed yet, but most seems to pass.What next? As I have pointed out, it seems more straight forward to cdef both
NestedClassMetaclass
and its subclassClasscallMetaclass
directly, not via an additionalClasscallType
. How should it be organised?
Sure ! ClasscallType was for me a step before doing more in depth change in the Metaclass infrastructure. I was quite scared to do those in depth change in Sage/Python and feared a lot of hard to debug segfaults. So I did the check by very small steps changing the less possible Sage files. Fortunately, it went quite smooth. I planned to keep the rest of the metaclass infrastructure for another patch but if you think you cdefing NestedClassMetaclass is easily done, we can do both at once an thus the need of ClasscallType vanishes.
Variant 1: I produce a patch that does the changes in nested_class, and includes most of the changes from your patch (but not
ClasscallType
). Then I post it here, and we cross-review.Variant 2: I review your patch from here, and do my changes (which would revert your changes in the bases of
ClasscallMetaclass
) on a different ticket.What do you prefer?
As you wish ! I most probably wont be working on that ticket anymore, unless for addressing some reviewer remark or to help you if you need. The two variants are equal to me and you can consider yourself the owner of that ticker if you wish. We just need to have some clean review. So if you go for variant 2, I think you first need to positive review my change, and then revamp them and I will review you additions.
So please shoot as you prefer ;-)
Florent
Hi Florent,
Replying to @hivert:
Also, when
nested_pickle
callsmodify_for_nested_pickle
, it always looks upsys.modules[...]
- thus, why not storesys.modules
in acdef dict
variable?!Does this give a large speed-up ? I would guess that the dict search is the bottleneck, but without serious profiling one cannot be sure.
Let's test:
sage: cython("""
....: import sys
....: cdef D = sys.modules
....: d = sys.modules
....: def test1(name):
....: cdef int i
....: for i from 0<=i<1000000:
....: a = D[name]
....: def test2(name):
....: cdef int i
....: for i from 0<=i<1000000:
....: a = d[name]
....: def test3(name):
....: cdef int i
....: for i from 0<=i<1000000:
....: a = sys.modules[name]
....: """)
sage: name = "sage"
sage: %time test1(name)
CPU times: user 0.02 s, sys: 0.00 s, total: 0.02 s
Wall time: 0.02 s
sage: %time test2(name)
CPU times: user 0.08 s, sys: 0.00 s, total: 0.08 s
Wall time: 0.08 s
sage: %time test3(name)
CPU times: user 0.11 s, sys: 0.00 s, total: 0.11 s
Wall time: 0.12 s
""")
Conclusion: Assigning sys.modules
to a Python variable brings a speed-up of 1/3, and assigning it to a Cython variable brings a speed-up of 5/6. So, the speed-up obtained from Cython is indeed huge.
Sure ! ClasscallType was for me a step before doing more in depth change in the Metaclass infrastructure. I was quite scared to do those in depth change in Sage/Python and feared a lot of hard to debug segfaults.
Fear thee not!
I planned to keep the rest of the metaclass infrastructure for another patch but if you think you cdefing NestedClassMetaclass is easily done, we can do both at once an thus the need of ClasscallType vanishes.
Yes, it is easily done - if one knows what to cimport. How did you find out?
"Easy" means that one can really just change def into cdef. And in addition to that, I tried to call some frequently used routine directly, in order to avoid a calling overhead, and also I cpdefined that routine.
Variant 1: I produce a patch that does the changes in nested_class, and includes most of the changes from your patch (but not
ClasscallType
). Then I post it here, and we cross-review.Variant 2: I review your patch from here, and do my changes (which would revert your changes in the bases of
ClasscallMetaclass
) on a different ticket.What do you prefer?
As you wish ! I most probably wont be working on that ticket anymore, unless for addressing some reviewer remark or to help you if you need. The two variants are equal to me and you can consider yourself the owner of that ticker if you wish. We just need to have some clean review.
My preference is to do it on one ticket (the common topic being "cdefine ClasscallMetaclass
", which of course also involves cdefining NestedClassMetaclass
).
Concerning clean review: I would provide a patch that is to be applied before your patch (dealing with nested classes) and another one to be applied after your patch (refactoring the bases of classcall metaclass).
Hence, we can easily cross-review, which is considered to be "clean".
Hi Simon,
Conclusion: Assigning
sys.modules
to a Python variable brings a speed-up of 1/3, and assigning it to a Cython variable brings a speed-up of 5/6. So, the speed-up obtained from Cython is indeed huge.
Cool ! So my intuition where wrong. This is something I don't feel very comfortable with Python. When I try to predict which part of the code should be optimized I'm too often wrong. I remember being much better at that in MuPAD. I have the impression that Python is not very predictable, but maybe it's just me. I guess I need to optimize more code.
Sure ! ClasscallType was for me a step before doing more in depth change in the Metaclass infrastructure. I was quite scared to do those in depth change in Sage/Python and feared a lot of hard to debug segfaults.
Fear thee not!
;-)
I planned to keep the rest of the metaclass infrastructure for another patch but if you think you cdefing NestedClassMetaclass is easily done, we can do both at once an thus the need of ClasscallType vanishes.
Yes, it is easily done - if one knows what to cimport. How did you find out?
They is an example (writen by Robert I think) in Parent. Googling "Cython subclassing builtin" gives some good examples:
I guess I was lucky too !
"Easy" means that one can really just change def into cdef. And in addition to that, I tried to call some frequently used routine directly, in order to avoid a calling overhead, and also I cpdefined that routine.
No weird segfault and hardcore debugging. This is easy enough ! Cool !
My preference is to do it on one ticket (the common topic being "cdefine
ClasscallMetaclass
", which of course also involves cdefiningNestedClassMetaclass
).Concerning clean review: I would provide a patch that is to be applied before your patch (dealing with nested classes) and another one to be applied after your patch (refactoring the bases of classcall metaclass).
Hence, we can easily cross-review, which is considered to be "clean".
Excellent. I'm waiting for your input. I'll have time to review it at Sage Days 38 if not before.
Florent
We suggested the following approaches towards getting fast metaclasses.
Here, I study the question whether we should expect a significant speed difference in the two approaches.
Let the following be put in a pyx file that we attach to a Sage session:
cdef class T1:
def __call__(self, x):
return x
cdef class T2:
def __hash__(self):
return 123
cdef class T3:
def __add__(self, x):
return self
cdef class T4:
def test(self, x):
return x
Hence, we have Cython classes, the first three of them providing typical "magical" methods, the fourth a custom method. In the Sage session, we derive a Python class from the four classes, and, to make it more "realistic", we give a second base.
sage: C1 = type("C1",(T1,Parent),{})
sage: C2 = type("C2",(T2,Parent),{})
sage: C3 = type("C3",(T3,Parent),{})
sage: C4 = type("C4",(T4,Parent),{})
C1,...,C4 correspond to metaclasses implemented as you suggest, T1,...,T4 as I suggest.
It could be that in future we want to have meta-metaclasses, which would allow to mix the metaclasses. Hence, I am also considering new classes derived from C1,...,C4 resp. from T1,...,T4:
sage: Cdirect = type("Cdirect",(T1,T2,T3,T4,Parent),{})
sage: Cindirect = type("Cindirect",(C1,C2,C3,C4),{})
Let us create instances of each class:
sage: c1 = C1()
sage: c2 = C2()
sage: c3 = C3()
sage: c4 = C4()
sage: t1 = T1()
sage: t2 = T2()
sage: t3 = T3()
sage: t4 = T4()
sage: ci = Cindirect()
sage: cd = Cdirect()
Now for the timings. First, the call method:
sage: timeit("a = c1(3)", number=300000)
300000 loops, best of 3: 280 ns per loop
sage: timeit("a = t1(3)", number=300000)
300000 loops, best of 3: 277 ns per loop
sage: timeit("a = ci(3)", number=300000)
300000 loops, best of 3: 279 ns per loop
sage: timeit("a = cd(3)", number=300000)
300000 loops, best of 3: 278 ns per loop
Next, the hash:
sage: timeit("a = hash(c2)", number=300000)
300000 loops, best of 3: 87.6 ns per loop
sage: timeit("a = hash(t2)", number=300000)
300000 loops, best of 3: 85.9 ns per loop
sage: timeit("a = hash(ci)", number=300000)
300000 loops, best of 3: 83.6 ns per loop
sage: timeit("a = hash(cd)", number=300000)
300000 loops, best of 3: 99.7 ns per loop
Arithmetics:
sage: timeit("a = c3+c3", number=300000)
300000 loops, best of 3: 149 ns per loop
sage: timeit("a = t3+t3", number=300000)
300000 loops, best of 3: 83 ns per loop
sage: timeit("a = ci+ci", number=300000)
300000 loops, best of 3: 77.8 ns per loop
sage: timeit("a = cd+cd", number=300000)
300000 loops, best of 3: 77.4 ns per loop
And a non-magical method:
sage: timeit("a = c4.test(4)", number=300000)
300000 loops, best of 3: 327 ns per loop
sage: timeit("a = t4.test(4)", number=300000)
300000 loops, best of 3: 285 ns per loop
sage: timeit("a = ci.test(4)", number=300000)
300000 loops, best of 3: 444 ns per loop
sage: timeit("a = cd.test(4)", number=300000)
300000 loops, best of 3: 364 ns per loop
Summary:
For the magical methods, there is no significant speed difference between our suggestions (with exception of addition, but that result looks strange, and I don't really believe it). This would even hold if (in future) we would consider to create metaclasses "dynamically" (i.e., in the same way as I have created Cindirect
and Cdirect
above).
It only pays off to directly implement stuff in Cython, if non-magical methods are concerned. But I doubt that we want non-magical methods for metaclasses.
So, the decision between our suggestions must rely on different reasons: Which one is simpler and easier to maintain? The plus of my suggestion is that it is direct, i.e., there are less layers of indirection. The plus of your suggestion is that it works without creating an extension type for builtin types.
After a break, I will attach my two patches, and then the discussion can start...
Attachment: trac_12808-classcall_cdef.patch.gz
Make ClasscallMetaclass an extension type of type
Changed author from Florent Hivert to Florent Hivert, Simon King
Description changed:
---
+++
@@ -2,4 +2,10 @@
__Apply__
-[attachment: trac_12808-classcall_speedup-fh.patch](https://github.com/sagemath/sage-prod/files/10655299/trac_12808-classcall_speedup-fh.patch.gz)
+* [attachment: trac_12808-classcall_speedup-fh.patch](https://github.com/sagemath/sage-prod/files/10655299/trac_12808-classcall_speedup-fh.patch.gz)
+* [attachment: trac_12808_nested_class_cython.patch](https://github.com/sagemath/sage-prod/files/10655301/trac_12808_nested_class_cython.patch.gz)
+
+and we have to decide whether the third patch helps to make the code clearer:
+
+* [attachment: trac_12808-classcall_cdef.patch](https://github.com/sagemath/sage-prod/files/10655300/trac_12808-classcall_cdef.patch.gz)
+
I have attached my two patches. Your patch and my patch actually turn out to be independent.
Purpose of my first patch: Make NestedClassMetaclass
an extension type of type, and avoid some calling overhead during its creation.
Purpose of my second patch: Make ClasscallMetclass
an extension type as well, directly derived from NestedClassMetaclass
.
Timings
With sage-5.1.notebook unpatched, I get for Florent's examples
sage: class Rien(object):
....: pass
....:
sage: from sage.misc.classcall_metaclass import ClasscallMetaclass
sage: class NOCALL(object):
....: __metaclass__ = ClasscallMetaclass
....: pass
....:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.74 ms per loop
sage: %timeit [NOCALL() for i in range(10000)]
25 loops, best of 3: 16.7 ms per loop
sage: class CALL(object):
....: __metaclass__ = ClasscallMetaclass
....: @staticmethod
....: def __classcall_private__(cls, arg):
....: arg = arg + arg
....: return arg
....:
sage: %timeit [CALL(i) for i in range(10000)]
25 loops, best of 3: 9.05 ms per loop
Here is something with __classcall__
instead of __classcall_private__
, and in a way that has less overhead than arg+arg
:
sage: class NewCall(object):
....: __metaclass__ = ClasscallMetaclass
....: @staticmethod
....: def __classcall__(cls, C):
....: return C
....:
sage: C = ZZ.__class__
sage: timeit("a = NewCall(C)", number=10000)
10000 loops, best of 3: 878 ns per loop
And finally a "nested class" example:
sage: from sage.misc.nested_class import NestedClassMetaclass
sage: def test_nest():
....: class A:
....: __metaclass__ = NestedClassMetaclass
....: class B:
....: pass
....:
sage: %timeit test_nest()
625 loops, best of 3: 33.1 µs per loop
Now, the same examples with my first patch:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.76 ms per loop
sage: %timeit [NOCALL() for i in range(10000)]
25 loops, best of 3: 18.3 ms per loop
sage: %timeit [CALL(i) for i in range(10000)]
25 loops, best of 3: 10.7 ms per loop
sage: %timeit test_nest()
625 loops, best of 3: 23.6 µs per loop
Now, with your patch only:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.77 ms per loop
sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 2.05 ms per loop
sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.34 ms per loop
sage: timeit("a = NewCall(C)", number=10000)
10000 loops, best of 3: 889 ns per loop
sage: %timeit test_nest()
625 loops, best of 3: 32.8 µs per loop
Thus, our patches treat orthogonal aspects. Now, the first two patches together:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.78 ms per loop
sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 2 ms per loop
sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.34 ms per loop
sage: timeit("a = NewCall(C)", number=10000)
10000 loops, best of 3: 895 ns per loop
sage: %timeit test_nest()
625 loops, best of 3: 23.6 µs per loop
And with all three patches:
sage: %timeit [Rien() for i in range(10000)]
125 loops, best of 3: 1.75 ms per loop
sage: %timeit [NOCALL() for i in range(10000)]
125 loops, best of 3: 2.08 ms per loop
sage: %timeit [CALL(i) for i in range(10000)]
125 loops, best of 3: 4.44 ms per loop
sage: timeit("a = NewCall(C)", number=10000)
10000 loops, best of 3: 897 ns per loop
sage: %timeit test_nest()
625 loops, best of 3: 23.6 µs per loop
CONCLUSION
My first patch does improve the time spent for the creation of a nested class. Your patch improves a lot of things for classcall metaclass. My second patch makes (I think) the inheritance a bit clearer, but it does not provide a speed-up. The reason is explained in my previous post.
Apply trac_12808-classcall_speedup-fh.patch trac_12808_nested_class_cython.patch trac_12808-classcall_cdef.patch
Hi Simon,
Thanks a lot for this in depth timing measurement !
My first patch does improve the time spent for the creation of a nested class.
Excellent !
Your patch improves a lot of things for classcall metaclass. My second patch makes (I think) the inheritance a bit clearer, but it does not provide a speed-up. The reason is explained in my previous post.
I fully agree that it is clearer. There may be however some speed penalty with
getting rid of ClasscallType
if we need at some point to construct a lot
of classes which need the classcall trick but without nested classes. I don't
know if this usecase is realistic at all. Do you have any idea on this
question ? I don't even know if the current dynamic class allows it. Anyway,
there won't be any overhead from the point of view of the instances creation
and, manipulation but there may be when creating the classes
themselves. Therefore I think we should measure the relative price of creating
classes which ClasscallType
vs ClasscallMetaclass
. I'll
investigate this while reviewing your patch.
So my current opinion is: we should do the timing while we can easily do
it. Then if nothing is measurable with the current Sage usage, but measurable
with some weird one, we should remove ClasscallType
(any unused code and
moreover confusing code shouldn't remain) keeping somewhere a note saying that
if the weird usecase appear (which I don't really believe) we should separate
ClasscallType
vs ClasscallMetaclass
.
Cheers,
Florent
Dear Florent,
Replying to @hivert:
I fully agree that it is clearer. There may be however some speed penalty with getting rid of
ClasscallType
if we need at some point to construct a lot of classes which need the classcall trick but without nested classes.
Well, the problem is as I have explained in the sage-combinat-devel thread. Some classes would actually not use more than the __call__
method (so, morally, ClasscallMetaclass
should only do this one thing), while some would only need nesting or another __init__
(so, NestedClassMetaclass
should only do this one thing), and others only need a custom __reduce__
method (so, DynamicMetaclass
should only to this one thing).
However, in typical applications, we want to construct a class that simultaneously derives from classes with different metaclasses. That is only possible if we have metaclasses that combine different features. Hence, ClasscallMetaclass
inherits from NestedClassMetaclass
(although it often does not need nesting), and there is an additional DynamicClasscallMetaclass
, just to cope with dynamic classes that have a classcall.
I'd really like to have these combined metaclasses created automatically, hence, using a meta-metaclass.
Concerning time penalty: As I have demonstrated above, there is no time penalty for special methods of a class C that are inherited from different Cython classes - even if the inheritance is indirect. A speed penalty only seems to occur, if one has an indirect inheritance of non-magical methods.
Hence, if C is actually used as a metaclass, I would expect that both approaches (defining the fast magical methods in ClasscallType
and other auxiliar types or directly in ClasscallMetaclass
and other metaclasses) are equal, speed-wise.
I don't know if this usecase is realistic at all.
See above: There is need for combined metaclasses, even though in some applications a pure one-trick metaclass would be enough.
Anyway, there won't be any overhead from the point of view of the instances creation and, manipulation but there may be when creating the classes themselves.
Well, the timings that I presented were for a class C, and I measured some methods (e.g. __call__
) of its instances. However, I think the same observations will hold if C actually is a metaclass, and we would measure __call__
etc. of its instances (which are classes).
But of course the time for creating (not just for calling) the instances matters as well.
Cheers,
Simon
Replying to @simon-king-jena:
But of course the time for creating (not just for calling) the instances matters as well.
In my previous posts, I have demonstrated that some "magical" methods are equally fast for a cdef class and for a python class that inherits from the cdef class. See __call__
, __hash__
and __add__
. I only found a speed penalty for inherited non-magical methods.
However, the creation of instances is indeed A LOT faster for cdef classes than for Python classes. Recall the definition of T1, ..., T4, C1, ..., C4, Cdirect and Cindirect. We get
sage: timeit("a = T1()", number=300000)
300000 loops, best of 3: 102 ns per loop
sage: timeit("a = T2()", number=300000)
300000 loops, best of 3: 102 ns per loop
sage: timeit("a = T3()", number=300000)
300000 loops, best of 3: 102 ns per loop
sage: timeit("a = T4()", number=300000)
300000 loops, best of 3: 102 ns per loop
sage: timeit("a = C1()", number=300000)
300000 loops, best of 3: 16.9 µs per loop
sage: timeit("a = C2()", number=300000)
300000 loops, best of 3: 17 µs per loop
sage: timeit("a = C3()", number=300000)
300000 loops, best of 3: 17.2 µs per loop
sage: timeit("a = C4()", number=300000)
300000 loops, best of 3: 17 µs per loop
sage: timeit("a = Cdirect()", number=300000)
300000 loops, best of 3: 18.8 µs per loop
sage: timeit("a = Cindirect()", number=300000)
300000 loops, best of 3: 20.1 µs per loop
According to these timings, I would expect that the creation of classes with an actual cdef metaclass (as with my second patch) is faster than the creation of classes with a Python metaclass that inherits from a cdef class (like ClasscallType
). Let's test:
sage: cython("""
....: from sage.misc.classcall_metaclass import ClasscallMetaclass
....: def test_creation():
....: class A:
....: __metaclass__ = ClasscallMetaclass
....: @staticmethod
....: def __classcall__(cls, x):
....: return x
....: """)
Without patches:
sage: timeit("test_creation()", number=10000)
10000 loops, best of 3: 41.7 µs per loop
With your patch only:
sage: timeit("test_creation()", number=10000)
10000 loops, best of 3: 45.8 µs per loop
With the first two patches:
sage: timeit("test_creation()", number=10000)
10000 loops, best of 3: 38.5 µs per loop
With all three patches:
sage: timeit("test_creation()", number=10000)
10000 loops, best of 3: 38.5 µs per loop
And when cimporting the metaclass (which of course only works with all three patches):
sage: cython("""
....: from sage.misc.classcall_metaclass cimport ClasscallMetaclass
....: def test_creation():
....: class A:
....: __metaclass__ = ClasscallMetaclass
....: @staticmethod
....: def __classcall__(cls, x):
....: return x
....: """)
sage: timeit("test_creation()", number=10000)
10000 loops, best of 3: 38.2 µs per loop
In other words: It seems that directly cdefining the metaclasses is not only clearer than the use of ClasscallType
, but it is indeed a little (but not much) faster.
Hi Simon,
Thanks again for those very precise timing measures. I'm sorry I wasn't completely clear on the timing I wanted to have. Here is what I hope is a clearer explanation:
There is a common usage of ClasscallMetaclass
and
__classcall_private__
for elements in order to delegate the creation to
the parent. For example, the call
sage: PerfectMatching([(1,2),(3,4)])
PerfectMatching [(1, 2), (3, 4)]
actually create the parent:
sage: P4 = PerfectMatchings(4); P4
Set of perfect matchings of {1, 2, 3, 4}
And calls
sage: P4([(1,2),(3,4)])
PerfectMatching [(1, 2), (3, 4)]
This is done by a classcall trick. There isn't currently a lot of this trick in Sage library but many more are coming from the Sage-combinat queue.
Now I don't see any element class having a nested class. So in an ideal world,
we'd like to have PerfectMatching
in ClasscallMetaclass
but not in
NestedClassMetaclass
. From your timings, I clearly see that being in
NestedClassMetaclass
doesn't impact creation of the element, but I was
wondering how it impact the creation of the class element_class
itself. As I said, I don't currently see any usecase where we will create a
lot of element_class
but who knows...
So here are the timings:
sage: from sage.misc.classcall_metaclass import ClasscallType
sage: from sage.structure.element import Element
sage: %timeit type('toto', (Element,), {})
625 loops, best of 3: 25.9 µs per loop
sage: %timeit ClasscallType('toto', (Element,), {})
625 loops, best of 3: 31 µs per loop
With your patch cdefing NestedclassMetaclass
, there is a slight overhead
in being in NestedClassMetaclass
:
sage: %timeit ClasscallMetaclass('toto', (Element,), {})
625 loops, best of 3: 36.9 µs per loop
The overhead is quite similar when NestedclassMetaclass
is not cdef:
sage: %timeit ClasscallMetaclass('toto', (Element,), {})
625 loops, best of 3: 37.2 µs per loop
Now this case is not very realistic since there are no methods in the new class. Let's put some by getting a realistic dictionary:
sage: dct = dict(PerfectMatching.__dict__)
sage: len(dct)
24
With a cdef NestedclassMetaclass
sage: %timeit type('toto', (Element,), dct)
625 loops, best of 3: 27 µs per loop
sage: %timeit ClasscallType('toto', (Element,), dct)
625 loops, best of 3: 31.7 µs per loop
sage: %timeit ClasscallMetaclass('toto', (Element,), dct)
625 loops, best of 3: 48.9 µs per loop
Without a cdef NestedclassMetaclass
sage: %timeit type('toto', (Element,), dct)
625 loops, best of 3: 28.8 µs per loop
sage: %timeit ClasscallType('toto', (Element,), dct)
625 loops, best of 3: 32.5 µs per loop
sage: %timeit ClasscallMetaclass('toto', (Element,), dct)
625 loops, best of 3: 50.6 µs per loop
As a conclusion in some realistic usecases: the slowdown for having
ClasscallMetaclass
systematically inheriting from
NestedclassMetaclass
, that is no ClasscallType
is 48.9 µs vs 31.7
µs. This is non negligible (after all calling the
modify_for_nested_pickle
indeed has a cost) but much less that I
expected. Since I see no realistic usage in creating a lot of such classes, I
think I'm Ok with your second patch. I'd like to have Nicolas opinion on that,
since he is the one that raised this question.
I'm reviewing your two patches (after having lunch).
Florent
Replying to @hivert:
There is a common usage of
ClasscallMetaclass
and__classcall_private__
for elements in order to delegate the creation to the parent. ... Now I don't see any element class having a nested class. So in an ideal world, we'd like to havePerfectMatching
inClasscallMetaclass
but not inNestedClassMetaclass
.
Good example! That's exactly along the lines of the sage-combinat thread I started: We have common use cases in which we want to combine the nested class feature with the classcall feature, and thus someone has decided to make classcall inherit from nested class, although there are examples in which this is not needed.
Instead, one could either create a pure classcall metaclass on the one hand, and a combined ClasscallNestedClassMetaclass
on the other hand: This is similar to the DynamicClasscallMetaclass
, which is a combined class for DynamicMetaclass
and ClasscallMetaclass
. Of course, having "pure" ClasscallMetaclass
, NestedClassMetaclass
and DynamicMetaclass
, we would need to create four combined metaclasses to cover all potential use cases.
Or: We could have a meta-metaclass that creates the combined metaclasses automatically...
When a class
C
is an instance ofClasscallMetaclass
, then for any constructor call, the system currently looks for__classcall__
and__classcall_private__
inC
. This adds quite an overhead and this could be cached, assuming no one modifiesC
(which seems reasonable). Improving this speeds up, in particular, all call to the constructor of a subclass ofUniqueRepresentation
, eg. many parents, all categories...Apply
CC: @nthiery @simon-king-jena
Component: misc
Keywords: classcall UniqueRepresentation
Author: Florent Hivert, Simon King
Reviewer: Simon King, Florent Hivert
Merged: sage-5.1.beta0
Issue created by migration from https://trac.sagemath.org/ticket/12808