Closed darijgr closed 10 years ago
Description changed:
---
+++
@@ -2,7 +2,14 @@
http://www.sagemath.org/doc/reference/misc/sage/misc/classcall_metaclass.html
-One thing that should be added is some explanation of the difference between `__classcall__` and `__classcall_private__`.
+One thing I don't understand is:
+
+```
+the preprocessing on the arguments should be idempotent. Namely, If MyClass2.__classcall__ calls CachedRepresentation.__classcall__(<some_arguments>), then it should accept <some_arguments> as its own input, and pass it down unmodified to CachedRepresentation.__classcall__().
+```
+What is meant by "accept <some_arguments> as its own input"?
+
+Something that possibly should be added is some explanation of the difference between `__classcall__` and `__classcall_private__`. Maybe it doesn't belong there, but then again I have no idea where it belongs as it seems not documented at all...
Also there seems to be a typo:
Description changed:
---
+++
@@ -1,6 +1,6 @@
Sadly not my love, unless someone teaches me enough of the stuff that I understand it myself.
-http://www.sagemath.org/doc/reference/misc/sage/misc/classcall_metaclass.html
+http://www.sagemath.org/doc/reference/structure/sage/structure/unique_representation.html
One thing I don't understand is:
Replying to @darijgr:
I think it's UniqueFactory, not UniqueRepresentation, which is easy to use etc.
You couldn't be more mistaken, I think. In the wast majority of cases, if you have defined a class C,
class C(base1, base2):
code
then you only need to change it into
class C(UniqueRepresentation, base1, base2):
code
and have caching and pickling for free. Setting up a UniqueFactory
is a lot more work.
Something else is wrong then, because the example directly under the paragraph I quoted is about GF, which is a UniqueFactory.
Replying to @darijgr:
Something else is wrong then, because the example directly under the paragraph I quoted is about GF, which is a UniqueFactory.
OK, if this is the case, then indeed it is a bad example.
One question, though: What do you mean by UniqueRepresentation
tutorial? Is there a separate tutorial, or is it just the pages for sage.structure.factory and sage.structure.unique_representation, respectively?
Oops, what I meant is really just the docstring for UniqueRepresentation
in sage/structure/unique_representation.py
.
I think both sage.structure.unique_representation
and sage.structure.factory
need more care.
Certainly it is a severe fault that GF(...)
appears as example of UniqueRepresentation
.
And I think both parts of the documentation should give hints on when to use UniqueRepresentation
and when to use UniqueFactory
.
For example, a UniqueFactory
can return different implementations of the same algebraic structure, depending on the input data. This would be impossible (or at least not so easily possible) with UniqueRepresentation
. And you would not be able to use UniqueRepresentation
with a cdef class---you can define your class in a pyx file, but it has to be class Foo(UniqueRepresentation, ...)
, not cdef class Foo(UniqueRepresentation, ...)
, simply since UniqueRepresentation
uses meta-classes, which is not possible in a cdef class, if I am not mistaken (and of course, multiple inheritance does not work yet in Cython, and after all UniqueRepresentation
is a python class.
These are reasons for using a factory.
On the other hand, using UniqueRepresentation
would automatically provide your class with very fast hash and comparison methods implemented in Cython (UniqueRepresentation
is a Python class, but also inherits from the Cython class sage.misc.fast_methods.WithEqualityById
). If you use a factory instead, you'd need to take care of hash and comparison all by yourself---and this means: It is possible that a UniqueFactory
returns objects that are not unique parents!! And moreover, using UniqueRepresentation
is very easy---in most cases, it is enough to take an existing class and add UniqueRepresentation
as base.
So, these are reasons for using a unique representation.
Replying to @simon-king-jena:
Certainly it is a severe fault that
GF(...)
appears as example ofUniqueRepresentation
.
Or not? I now see that GF(...)
appears in order to demonstrate the design pattern that is common to both unique representation and unique factory. And it is clearly stated that one would like to have something like
isinstance(GF(p), GF)
which is not possible with a unique factory but is the case for UniqueRepresentation
.
I think it is a good idea to have GF mentioned both in the documentation of unique representation and unique factory, since it explains one detail one can base one's choice on.
Namely, as an example of a behaviour one can not get with UniqueRepresentation
:
sage: type(GF(5))
<class 'sage.rings.finite_rings.finite_field_prime_modn.FiniteField_prime_modn_with_category'>
sage: type(GF(25,'x'))
<class 'sage.rings.finite_rings.finite_field_givaro.FiniteField_givaro_with_category'>
sage: type(GF(5^10,'x'))
<class 'sage.rings.finite_rings.finite_field_ext_pari.FiniteField_ext_pari_with_category'>
sage: type(GF(2^20,'x'))
<class 'sage.rings.finite_rings.finite_field_ntl_gf2e.FiniteField_ntl_gf2e_with_category'>
Questions:
sage.structure.factory
nor sage.structure.unique_representation
have a meaningful documentation at module level. Would it make sense to give documentation at module level that helps for choosing among the two models of creating a unique parent behaviour? Or should this be only provided by CachedRepresentation
and UniqueRepresentation
resp. by UniqueFactory
?__classcall__
and __classcall_private__
unclear. There is an example clarifying the distinction, but it is hidden in the documentation of _clear_cache_()
, which is an underscore method and is thus not visible in the reference manual. Do you think that the example from _clear_cache_()
is clear? Then we could promote it (or a similar example) to a visible place.I fear the doc of _clear_cache_()
is beyound my comprehension... Is the difference just in the fact that a __classcall_private__
method will not be inherited by subclasses while a __classcall__
will be? (But I thought all methods would be inherited by subclasses?)
Replying to @darijgr:
I fear the doc of
_clear_cache_()
is beyound my comprehension... Is the difference just in the fact that a__classcall_private__
method will not be inherited by subclasses while a__classcall__
will be? (But I thought all methods would be inherited by subclasses?)
Your question really is about sage.misc.classcall_metaclass.ClasscallMetaclass
, which is where methods such as __classcall__
, __classget__
, __classcontains__
and __classcall_private__
take effect.
Here, we look at sage.misc.classcall_metaclass.ClasscallMetaclass.__call__
, which implements the creation of instances of a class. Of course, if a class C defines __classcall_private__
and a class D inherits from C, then D also has a __classcall_private__
method. However, you can find this method in C.__dict__
but not in D.__dict__
.
And the rule is: If an instance of a class (here: C or D) is created by calling ClasscallMetaclass.__class__
, then
__classcall_private__
in its own __dict__
. If this is the case, then __classcall_private__
will be called. Otherwise,__classcall__
method (which is not necessarily in __dict__
, as it could be inherited from a base class). If this is the case, then __classcall__
will be called.I just said "we look at ClasscallMetaclass.__call__
". Well, in fact we look at CLasscallMetaclass.__cinit__
, because this is where the choice between __classcall__
and __classcall_private__
is made.
What does all this mean for this ticket?
Question: Do you think it would be enough to state in the documentation of CachedRepresentation
that it has the ClasscallMetaclass
and that it provides a __classcall__
implementing the cache, and that one can use __classcall_private__
or __classcall__
as explained in the docs of ClasscallMetaclass
to overload the default way of caching?
Or I guess it would still be a good idea to have examples in place, without the need to read the ClasscallMetaclass
docs...
I have attached a patch. Do you like the new documentation better? What points would you suggest to improve further?
Note that I added documentation to both sage.structure.factory and sage.structure.unique_representation, also adding new tests.
Author: Simon King
I have replaced my patch by an updated version. The new patch version addresses your complaint about "idempotent preprocessing", elaborating a bit more and adding an example that shows what goes wrong if the pre-processing is not idempotent.
I am just re-reading my patch. I find several typos/wrong grammar. In addition, in one of my examples I have a non-idempotent pre-processing. I should change it. Also, by mistake, in one of my examples one finds the version number of my Sage installation---this should of course replaced by "...".
I hope all problems are solved with the new patch version...
Just a couple typos fixed. Feel free to qfold into your patch. This is not a review, sorry (I don't have the competence to do it; I barely understood the UniqueRepresentation part and I know about nothing on Python's OOP).
Attachment: trac_14912-typos-dg.patch.gz
Thanks for the patch, which made several things clear to me (probably not enough to be able to decide whether to use factories or unique rep -- but at least a good idea how to). I fear somebody who actually knows some OOP will have to review it. The above attachment fixes a couple typos.
Hi Darij,
Thank you for fixing the typos!
Replying to @darijgr:
Thanks for the patch, which made several things clear to me (probably not enough to be able to decide whether to use factories or unique rep -- but at least a good idea how to). ... I don't have the competence to do it; I barely understood the
UniqueRepresentation
part and I know about nothing on Python's OOP
In a way, this means that you are fully qualified as a reviewer. This is about documentation, and thus if you think that some points are still not addressed then you should speak up! What information do you find missing? Is it
One thing I'd be happy to know is what a staticmethod is -- but that's hardly a question for this patch.
And yes, some doc on metaclasses and the instantiation process (I'm confused about what goes in __new__
, what in __init__
and what in __classcall__
) would be great.
As for the "why" of cached/unique representation, for me it's pretty clear already, and I think the definition in your doc is very readable.
Thanks a lot for the patience!
Attachment: trac14912-unique_doc.patch.gz
Combined patch, including Darij's fixes
I have folded the two patches, and I have added a note on "static methods", including a link to the Python docs.
I did not elaborate on __new__
versus __init__
versus __classcall__
, yet.
Apply trac14912-unique_doc.patch
Description changed:
---
+++
@@ -18,3 +18,7 @@
I think it's UniqueFactory, not UniqueRepresentation, which is easy to use etc. + +Apply + +attachment: trac14912-unique_doc.patch
Replying to @simon-king-jena:
I have folded the two patches, and I have added a note on "static methods", including a link to the Python docs.
... which was supposed to imply the question whether you find this helpful.
Hey Darij,
This is primarily a ping message asking Darij if he's happy with this ticket, but...
The @staticmethod
is a python thing, which makes a method accessible via ClassName.foo
and doesn't take a first argument of a type as opposed to @classmethod
. If anything, this should belong with the ClasscallMetaclass
documentation, but that's outside the scope of this ticket.
Best,
Travis
Sorry for letting this slip off my to-do list!
I'm pretty sure that I've understood the important points of the doc now except for some that concern pickling (I don't know how it works and frankly I didn't plan on learning that). I'd feel better if I knew what exactly __classcall__
does and (as I said) how it is distinguished from __new__
and __init__
(and if it calls them, in which order), but that might as well be a different patch.
Is the comment I've added correct and is it relevant?
I've replaced "extension class" by "Cython extension-type class", since they seem to be called either "extension types" or "cdef classes" but never "extension classes".
I'd like more details on this:
104 In addition, it is required that a unique factory instance is provided
105 with a name that allows to find its definition.
I assume this refers to the string parameter, e. g., in F = MyFactory("__main__.F")
; but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just F = MyFactory("F")
? What if it is defined inside a class?
Not sure about this:
And
408 with a factory, it is possible to create the resulting instance by arguments
409 that are different from the key used for caching.
Doesn't that also work with CachedRepresentation if one preprocesses by declaring __classcall_private__
?
@tscrim: thank you, too.
Attachment: trac_14912-comments-dg.patch.gz
almost a review
Description changed:
---
+++
@@ -19,6 +19,7 @@
I think it's UniqueFactory, not UniqueRepresentation, which is easy to use etc.
-__Apply__
+Apply:
-[attachment: trac14912-unique_doc.patch](https://github.com/sagemath/sage-prod/files/10658171/trac14912-unique_doc.patch.gz)
+* [attachment: trac14912-unique_doc.patch](https://github.com/sagemath/sage-prod/files/10658171/trac14912-unique_doc.patch.gz)
+* [attachment: trac_14912-comments-dg.patch](https://github.com/sagemath/sage-prod/files/10658172/trac_14912-comments-dg.patch.gz)
Reviewer: Darij Grinberg
Thanks to Travis's insistence and also his help, I've more or less reviewed this one at last. The only part I wasn't able to understand is:
.. NOTE::
For technical reasons, it is needed that ``__classcall__`` respectively
``__classcall_private__`` are "static methods", i.e., they are callable
objects that do not bind to an instance or class. For example, a
:class:`~sage.misc.cachefunc.cached_function` can be used here, because it
is callable, but does not bind to an instance or class, because it has no
``__get__()`` method. A usual Python function, however, has a
``__get__()`` method and would thus under normal circumstances bind to an
instance or class, and thus the instance or class would be passed to the
function as the first argument. To prevent a callable object from being
bound to the instance or class, one can prepend the ``@staticmethod``
decorator to the definition; see :class:`staticmethod`.
Well, maybe it's just that I have no idea what __get__
methods do.
for the patchbot:
apply trac14912-unique_doc.patch trac_14912-comments-dg.patch
Replying to @darijgr:
Sorry for letting this slip off my to-do list!
Same here...
I'm pretty sure that I've understood the important points of the doc now except for some that concern pickling (I don't know how it works and frankly I didn't plan on learning that). I'd feel better if I knew what exactly
__classcall__
does and (as I said) how it is distinguished from__new__
and__init__
(and if it calls them, in which order), but that might as well be a different patch.
In a nutshell: If C
is a class with ClasscallMetaclass
, then __classcall__
(or __classcall_private__
) is what is executed when you do C(*args, **kwds)
.
Your __classcall__
can do anything. Really. It could return an instance of C
, but it could also return something totally different.
Consequence: __new__
or __init__
are only called if you did not forget to call them inside of your __classcall__
. If you look at what CachedRepresentation.__classcall__
does, you see that indeed __new__
and __init__
are called explicitly.
I'd like more details on this:
104 In addition, it is required that a unique factory instance is provided 105 with a name that allows to find its definition.
I assume this refers to the string parameter, e. g., in
F = MyFactory("__main__.F")
;
Exactly.
but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just
F = MyFactory("F")
?
It is exactly as it is stated: A name is needed that allows to find the factory's definition. Hence, if you put F
into the global name space, then F = MyFactory("F")
is fine. If F
is in a module sage.foo.bar
, then it is F = MyFactory("sage.foo.bar.F")
.
What if it is defined inside a class?
If the class is sage.bar.foo.MyClass
and you want to use a factory for the attribute F
of this class, then I guess it is `F = MyFactory("sage.bar.foo.MyClass.F"). Not tested, though.
Not sure about this:
And 408 with a factory, it is possible to create the resulting instance by arguments 409 that are different from the key used for caching.
Doesn't that also work with CachedRepresentation if one preprocesses by declaring
__classcall_private__
?
Yes. This could be clarified.
Hi Darij,
the review patch looks fine. Disclaimer: I did not test whether the documentation builds fine after the changes introduced by the review patch. IIRC, it did build (and look) fine with my patch.
Replying to @darijgr:
Thanks to Travis's insistence and also his help, I've more or less reviewed this one at last. The only part I wasn't able to understand is:
.. NOTE:: For technical reasons, it is needed that ``__classcall__`` respectively ``__classcall_private__`` are "static methods", i.e., they are callable objects that do not bind to an instance or class. For example, a :class:`~sage.misc.cachefunc.cached_function` can be used here, because it is callable, but does not bind to an instance or class, because it has no ``__get__()`` method. A usual Python function, however, has a ``__get__()`` method and would thus under normal circumstances bind to an instance or class, and thus the instance or class would be passed to the function as the first argument. To prevent a callable object from being bound to the instance or class, one can prepend the ``@staticmethod`` decorator to the definition; see :class:`staticmethod`.
Well, maybe it's just that I have no idea what
__get__
methods do.
OK. Then one should add a pointer to the Python references. Travis, Do you have an idea what page to point to?
Thanks for these clarifications!
Replying to @simon-king-jena:
Replying to @darijgr:
I'd like more details on this:
104 In addition, it is required that a unique factory instance is provided 105 with a name that allows to find its definition.
I assume this refers to the string parameter, e. g., in
F = MyFactory("__main__.F")
;Exactly.
but what kind of path should it contain if it is to be used somewhere in a .py file rather than in a doctest? Just
F = MyFactory("F")
?It is exactly as it is stated: A name is needed that allows to find the factory's definition. Hence, if you put
F
into the global name space, thenF = MyFactory("F")
is fine. IfF
is in a modulesage.foo.bar
, then it isF = MyFactory("sage.foo.bar.F")
.What if it is defined inside a class?
If the class is
sage.bar.foo.MyClass
and you want to use a factory for the attributeF
of this class, then I guess it is `F = MyFactory("sage.bar.foo.MyClass.F"). Not tested, though.
Yeah, these things could very well be explained in the doc itself.
Not sure about this:
And 408 with a factory, it is possible to create the resulting instance by arguments 409 that are different from the key used for caching.
Doesn't that also work with CachedRepresentation if one preprocesses by declaring
__classcall_private__
?Yes. This could be clarified.
Kind-of; on the other hand, the preprocessing probably needs to be idempotent for pickling to work well, so we have a serious restriction...
Replying to @simon-king-jena:
Well, maybe it's just that I have no idea what
__get__
methods do.OK. Then one should add a pointer to the Python references. Travis, Do you have an idea what page to point to?
Hey Simon,
My thought would be this: http://docs.python.org/2/howto/descriptor.html (I showed this to Darij the other day).
Attachment: trac_14912-link_tweaks-ts.patch.gz
Description changed:
---
+++
@@ -23,3 +23,4 @@
* [attachment: trac14912-unique_doc.patch](https://github.com/sagemath/sage-prod/files/10658171/trac14912-unique_doc.patch.gz)
* [attachment: trac_14912-comments-dg.patch](https://github.com/sagemath/sage-prod/files/10658172/trac_14912-comments-dg.patch.gz)
+* [attachment: trac_14912-link_tweaks-ts.patch](https://github.com/sagemath/sage/files/ticket14912/c3378bb3c68e9646dd27edd78f02461e.gz)
I'm prepared to set this to positive review (barring my tweaks being review), but I'd like to know what else is desired to be added and where (if anything).
For patchbot;
Apply: trac14912-unique_doc.patch, trac_14912-comments-dg.patch, trac_14912-link_tweaks-ts.patch
Oops, I've just edited Travis's patch instead of qnewing my own. So glad I soon won't have the hg workflow anymore to trip over...
Anyway, here is my "review patch". It is not so much as a review but a more informative documentation of what exactly to do with UniqueFactory
. Now, "more informative" does not necessarily mean "correct", so I'd much prefer someone more experienced than me to look this through (particularly, but not only, checking my interpretation of other_keys
).
I still don't grok the note about the __classcall__
method being static in the unique representation doc, but I didn't really have the time to RTFM about descriptors. If the note is clear to you, Travis, just say so and it will plug this hole in my review.
for the patchbot:
apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch
Description changed:
---
+++
@@ -23,4 +23,4 @@
* [attachment: trac14912-unique_doc.patch](https://github.com/sagemath/sage-prod/files/10658171/trac14912-unique_doc.patch.gz)
* [attachment: trac_14912-comments-dg.patch](https://github.com/sagemath/sage-prod/files/10658172/trac_14912-comments-dg.patch.gz)
-* [attachment: trac_14912-link_tweaks-ts.patch](https://github.com/sagemath/sage/files/ticket14912/c3378bb3c68e9646dd27edd78f02461e.gz)
+* [attachment: trac_14912-more-ts-dg.patch](https://github.com/sagemath/sage-prod/files/10658174/trac_14912-more-ts-dg.patch.gz)
beware of unvisible characters !
apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch
Replying to @fchapoton:
beware of unvisible characters !
Indeed. The third patch introduces some blank space.
Notice that the objects created by the factory inherit neither from
the factory class, nor from the factory instance (which is not even a
class). In fact, they don't even know about the factory that created
them!
Apart from the blank space after the "!" (AFAIK, there is no blank space before "!", ":" and ";", in contrast to French), I am not happy with the statement that the instances don't even know about the factory that created them:
sage: F = GF(5)
sage: F._factory_data
(<class 'sage.rings.finite_rings.constructor.FiniteFieldFactory'>,
(5, 13, 'beta3'),
(5, None, None, 'modn', '{}', 5, 1, True),
{})
sage: _[0] is GF
True
This actually is how pickling is implemented (namely: By keeping a reference to the creating factory, plus the arguments used to call the factory).
The docbuild gives warnings:
dochtml.log:[structure] /scratch/release/merger/sage-5.13.beta5/local/lib/python2.7/site-packages/sage/structure/unique_representation.py:docstring of sage.structure.unique_representation:127: WARNING: Block quote ends without a blank line; unexpected unindent.
dochtml.log:[structure] /scratch/release/merger/sage-5.13.beta5/local/lib/python2.7/site-packages/sage/structure/unique_representation.py:docstring of sage.structure.unique_representation:129: ERROR: Unexpected indentation.
Fixed false claim about factory-created objects being oblivious of their making (thanks Simon!). Couldn't find the blank space after "!" claimed by Simon, but removed some other useless whitespaces. HOPEFULLY fixed Jeroen's docbuild warnings. Made some mini-fixes to unique_representation, including changing "argument" to "arguments" in the __reduce__
docstring.
Thanks Frederic! (I'd be happier if I new where those symbols are coming from in the first place. Copypasting the names of the attachments from html maybe?)
for the patchbot:
apply trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch
Attachment: trac_14912-more-ts-dg.patch.gz
includes Travis' link tweaks. proofread by Travis
Everything looks good to me, so I'm going to set this to positive review. However feel free to object.
For patchbot:
Apply: trac14912-unique_doc.patch trac_14912-comments-dg.patch trac_14912-more-ts-dg.patch
Changed reviewer from Darij Grinberg to Darij Grinberg, Travis Scrimshaw
Thank you Simon and Darij for your work on this!
sage -t devel/sage/sage/structure/unique_representation.py
**********************************************************************
File "devel/sage/sage/structure/unique_representation.py", line 455, in sage.structure.unique_representation
Failed example:
type(Kp)
Expected:
<class 'sage.rings.finite_rings.finite_field_ext_pari.FiniteField_ext_pari_with_category'>
Got:
<class 'sage.rings.finite_rings.finite_field_pari_ffelt.FiniteField_pari_ffelt_with_category'>
**********************************************************************
File "devel/sage/sage/structure/unique_representation.py", line 814, in sage.structure.unique_representation.CachedRepresentation
Failed example:
n == id(SymmetricGroup(17))
Expected:
False
Got:
True
**********************************************************************
Dependencies: #14888
Sadly not my love, unless someone teaches me enough of the stuff that I understand it myself.
http://www.sagemath.org/doc/reference/structure/sage/structure/unique_representation.html
One thing I don't understand is:
What is meant by "accept as its own input"?
Something that possibly should be added is some explanation of the difference between
__classcall__
and__classcall_private__
. Maybe it doesn't belong there, but then again I have no idea where it belongs as it seems not documented at all...Also there seems to be a typo:
I think it's UniqueFactory, not UniqueRepresentation, which is easy to use etc.
Apply:
Depends on #14888
CC: @simon-king-jena @tscrim
Component: documentation
Keywords: structure
Author: Simon King
Branch: u/vbraun/unique_rep_tutorial
Reviewer: Darij Grinberg, Travis Scrimshaw
Issue created by migration from https://trac.sagemath.org/ticket/14912