nthiery commented 12 years ago

Teaser
------

Python handles multiple inheritance by computing, for each class,
a linear extension of all its super classes (the Method Resolution
Order, MRO). The MRO is calculated recursively from local
information (the *ordered* list of the direct super classes), with
the so-called ``C3`` algorithm. This algorithm can fail if the local
information is not consistent; worse, there exist hierarchies of
classes with provably no consistent local information.

For large hierarchies of classes, like those derived from categories in
Sage, maintaining consistent local information by hand does not scale
and leads to unpredictable ``C3`` failures (the dreaded "could not
find a consistent method resolution order"); a maintenance nightmare.
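
To make the failure concrete, here is a minimal sketch in plain Python 3.
The class names are made up for illustration and are unrelated to Sage's
categories. ``X`` and ``Y`` list the same two bases in opposite orders, so
the local information is inconsistent and no MRO exists for ``Z``:

```python
class A: pass
class B: pass

class X(A, B): pass   # local order: A before B
class Y(B, A): pass   # local order: B before A

try:
    # C3 would have to place A before B (from X) and B before A (from Y).
    class Z(X, Y): pass
    error = None
except TypeError as exc:
    error = exc

print([c.__name__ for c in X.__mro__])
print(error)
```

Each of ``X`` and ``Y`` is fine on its own; the failure only shows up when a
later class combines them, which is why such errors are hard to predict in a
large hierarchy.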

This patch implements a final solution to this problem. Namely, it
allows for building automatically the local information from the bare
class hierarchy in such a way that guarantees that the ``C3``
algorithm will never fail.

Err, but you said that this was provably impossible? Well, not if
one relaxes a bit the hypotheses, but that's not something one
would want to do by hand :-)

Details
-------

Please see the extensive documentation at the top of the file
sage/misc/c3_controlled.py in the attached patch.

Content of the patch
--------------------

- Implement controlled C3 in sage.misc.c3_controlled.

- Implement a total order in Category, and have Category use
  C3 controlled by this order instead of plain C3.

- Tweak the current total order to minimize changes in the order of
  categories.

- Update doctests w.r.t. remaining changes in the order of
  categories.

- Remove a couple of doctests displaying "all_super_categories" that
  provided neither useful information to the user nor interesting
  tests, yet needed to be constantly updated; nothing but a good
  source of conflicts.

- Rewrite doctests in sage.misc.c3 to be independent of categories,
  since those no longer use this implementation of C3.

- Resolve some ambiguities to make the code more independent of the
  order of categories. In particular, FiniteCoxeterGroups prefer
  __iter__ and some_elements from CoxeterGroups to that of
  FiniteGroups.

- Update the section in the primer about order of categories.

- Provide further tools in ``sage.misc.c3_controlled`` to
  experiment with C3 and friends.

- Extract category_sample from category_graph

Credits
-------

This patch is a followup to a study of the C3 algorithm together with
Florent Hivert, and to discussions with Simon King and his
implementation of C3.

Apply

attachment: trac_13589-categories-c3_under_control-nt.patch

Depends on: #12894, #12876, #11935, #12895, #10193

CC: @sagetrac-sage-combinat @simon-king-jena

Component: categories

Keywords: method resolution order, C3

Author: Nicolas M. Thiéry, Simon King

Reviewer: Simon King, Florent Hivert

Merged: sage-5.12.beta0

Issue created by migration from https://trac.sagemath.org/ticket/13589

nthiery commented 12 years ago

Description changed:

--- 
+++ 
@@ -20,10 +20,24 @@
 one relaxes a bit the hypotheses, but that's not something one
 would want to do by hand :-)

+Details

+---

-Details: please see the extensive documentation at the top of the file sage/misc/c3_controlled.py in the attached patch.
+Please see the extensive documentation at the top of the file sage/misc/c3_controlled.py in the attached patch.

-Status: the patch is functional, but still breaks a couple things here and there. In particular, some doctests need to be updated w.r.t. some changes in MRO and super_categories output. I will fix this as soon as the principle will be accepted.
+Status

-Credits: this patch is a followup to a study of the C3 algorithm together with Florent Hivert, and to discussions with Simon King and his implementation of C3.
+---
+
+The patch is functional, but still breaks a couple things here and there. In particular, some doctests need to be updated w.r.t. some changes in MRO and super_categories output. I will fix this as soon as the principle will be accepted.
+
+The patch is also available on the Sage-Combinat queue (guarded out by
+default by +functorial_construction).
+
+Credits
+
+---
+
+This patch is a followup to a study of the C3 algorithm together with Florent Hivert, and to discussions with Simon King and his implementation of C3.
+

nthiery commented 12 years ago

Changed dependencies from #13501 to #13501, #12894

3f8450e1-87bf-41c6-ab53-29a0552debb3 commented 11 years ago

comment:3

Hello,

Could you update your patch? Several hunks fail to apply on recent Sage versions (I work without combinat to use NCSF...)

patching file sage/categories/category.py
Hunk #6 FAILED at 1090
Hunk #7 FAILED at 1200
Hunk #9 FAILED at 2003
Hunk #10 FAILED at 2084
4 out of 11 hunks FAILED -- saving rejects to file sage/categories/category.py.rej
abort: patch failed to apply

Furthermore, it would be useful to get this into Sage quickly. The graded Hopf algebras seem to be the last bastion before recurrent MRO errors.

Thanks, Jean-Baptiste

simon-king-jena commented 11 years ago

comment:4

I currently have

trac_14159_weak_value_triple_dict.patch
trac_14159_use_cdef_get.patch
trac_13184_sage_5.9.beta.patch
trac_14287-rebased.patch
trac_14217_base_functionality.patch
trac_12876_category_abstract_classes_for_hom.patch
trac11935_weak_pickling_by_construction-nt.patch
trac_11935-weak_pickling_by_construction-review-ts.patch
trac_14249-coercion_without_an_element.patch
trac_12894-classcall_setter-nt.patch
trac_12895-subcategory-methods-nt.patch
trac_12895-review.patch

on top of sage-5.9.rc0 (these all have positive review or are even merged in sage-5.10.beta), and then the patch fails to apply like this:

adding trac_13589-categories-c3_under_control-nt.patch to series file
applying trac_13589-categories-c3_under_control-nt.patch
patching file sage/categories/category.py
Hunk #1 FAILED at 94
Hunk #6 succeeded at 1105 with fuzz 1 (offset 16 lines).
Hunk #7 FAILED at 1289
Hunk #9 succeeded at 2152 with fuzz 1 (offset 58 lines).
2 out of 11 hunks FAILED -- saving rejects to file sage/categories/category.py.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh trac_13589-categories-c3_under_control-nt.patch

So, there is some improvement with respect to what Jean-Baptiste reports. Nevertheless, it seems that dependencies should be stated, and probably the patch needs rebasing.

simon-king-jena commented 11 years ago

comment:5

In particular, the patch uses some CategoryWithAxiom, which is not defined here or in the given dependencies.

simon-king-jena commented 11 years ago

comment:6

I have not been able to find CategoryWithAxiom or category with axiom on trac.

nthiery commented 11 years ago

comment:7

Yes, this patch still needs a bit of work. It should be ready Tuesday or so. You can have a look at the text in the patch where I describe the purpose and principle of the patch, but don't waste time with a more detailed review at this point!

Thanks!

simon-king-jena commented 11 years ago

comment:8

Hi Nicolas,

OK, I thought this was next, after #11935.

Best regards, Simon

nthiery commented 11 years ago

comment:9

#12895 was next! And now I have to run behind :-)

Thanks for all your review work! I'll pile up some stuff for you soon and let you know :-)

nthiery commented 11 years ago

Changed dependencies from #13501, #12894 to #13501, #12894, #12895

nthiery commented 11 years ago

comment:11

Hi Simon,

The updated patch should be roughly close to completion. Most if not all tests should pass (they did when I was working on the patch in git; I may have screwed up my export back to mercurial, and/or some dependencies).

I still need to scan once again through the patch to check that everything is 100% doctested, and I also want to reread the explanations in sage.misc.c3_controlled. I'll do that tomorrow. But I think you can start reviewing it in particular checking whether the whole logic makes sense to you. Let me know if/when you start working on it so that we avoid conflicts.

Thanks! Nicolas

nthiery commented 11 years ago

Changed dependencies from #13501, #12894, #12895 to #13501, #12894, #12876, #11935, #12895

nthiery commented 11 years ago

comment:13

For info: I am running the tests now and will report when I wake up.

nthiery commented 11 years ago

comment:14

All long tests passed on my machine with 5.10beta4 and the following patches applied:

trac_14612-gyw_test_speedup-ts.patch
trac_14574-folded.patch
trac_13735_fix_repr_lincomb.patch
trac_14123-binary-trees-maps-rebase-cs.patch
trac_12876_category_abstract_classes_for_hom.patch
trac11935_weak_pickling_by_construction-nt.patch
trac_11935-weak_pickling_by_construction-review-ts.patch
trac_12895-subcategory-methods-nt.patch
trac_13589-categories-c3_under_control-nt.patch

nthiery commented 11 years ago

comment:15

Ok, patchbot is happy too except for coverage in c3_controlled, doctest continuations, startup time and startup modules. The first two will be easy fixes. I'll investigate the other two this morning.

nthiery commented 11 years ago

Description changed:

--- 
+++ 
@@ -24,20 +24,48 @@

 ---

-Please see the extensive documentation at the top of the file sage/misc/c3_controlled.py in the attached patch.
+Please see the extensive documentation at the top of the file
+sage/misc/c3_controlled.py in the attached patch.
+
+Content of the patch
+
+---
+
+- Implement controlled C3 in sage.misc.c3_controlled.
+
+- Implement a total order in Category, and have Category use
+  c3 controlled by this order instead of plain c3.
+
+- Tweak the current total order to minimize changes in the order of
+  categories.
+
+- Update doctests w.r.t. remaining changes of in the order of
+  categories.
+
+- Rewrite doctests in sage.misc.c3 to be independent of categories
+  since those do not use anymore this implementation of C3.
+
+- Resolve some ambiguities to make the code more independent of the
+  order of categories. In particular, FiniteCoxeterGroups prefer
+  `__iter__` and some_elements from CoxeterGroups to that of
+  FiniteGroups.
+
+- Update the section in the primer about order of categories.
+
+- Provide further tools in :mod:`sage.misc.c3_controlled` to
+  experiment with C3 and friends.

 Status

 ---

-The patch is functional, but still breaks a couple things here and there. In particular, some doctests need to be updated w.r.t. some changes in MRO and super_categories output. I will fix this as soon as the principle will be accepted.
-
-The patch is also available on the Sage-Combinat queue (guarded out by
-default by +functorial_construction).
+Completely ready for review.

 Credits

 ---

-This patch is a followup to a study of the C3 algorithm together with Florent Hivert, and to discussions with Simon King and his implementation of C3.
+This patch is a followup to a study of the C3 algorithm together with
+Florent Hivert, and to discussions with Simon King and his
+implementation of C3.

nthiery commented 11 years ago

comment:17

Hi Simon,

The patch is now completely ready for review:

- I fixed the coverage and continuation issues.
- I cythonized sage.misc.c3_controlled; hopefully this will fix the startup time regression.
- I went through the whole module, improved the doc and threw away some cruft.
- I don't know why the bot complains about the nonexistent modules sage.categories.inspect and itertools. I guess it's just confused. As for sage.misc.c3_controlled, well yes, it's new :-)
- I am running all long tests, and will report soon.

Thanks for your upcoming review!

Off to work on the main functorial construction patch!

                      Nicolas

nthiery commented 11 years ago

comment:18

Oops, I had forgotten a little improvement I wanted to do in the implementation of the total order. It looks a tiny bit less hacky now and could be a tiny bit faster.

All tests pass. Running long tests now.

nthiery commented 11 years ago

comment:19

Gosh, I had fumbled my export and uploaded the wrong patch. Fixed!

nthiery commented 11 years ago

Description changed:

--- 
+++ 
@@ -1,3 +1,8 @@
+
+```rst
+Teaser
+------
+
 Python handles multiple inheritance by computing, for each class,
 a linear extension of all its super classes (the Method Resolution
 Order, MRO). The MRO is calculated recursively from local
@@ -21,20 +26,18 @@
 would want to do by hand :-)

 Details
-
----
+-------

 Please see the extensive documentation at the top of the file
 sage/misc/c3_controlled.py in the attached patch.

 Content of the patch
-
----
+--------------------

 - Implement controlled C3 in sage.misc.c3_controlled.

 - Implement a total order in Category, and have Category use
-  c3 controlled by this order instead of plain c3.
+  C3 controlled by this order instead of plain C3.

 - Tweak the current total order to minimize changes in the order of
   categories.
@@ -47,25 +50,19 @@

 - Resolve some ambiguities to make the code more independent of the
   order of categories. In particular, FiniteCoxeterGroups prefer
-  `__iter__` and some_elements from CoxeterGroups to that of
+  __iter__ and some_elements from CoxeterGroups to that of
   FiniteGroups.

 - Update the section in the primer about order of categories.

-- Provide further tools in :mod:`sage.misc.c3_controlled` to
+- Provide further tools in ``sage.misc.c3_controlled`` to
   experiment with C3 and friends.

-Status
-
----
-
-Completely ready for review.
-
 Credits
-
----
+-------

 This patch is a followup to a study of the C3 algorithm together with
 Florent Hivert, and to discussions with Simon King and his
 implementation of C3.
+```

nthiery commented 11 years ago

comment:21

Shoot, the Cythonization has broken one long test in sage.misc.c3_controlled. I am investigating this; the rest can be reviewed in the meantime.

The cythonization has not improved the startup time. It's not yet clear to me what can be causing the slower startup time. To be investigated ...

simon-king-jena commented 11 years ago

comment:22

Some random remarks:

Why is Category._cmp_key a cached method and not a lazy attribute?
Why is CategoryWithParameters._cmp_key a method and not a lazy attribute or at least a cached method?
Why has this example
```
sage: Groups().example().algebra(ZZ).categories()
...
```
been completely removed from sage/categories/groups.py? Similarly sage: Modules(Integers(9)).all_super_categories() from sage/categories/modules.py etc.

In the documentation of primer.py:

However this must be considered as an *implementation detail*: `C_1` 
and `C_2` are incomparable categories, then the order in which they 
appear must be mathematically irrelevant:

I guess there is "If" missing after the first colon.

In sage/misc/c3_controlled.pyx, line 123: Should be "classes that an object inherits from.", not "classes that an object inherit from."
... line 139, "However, this has several inconvenients:" I guess this should be "However, this has several drawbacks" or "However, this is inconvenient in several regards" or so.

Do I understand correctly: As you outline in lines 148-166, the creation of classes will become slower (O(n^3) instead of O(n^2) for getting the MRO, etc) if one explicitly puts the desired MRO into a long list of bases. This would certainly be a reason for an increased startup time and other regressions. Therefore, in a first step, you compute short lists of bases that ensure that the C3 algorithm still reconstructs the intended MRO. However, doesn't this additional step (namely, computing the lists of bases) take some time as well, and thus affect the startup time?

I still have to read the actual code (and not just the documentation). One question, though, which is in the spirit of #11935. In sage.categories.category, you have

        (result, bases) = C3_sorted_merge([cat._all_super_categories 
                                           for cat in self._super_categories] + 
                                          [self._super_categories], 
                                          key = attrcall('_cmp_key')) 
        self._super_categories_for_classes = bases 
        return [self] + result

I guess in many cases the result will be the same up to the base rings. Shouldn't we think of a way to avoid duplication of work? I could imagine that here is the reason for the startup time regression.

nthiery commented 11 years ago

comment:23

Replying to @simon-king-jena:

Some random remarks:

Why is Category._cmp_key a cached method and not a lazy attribute?

Why is CategoryWithParameters._cmp_key a method and not a lazy attribute or at least a cached method?

To be discussed. If I remember correctly, it's slightly easier to use super in cached methods than attributes, but I guess both would work.

Why has this example
sage: Groups().example().algebra(ZZ).categories()
...
been completely removed from sage/categories/groups.py? Similarly sage: Modules(Integers(9)).all_super_categories() from sage/categories/modules.py etc.

Because they bring neither useful tests nor useful information, and need to be updated whenever the category order changes, which is a good source of conflicts.

In the documentation of primer.py:
However this must be considered as an *implementation detail*: `C_1` 
and `C_2` are incomparable categories, then the order in which they 
appear must be mathematically irrelevant:
I guess there is "If" missing after the first colon.
In sage/misc/c3_controlled.pyx, line 123: Should be "classes that an object inherits from.", not "classes that an object inherit from."

... line 139, "However, this has several inconvenients:" I guess this should be "However, this has several drawbacks" or "However, this is inconvenient in several regards" or so.

Thanks, I'll fix that! Probably on Monday, together with the fix for the failing long test.

Do I understand correctly: As you outline in lines 148-166, the creation of classes will become slower (O(n^3) instead of O(n^2) for getting the MRO, etc) if one explicitly puts the desired MRO into a long list of bases. This would certainly be a reason for an increased startup time and other regressions. Therefore, in a first step, you compute short lists of bases that ensure that the C3 algorithm still reconstructs the intended MRO. However, doesn't this additional step (namely, computing the lists of bases) take some time as well, and thus affect the startup time?

Please read further :-) That would be O(n^3) if one was brute forcing the complete mro in the list of bases. Luckily it's more clever than that! New bases are only added when absolutely necessary; in fact, in the current situation it turns out that no base is actually added even for non trivial categories like Fields or GradedHopfAlgebrasWithBasis.
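
For readers following along, the brute-force variant mentioned above can be sketched in plain Python 3. This is only an illustration under stated assumptions, not the optimized algorithm of the patch: `c3_merge`, `mro` and `all_supers` are hypothetical helpers, the hierarchy is a toy one, and integers serve as both class names and comparison keys (larger means more specific).

```python
def c3_merge(lists):
    """Plain C3 merge of several lists; raises ValueError on inconsistency."""
    lists = [list(l) for l in lists if l]
    result = []
    while lists:
        for l in lists:
            head = l[0]
            # A head is acceptable if it appears in no tail.
            if not any(head in other[1:] for other in lists):
                break
        else:
            raise ValueError("could not find a consistent method resolution order")
        result.append(head)
        lists = [[x for x in l if x != head] for l in lists]
        lists = [l for l in lists if l]
    return result

def mro(node, bases):
    """MRO of node, where bases maps each node to its ordered list of bases."""
    direct = bases[node]
    return [node] + c3_merge([mro(b, bases) for b in direct] + [list(direct)])

def all_supers(node, bases):
    """Set of all strict super classes of node."""
    seen, stack = set(), list(bases[node])
    while stack:
        b = stack.pop()
        if b not in seen:
            seen.add(b)
            stack.extend(bases[b])
    return seen

# Toy hierarchy: direct bases only, sorted by decreasing key.
direct = {1: [], 2: [], 4: [2], 5: [1], 6: [5, 4]}
# Brute force: list *all* super classes as bases, sorted by decreasing key.
brute = {n: sorted(all_supers(n, direct), reverse=True) for n in direct}

plain_mro = mro(6, direct)       # plain C3: not sorted by key
controlled_mro = mro(6, brute)   # forced to follow the total order
print(plain_mro, controlled_mro)
```

With the direct bases alone, plain C3 succeeds here but places 1 before 2, so the MRO does not follow the total order; a later class combining this one with a class that orders 2 before 1 would then fail. Listing every super class forces the sorted MRO, at the cost of the long base lists discussed above.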

I still have to read the actual code (and not just the documentation). One question, though, which is in the spirit of #11935. In sage.categories.category, you have
        (result, bases) = C3_sorted_merge([cat._all_super_categories 
                                           for cat in self._super_categories] + 
                                          [self._super_categories], 
                                          key = attrcall('_cmp_key')) 
        self._super_categories_for_classes = bases 
        return [self] + result 
I guess in many cases the result will be the same up to the base rings. Shouldn't we think of a way to avoid duplication of work? I could imagine that here is the reason for the startup time regression.

This code is only called if all_super_categories is called. And by #11935 this should happen only once for all base rings in the same category (unless one calls all_super_categories explicitly). I have not kept the timings at hand, but I did not see a difference in our usual elliptic curves benchmark.

Cheers, Nicolas

simon-king-jena commented 11 years ago

comment:24

For the record, the only failing example is this:

File "devel/sage/sage/misc/c3_controlled.pyx", line 266, in sage.misc.c3_controlled
Failed example:
    for l in L:                                          # long time
        x = HierarchyElement(10, l.to_poset())
        assert x.mro            == list(P)
        assert x.mro_controlled == list(P)
        assert x.all_bases_len() == 15
        assert x.all_bases_controlled_len() == 19
        try:
            x.mro_standard
            assert False
        except:
            pass
Exception raised:
    Traceback (most recent call last):
      File "/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 466, in _run
        self.execute(example, compiled, test.globs)
      File "/home/simon/SAGE/prerelease/sage-5.9.rc0/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 825, in execute
        exec compiled in globs
      File "<doctest sage.misc.c3_controlled[35]>", line 6, in <module>
        assert x.all_bases_controlled_len() == Integer(19)
    AssertionError

Indeed, on the command line, I get

sage: for l in L:
....:     print "l =",l
....:     x = HierarchyElement(10, l.to_poset())
....:     print x.all_bases_controlled_len()
....:     
l = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
19
l = [10, 9, 8, 7, 6, 5, 4, 3, 1, 2]
19
l = [10, 9, 8, 7, 6, 5, 4, 2, 3, 1]
19
l = [10, 9, 8, 7, 6, 5, 4, 2, 1, 3]
19
l = [10, 9, 8, 7, 6, 5, 4, 1, 3, 2]
19
l = [10, 9, 8, 7, 6, 5, 4, 1, 2, 3]
19
l = [10, 9, 8, 7, 6, 5, 3, 4, 2, 1]
18
...
l = [10, 7, 9, 8, 5, 6, 4, 1, 3, 2]
20
l = [10, 7, 9, 8, 5, 6, 4, 1, 2, 3]
20
l = [10, 7, 9, 8, 5, 6, 3, 4, 2, 1]
18
l = [10, 7, 9, 8, 5, 6, 3, 4, 1, 2]
18
...
l = [10, 7, 9, 8, 4, 5, 6, 1, 3, 2]
20
l = [10, 7, 9, 8, 4, 5, 6, 1, 2, 3]
20
l = [10, 7, 9, 8, 4, 5, 1, 6, 3, 2]
17
l = [10, 7, 9, 8, 4, 5, 1, 6, 2, 3]
17
l = [10, 7, 9, 6, 8, 5, 4, 3, 2, 1]
19
...
l = [10, 7, 9, 6, 8, 5, 4, 1, 2, 3]
19
l = [10, 7, 9, 6, 8, 5, 3, 4, 2, 1]
16
l = [10, 7, 9, 6, 8, 5, 3, 4, 1, 2]
16
l = [10, 7, 9, 6, 8, 4, 5, 3, 2, 1]
18
l = [10, 7, 9, 6, 8, 4, 5, 3, 1, 2]
18
l = [10, 7, 9, 6, 8, 4, 5, 2, 3, 1]
18
l = [10, 7, 9, 6, 8, 4, 5, 2, 1, 3]
18
l = [10, 7, 9, 6, 8, 4, 5, 1, 3, 2]
18
l = [10, 7, 9, 6, 8, 4, 5, 1, 2, 3]
18
l = [10, 7, 9, 6, 8, 4, 2, 5, 3, 1]
17
l = [10, 7, 9, 6, 8, 4, 2, 5, 1, 3]
17
l = [10, 7, 9, 6, 4, 8, 5, 3, 2, 1]
19
l = [10, 7, 9, 6, 4, 8, 5, 3, 1, 2]
19
...
l = [10, 7, 4, 8, 9, 6, 5, 1, 2, 3]
19
l = [10, 7, 4, 8, 9, 6, 2, 5, 3, 1]
16
l = [10, 7, 4, 8, 9, 6, 2, 5, 1, 3]
16
l = [10, 7, 4, 8, 9, 5, 6, 3, 2, 1]
18
l = [10, 7, 4, 8, 9, 5, 6, 3, 1, 2]
18
...
l = [10, 7, 4, 8, 5, 9, 6, 1, 2, 3]
17
l = [10, 7, 4, 8, 5, 9, 1, 6, 3, 2]
16
l = [10, 7, 4, 8, 5, 9, 1, 6, 2, 3]
16
sage:

I am not surprised that some posets are easier to control than others. Why do you expect that all_bases_controlled_len is the same in all cases?

nthiery commented 11 years ago

comment:25

Replying to @simon-king-jena:

I am not surprised that some posets are easier to control than others. Why do you expect that all_bases_controlled_len is the same in all cases?

The fact that the number of bases to be added does not depend on the linear extension is certainly specific to this poset. But before cythonisation this used to be the case. So I need to investigate what went wrong!

nthiery commented 11 years ago

comment:26

Hi Simon!

It turns out that I had just fooled myself because of a typo in the test. Even for this example, the number of bases to be added does depend on the linear extension.

So all is good, the python/cython implementations agree.

The updated patch:

- fixes the typos you mentioned;
- reworks the text a bit to make it clearer that the code implements an optimized "add bases" trick which does not have the drawbacks of the brute force approach;
- fixes the incorrect doctest, and gathers some stats on the number of bases to be added for each linear extension;
- mentions the removed doctests, and the rationale for removing them, in the patch header.

It remains to decide between a lazy attribute and a cached method for _cmp_key. Any ideas on how to investigate the startup time are welcome.

Cheers, Nicolas

nthiery commented 11 years ago

Description changed:

--- 
+++ 
@@ -45,6 +45,8 @@
 - Update doctests w.r.t. remaining changes of in the order of
   categories.

+- Remove a coupld doctests displaying "all_super_categories" that did not bring useful information to the user nor intesting test, yet needed to be constantly updated; nothing but a good source of conflicts.
+
 - Rewrite doctests in sage.misc.c3 to be independent of categories
   since those do not use anymore this implementation of C3.

nthiery commented 11 years ago

comment:28

Hi Simon,

While playing with larger hierarchies of classes for the functorial construction patch, I stumbled on one execution path which was not treated correctly. I'll post an updated patch shortly.

nthiery commented 11 years ago

Attachment: c3-fix-nt.patch.gz

nthiery commented 11 years ago

comment:29

Replying to @nthiery:

While playing with larger hierarchies of classes for the functorial construction patch, I stumbled on one execution path which was not treated correctly. I'll post an updated patch shortly.

Ok, the updated patch includes the (hopefully) now correct implementation together with relevant tests. On this occasion, I declared a couple more variables for Cython and added some debugging code (commented out by default).

You can look at attachment: c3-fix-nt.patch if you just want to see the changes.

I guess last time I wrote such a long function was when I played around with F5! It would be a good candidate for a computer assisted proof of correctness or for automatic test generation.

nthiery commented 11 years ago

comment:30

I forgot to mention: all long tests pass on my machine.

simon-king-jena commented 11 years ago

comment:31

Replying to @nthiery:

I guess last time I wrote such a long function was when I played around with F5! It would be a good candidate for a computer assisted proof of correctness or for automatic test generation.

Do we have those things (I mean "computer assisted correctness proofs", not "F5") in Sage?

I am travelling this week. So, I will probably not be able to finish the review right now.

nthiery commented 11 years ago

comment:32

Do we have those things (I mean "computer assisted correctness proofs", not "F5") in Sage?

Nope. But we have experts in Orsay in the office next to ours :-)

I am travelling this week. So, I will probably not be able to finish the review right now.

Ok.

nthiery commented 11 years ago

Changed dependencies from #13501, #12894, #12876, #11935, #12895 to #12894, #12876, #11935, #12895

nthiery commented 11 years ago

comment:34

Apply attachment: trac_13589-categories-c3_under_control-nt.patch

nthiery commented 11 years ago

comment:35

Attachment: trac_13589-categories-c3_under_control-category_sample-nt.patch.gz

The updated patch includes a method category_sample which saves a couple lines and which I needed anyway later on. attachment: trac_13589-categories-c3_under_control-category_sample-nt.patch shows the diff.

nthiery commented 11 years ago

comment:36

Arr, I can't wait until we have a more semantic way to specify which patches to apply; this is way too prone to trivial syntax errors ...

Apply: attachment: trac_13589-categories-c3_under_control-nt.patch

nthiery commented 11 years ago

Description changed:

--- 
+++ 
@@ -60,6 +60,10 @@
 - Provide further tools in ``sage.misc.c3_controlled`` to
   experiment with C3 and friends.

+- Extract category_sample from category_graph
+
+Apply: [attachment:trac_13589-categories-c3_under_control-nt.patch]
+
 Credits
 -------

nthiery commented 11 years ago

Description changed:

--- 
+++ 
@@ -62,7 +62,7 @@

 - Extract category_sample from category_graph

-Apply: [attachment:trac_13589-categories-c3_under_control-nt.patch]
+Apply: trac_13589-categories-c3_under_control-nt.patch

 Credits
 -------

simon-king-jena commented 11 years ago

Changed dependencies from #12894, #12876, #11935, #12895 to #12894, #12876, #11935, #12895, #10193

simon-king-jena commented 11 years ago

comment:40

Apply: trac_13589-categories-c3_under_control-nt.patch

simon-king-jena commented 11 years ago

comment:41

Nicolas and I just discussed: _cmp_key should at least be a cached method, not a plain method, so that it plays nicely with super(...). However, we may try whether lazy attributes would work, because Nicolas calls super(...) only to compute the value, but the value that should eventually be used does not depend on the class.

simon-king-jena commented 11 years ago

comment:42

This is a minimal example of what Nicolas wants to do:

sage: class A(object):
....:     @lazy_attribute
....:     def x(self):
....:         print "computing the attribute with A"
....:         return 1
....:
sage: class B(A):
....:     @lazy_attribute
....:     def x(self):
....:         print "this is lazy attribute with B"
....:         r = super(B, self).x
....:         self.y = r
....:         return r
....:
sage: b = B()
sage: b.x
this is lazy attribute with B
computing the attribute with A
1
sage: b.y
1

So, it seems to work with a lazy attribute, and this will be much faster than calling a method (repeatedly). Trying to change it now.
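
The same pattern can be sketched outside Sage in plain Python 3 (3.8 or later). This is an assumption-laden illustration: `functools.cached_property` stands in for Sage's `lazy_attribute`, which is similar in spirit but not identical, and the `calls` list is only there to record which bodies ran.

```python
from functools import cached_property

calls = []  # records which property bodies were executed, in order

class A:
    @cached_property
    def x(self):
        calls.append("A.x")
        return 1

class B(A):
    @cached_property
    def x(self):
        calls.append("B.x")
        r = super().x   # delegate the computation to A
        self.y = r      # side effect, as in the session above
        return r

b = B()
first = b.x    # runs B.x, which runs A.x
second = b.x   # served from the instance dictionary, nothing is recomputed
print(first, second, b.y, calls)
```

After the first access the value lives in the instance's `__dict__`, so later accesses are ordinary attribute lookups with no function call, which is the speed argument made above.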

simon-king-jena commented 11 years ago

For speed reasons, make _cmp_key a lazy attribute, not a (cached) method

simon-king-jena commented 11 years ago

Description changed:

--- 
+++ 
@@ -62,7 +62,11 @@

 - Extract category_sample from category_graph

-Apply: trac_13589-categories-c3_under_control-nt.patch
+Apply
+-----
+
+- trac_13589-categories-c3_under_control-nt.patch
+- trac13589_cmp_key_attribute.patch

 Credits
 -------

simon-king-jena commented 11 years ago

comment:43

Attachment: trac13589_cmp_key_attribute.patch.gz

Apply: trac_13589-categories-c3_under_control-nt.patch trac13589_cmp_key_attribute.patch

simon-king-jena commented 11 years ago

comment:44

To me, the code looks fine. Patchbot does not report any errors. However, it reports a significant increase of 2.5% of startup time.

How can this be analysed?

simon-king-jena commented 11 years ago

comment:45

Some findings:

- C3_sorted_merge was called 121 times with the current patch during startup.
- C3_sorted_merge is called on different Groupoids, which should not happen, because all groupoids have the same super categories. Solution: make it a CategoryWithParameters. Then, C3_sorted_merge is only called 93 times during startup.
- There are some optimizations possible in C3_sorted_merge:

sage: L1 = Fields().all_super_categories()
sage: L2 = Algebras(QQ).all_super_categories()
sage: cython("""
def test1(list L):
    cdef list out = L[::-1]
def test2(list L):
    cdef list out = list(reversed(L))
""")
....: 
sage: %timeit test1(L1)
1000000 loops, best of 3: 541 ns per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 2.25 us per loop

and

sage: cython("""
....: def test1(list L):
....:     cdef set S = set(x for x in L)
....: def test2(list L):
....:     cdef set S = set([x for x in L])
....: """)
....: 
sage: %timeit test1(L1)
100000 loops, best of 3: 3.91 us per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 3.99 us per loop
sage: %timeit test1(L1)
100000 loops, best of 3: 3.89 us per loop
sage: %timeit test1(L1)
100000 loops, best of 3: 3.89 us per loop
sage: %timeit test2(L1)
100000 loops, best of 3: 4.06 us per loop

In both examples above, test2 is what is currently done in C3_sorted_merge, and test1 is apparently what should be done. I am preparing a patch now.
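
For anyone who wants to repeat this kind of comparison without Sage or Cython, here is a rough sketch using the standard `timeit` module. The function names and the list `L` are made up for illustration, and pure Python timings need not match the Cython figures above, so no expected numbers are given.

```python
import timeit

L = list(range(100))  # stand-in for a list of super categories

def reverse_slice(lst):
    return lst[::-1]

def reverse_builtin(lst):
    return list(reversed(lst))

def set_from_genexpr(lst):
    return set(x for x in lst)

def set_from_listcomp(lst):
    return set([x for x in lst])

# Best of 3 runs of 10000 calls each, per candidate.
timings = {
    f.__name__: min(timeit.repeat(lambda: f(L), number=10000, repeat=3))
    for f in (reverse_slice, reverse_builtin, set_from_genexpr, set_from_listcomp)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per 10000 calls")
```

Taking the minimum over several repeats reduces noise from other processes, in the same spirit as the "best of 3" reported by `%timeit`.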

simon-king-jena commented 11 years ago

comment:46

Another observation: key(O) is called repeatedly, even though its result is already stored in O_key; the stored value O_key should be used consistently instead.
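
A toy illustration of the point, in plain Python 3. The functions below are hypothetical and are not taken from C3_sorted_merge; only the names `key`, `O` and `O_key` echo the observation above. A counter shows how many key evaluations are saved by reusing the stored value.

```python
key_calls = 0

def key(obj):
    """Comparison key; counts how often it is evaluated."""
    global key_calls
    key_calls += 1
    return obj

def smaller_heads_repeated(O, heads):
    # key(O) is re-evaluated for every element of heads.
    return [h for h in heads if key(h) < key(O)]

def smaller_heads_cached(O, heads):
    O_key = key(O)  # evaluated once, then reused
    return [h for h in heads if key(h) < O_key]

key_calls = 0
repeated = smaller_heads_repeated(5, [1, 7, 3])
calls_repeated = key_calls

key_calls = 0
cached = smaller_heads_cached(5, [1, 7, 3])
calls_cached = key_calls

print(repeated, calls_repeated, cached, calls_cached)
```

The saving is small per call, but it adds up when the key is an attribute lookup performed inside a loop that runs many times during startup.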

simon-king-jena commented 11 years ago

Description changed:

--- 
+++ 
@@ -67,6 +67,7 @@

 - trac_13589-categories-c3_under_control-nt.patch
 - trac13589_cmp_key_attribute.patch
+- trac13589_improve_startuptime.patch

 Credits
 -------

Controlling C3 to solve once for all the Method Resolution Order issues for category classes #13589
