sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.47k stars 487 forks source link

rewrite conway polynomial spkg and code in Sage library to not use ZODB #12205

Closed williamstein closed 11 years ago

williamstein commented 12 years ago

This is necessary so we can dump the ZODB spkg from Sage.


Installation Instructions

Depends on #13896 Depends on #13963

CC: @kini

Component: packages: huge

Author: R. Andrew Ohana

Reviewer: François Bissey

Merged: sage-5.7.beta0

Issue created by migration from https://trac.sagemath.org/ticket/12205

williamstein commented 12 years ago
comment:2

I did some work on this on the airplane, and better post a link before I forget about it or loose it: http://wstein.org/patches/conway_polynomials-0.3.spkg

Basically, the database is tiny so it can easily and very efficiently just be stored as a standard sobj. Using ZODB is silly.

kiwifb commented 12 years ago
comment:3

That's a good start. Now we need to put SPKG.txt and spkg-install in proper shape. I suppose there need to be more work sage side to use the sobj rather than zodb.

ohanar commented 12 years ago
comment:4

William, do you have a work-in-progress patch for the sage library? I would like to finish this task.

Edit: Other than a description in the SPKG.txt, this spkg should final.

Edit2: Ok, there is now a small description, and it is rebased off of the spkg that I made for #13123.

kiwifb commented 11 years ago
comment:5

Looks like I missed your work on this ohanar getting rid of zodb would be good news all around. Do this need any rebasing?

simon-king-jena commented 11 years ago
comment:6

What's the status? According to comment:4, conway_polynomials-0.4.spkg should be final. So, needs review?

What should be done to test whether this is really enough to get rid of zodb? I mean: How can one remove zodb from an existing Sage install?

ohanar commented 11 years ago
comment:7

Well, a patch is also needed to the sage library, and unfortunately William never provided that. It probably wouldn't be that hard to reverse engineer what he was thinking from the SPKG, but I haven't really looked into it.

The only other database in sage that used zodb was the Cremona database, but I rewrote that a year and a half ago to use sqlite3, so this is the only thing that should really be using zope at all (there may be other things that stupidly depend on it, but they should be easy to fix to not use zope).

kiwifb commented 11 years ago
comment:8

I think all that needs to be changed are the files: databases/compressed_storage.py, databases/db.py and databases/conway.py. conway.py is where we will have to be careful to present the same interface to the rest of sage so has not to break anything.

kiwifb commented 11 years ago
comment:9

OK rewrite of conway.py belongs here and the other two files belong to #10353. I am looking at conway.py and see if it is in my abilities.

kiwifb commented 11 years ago
comment:10

I think it is easy but it will be question of getting the syntax right. In the mean time I found that the stein-watkins optional "huge" database also use zodb and will need a similar treatment.

kiwifb commented 11 years ago
comment:11

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py. The issue is that the created sobj is a double dictionary and that the call to the ConwayPolynomials() class follow the same pattern as a double dictionary. You have calls of the kind: ConwayPolynomials()[3][21]

Implementing the class as a dictionary allows sage to start

import os

from sage.structure.sage_object import load
from sage.misc.misc import SAGE_DATA

_CONWAYDATA = os.path.join(SAGE_DATA,'conway_polynomials')

class ConwayPolynomials(dict):
    def __init__(self,*arg,**kw):
        """
        Initialize the database.
        """
        super(ConwayPolynomials, self).__init__(*arg, **kw)

    def _init(self):
        if not os.path.exists(_CONWAYDATA):
            raise RuntimeError, "In order to initialize the database, the directory must exist."%_CONWAYDATA
        os.chdir(_CONWAYDATA)
        self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

but unfortunately but I get key errors:

sage: sage.databases.conway.ConwayPolynomials()[3][21]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)

/home/fbissey/Work/Overlays/BlueFern/sci-mathematics/sage/<ipython console> in <module>()

KeyError: 3
sage: sage.databases.conway.ConwayPolynomials().keys()
[]

I was hoping that inheriting the dict class the access problem would be solved but I am obviously doing something wrong here. Loading the sobj directly from sage is totally usable, it is just when trying to put it in a class.

8d15854a-f726-4f6b-88e7-82ec1970fbba commented 11 years ago
comment:12

Replying to @kiwifb:

self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

This only changes the meaning of the self variable in the local scope. To modify the self object, use self.update(load(…)) instead.

kiwifb commented 11 years ago
comment:13

Interesting point but here it will call update() from the dict class which I don't think is what you meant. I still get the same results after doing your suggested replacement.

nbruin commented 11 years ago
comment:14

Replying to @kiwifb:

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py.

I don't think _init is a magic python method that automatically gets called. It may well be part of a protocol involved in sage.databases.db.Database. However, when you want to implement this as inheriting from dict, you'll have to call _init yourself in __init__, or better: just make _init part of it.

To make the dictionary read-only, you may want to override the dictionary modification methods too. Plus you probably want to amend pickling (interesting question: How should this pickle? Should it write its content to a file or should it pickle to "initialize ConwayPolynomials() for unpickling"?

kiwifb commented 11 years ago
comment:15

Thanks Nils. I actually though about that _init method as it was strange to have two. Merging the two didn't improve my problems however. I wouldn't know about the pickling. I'd like this to give me the correct behaviour of the class first. I'll retry merging inits as there may have been caching getting in the way.

kiwifb commented 11 years ago
comment:16

I found a way to properly initialise the database. It is a bit ugly but we may be able to move to the other problems: pickling and dict method override. I'll post a patch when I am on something else than that iPad.

nbruin commented 11 years ago
comment:17

Actually, what you should do is just write the ConwayPolynomials class without thinking about how to initialize from an sobj at all, but just as either a normal class that has an attribute storing the info or inheriting from dict and ensure that pickling works. Then you just have to make sure that the .sobj is a pickled version of it with the appropriate content. (and you may store in the documentation somewhere a script that can produce this pickle from a human-readable file)

ohanar commented 11 years ago
comment:18

Rather than inheriting from dict, I think it makes more sense to inherit from collections.Mapping since we only want read functionality. I uploaded a patch that does this, and uses the current dictionary format in the background (I don't see a good alternative to this storage method, but I'm open to suggestions).

kiwifb commented 11 years ago
comment:19

I think we have to stick with the dictionary format since the code in sage is littered with calls to ConwayPolynomials()[p][n] and changing the storage would imply changing an enormous amount of code. I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google. Your patch is longer than my current one but probably better behaved in the long run. It is very educative to me too.

nbruin commented 11 years ago
comment:20

Looks great! Two questions:

kiwifb commented 11 years ago
comment:21

I can answer the first question: It is generated in the spkg attached in this ticket from a python table in text format look it up. Technically it doesn't have to be a sobj the previous spkg was installing a bzipped version of the python table and then inserting it in the database. We could do something similar, not sure about the cost.

simon-king-jena commented 11 years ago
comment:22

Replying to @kiwifb:

I think we have to stick with the dictionary format ...

As much as I understand, the "format" (in the sense of the "interface to retrieve data") does not change. That's exactly the point: If you wanted to inherit from dict, then you'd also have to have a __setitem__ method - but we don't want to set data in an interactive session. All what we want is that during initialisation, data are obtained from a file, and then the user can retrieve these data (which is granted by the __getitem__ method).

... since the code in sage is littered with calls to ConwayPolynomials()[p][n]

Would that be changed by the patch? Looking at the lines

    def __getitem__(self, key): 
        try: 
            return self._store[key] 
        except KeyError, err: 
            try: 
                if isinstance(key, (tuple, list)): 
                    if len(key) == 2: 
                        return self._store[key[0]][key[1]] 
            except KeyError: 
                pass 
            raise err 

it seems to me that ConwayPolynomials()[p][n] is still a possible syntax (that's return self._store[key]), but in addition to the old syntax we now also allow the syntax ConwayPolynomials()[p, n] (that's return self._store[key[0]][key[1]]).

and changing the storage would imply changing an enormous amount of code.

By "storage", I understand the format of the data stored in the .sobj file containing the data base. I don't think that would be a problem, as long as there is a function that is able to read old data and transforms it into the new format.

I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google.

That's in dive into python or other Python introductions, or in the Python documentation.

ohanar commented 11 years ago
comment:23

I noticed a a flaw with what I did -- ConwayPolynomials()[p] returns a mutable Python dict, so you can actually write to the database. I'll post a fix for this in a moment.

There is no good reason as far as I can tell to use anything but plain text in the SPKG considering how small the database is.

As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database). The main reason I added the inheritance of UniqueRepresentation was because of a comment I seem to recall being told that Sage databases should be unique. It is simple enough to remove if we deem it not necessary.

kiwifb commented 11 years ago
comment:24

@simon-king-jena: overall you read my terrible prose correctly. I read the patch carefuly and in spite of call changes in the class I understood that the old way of accessing the class was preserved. You make a good point that it doesn't need to be preserved internally - just that we provide the right method of access. I used to have a copy of "dive into python" wonder what happened to it. Probably lost last time I moved. I would say I read the python documentation but my python skill are not advanced enough that I could have crafted a getitem accessing something like [p][n] - [p,n] is easy the former is more tricky to me.

@ohanar: going with plain text would remove the requirement of installing sage before this spkg. As it is sage needs to be installed first to produce the sobj. The spkg could then be installed at any time.

ohanar commented 11 years ago

Description changed:

--- 
+++ 
@@ -1 +1,8 @@
 This is necessary so we can dump the ZODB spkg from Sage.
+
+---
+**Installation Instructions**
+* Install the [new spkg](http://wstein.org/home/ohanar/spkgs/conway_polynomials-0.4.spkg).
+* Apply [trac12205.patch](https://github.com/sagemath/sage-prod/files/10654375/trac12205.patch.gz) to the sage library
+* Apply [trac12205_root.patch](https://github.com/sagemath/sage/files/ticket12205/trac12205_root.patch.gz) to the sage root repository
+
kiwifb commented 11 years ago
comment:26

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

nbruin commented 11 years ago
comment:27

Replying to @ohanar:

As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database).

Perhaps store the database in a module-level variable then? Modules naturally have a unique instance (or at least you have to work very hard to get multiple instances). If the nature of the instance is known at compile time (such as with a data base!) you might as well use the mechanism to your advantage.

(for data bases in general: Especially for writable data bases, you wouldn't want multiple instances backed by the same file)

ohanar commented 11 years ago
comment:28

Replying to @kiwifb:

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).

Replying to @nbruin:

Perhaps store the database in a module-level variable then?

Yes, that sounds like a plan.

kiwifb commented 11 years ago
comment:29

Replying to @ohanar:

Replying to @kiwifb:

Any reason you changed "[]" to "()" in the polynomial method doctest? It wasn't in the previous patch and I would hope that if it is correct it doesn't lead to other breakages in the rest of sage.

I updated the database to use tuples (updated the spkg), rather than lists, as they are not mutable. I don't believe there are other breakages, but I haven't run the full doctest suit (just a few files I found through greping).

OK will run some tests as well. I didn't know you had updated the spkg.

ohanar commented 11 years ago
comment:30

Replying to @kiwifb:

OK will run some tests as well. I didn't know you had updated the spkg.

Sorry about that, I noticed it wouldn't install on 5.6-beta1, so I updated it (and changed to tuples at the same time).

Ok, added a updated the patch to add some doctests and remove UniqueRepresentation in favor of a module level variable.

kiwifb commented 11 years ago
comment:31

All tests past on 5.5 with the next to last patch. Since the the last changes are internal I don't think it will change the tests results. Thanks ohanar for the splendid work.

ohanar commented 11 years ago

Author: R. Andrew Ohana

ohanar commented 11 years ago
comment:32

Ok, all doctests are now filled and all functionality should be there, so I'm marking this as needs review.

ohanar commented 11 years ago

apply to sage library

ohanar commented 11 years ago
comment:33

Attachment: trac12205.patch.gz

well, apparently sage doesn't like docstrings surrounded by ''' instead of """, so that is now fixed

kiwifb commented 11 years ago
comment:34

This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get

sage -t -long "devel/sage-main/sage/algebras/free_algebra.py"
The doctested process was killed by signal 11
         [1.4 s]

But if I add "-verbose" the test pass. There are no obvious relation with this ticket (and #10353 on which I am working at the same time) apart from the fact that it appeared out of the blue after application.

simon-king-jena commented 11 years ago
comment:35

Replying to @kiwifb:

This is very bizzare I had no doctests failures with the previous version of the patch (on top of 5.5) but with this new one I get

sage -t -long "devel/sage-main/sage/algebras/free_algebra.py"
The doctested process was killed by signal 11
         [1.4 s]

But if I add "-verbose" the test pass.

That is very likely caused by an upstream bug in Cython that became immanent with #715.

Can you try with the new Cython spkg from #13896? I.e., install that spkg and do sage -ba.

kiwifb commented 11 years ago
comment:36

Thanks Simon. I have no time to do it tonight, I'll try tomorrow. As far as I am concerned this is the only thing that stand against a positive review. But I suspect it is as you say, and is just something bitting from another corner.

kiwifb commented 11 years ago

Reviewer: François Bissey

kiwifb commented 11 years ago
comment:37

Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.

simon-king-jena commented 11 years ago

Dependencies: #13896

simon-king-jena commented 11 years ago
comment:38

Replying to @kiwifb:

Yup. Upgrading cython fixed the problem. I also checked that the thing would build properly not just upgrade from an existing sage. I am giving this a positive review.

Great! To be on the safe side, I add #13896 as a dependency, even though #13896 is a blocker with positive review (hence, is assumed to be in the next release).

ohanar commented 11 years ago
comment:40

What is preventing this from being merged in 5.6?

jdemeyer commented 11 years ago
comment:41

Nothing really, but I feel like this should be merged while removing ZODB.

kiwifb commented 11 years ago
comment:42

I am not fussy about it. I think #10353 is ready but if you want it in two stages that's fine. Zodb is now gone from sage-on-gentoo and maintenance wise we won't miss it :).

jdemeyer commented 11 years ago
comment:43

The sage-root patch needs to be rebased to sage-5.6.beta3.

kiwifb commented 11 years ago

Attachment: trac12205_root.2.patch.gz

updated sage_root repo patch

kiwifb commented 11 years ago

Description changed:

--- 
+++ 
@@ -4,5 +4,5 @@
 **Installation Instructions**
 * Install the [new spkg](http://wstein.org/home/ohanar/spkgs/conway_polynomials-0.4.spkg).
 * Apply [trac12205.patch](https://github.com/sagemath/sage-prod/files/10654375/trac12205.patch.gz) to the sage library
-* Apply [trac12205_root.patch](https://github.com/sagemath/sage/files/ticket12205/trac12205_root.patch.gz) to the sage root repository
+* Apply [trac12205_root.patch](https://github.com/sagemath/sage-prod/files/10654376/trac12205_root.2.patch.gz) to the sage root repository
kiwifb commented 11 years ago
comment:44

I put a rebased version of the root repo patch and updated the ticket instructions accordingly. The rebasing was done against 5.6.rc0.

kiwifb commented 11 years ago
comment:45

Putting back to positive review.

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -4,5 +4,5 @@
 **Installation Instructions**
 * Install the [new spkg](http://wstein.org/home/ohanar/spkgs/conway_polynomials-0.4.spkg).
 * Apply [trac12205.patch](https://github.com/sagemath/sage-prod/files/10654375/trac12205.patch.gz) to the sage library
-* Apply [trac12205_root.patch](https://github.com/sagemath/sage-prod/files/10654376/trac12205_root.2.patch.gz) to the sage root repository
+* Apply [attachment: trac12205_root.2.patch](https://github.com/sagemath/sage-prod/files/10654376/trac12205_root.2.patch.gz) to the sage root repository