sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 480 forks source link

test containment of ideals in class MPolynomialIdeal #12802

Closed a9bc4e33-7b98-4180-affc-8dfcef89e22b closed 12 years ago

a9bc4e33-7b98-4180-affc-8dfcef89e22b commented 12 years ago

There seems to be no way to test containment of ideals in the class MPolynomialIdeal in sage.rings.polynomial.multi_polynomial_ideal. One might expect the comparison operators (e.g. I<J ) to do this, but in fact what they do is to compare the Groebner bases as sequences of polynomials, which is counterintuitive. For example:

sage: R.<x,y> = PolynomialRing(QQ)
sage: I=(x*y)*R; J=(x,y)*R; I<J
False
sage: I=(y+1)*R; J=(x,y)*R; I<J
True

This is implemented in the __cmp__ method, which is not up to the task of doing subset comparison, since __cmp__ is only suitable for total orderings.

To do it right would seem to require implementing Python's "rich comparison" methods, __lt__, __gt__, etc.

For example:

from sage.rings.polynomial.multi_polynomial_ideal import MPolynomialIdeal

def IsSubset(I,J):
  for g in I.gens()
    if not g in J: return False
  return True

def IsSuperset(I,J):
  return IsSubset(J,I)

def IsProperSubset(I,J):
  return I!=J and IsSubset(I,J)

def IsProperSuperset(I,J):
  return I!J and IsSuperset(I,J)

setattr(MPolynomialIdeal,'__le__',IsSubset)
setattr(MPolynomialIdeal,'__lt__',IsProperSubset)
setattr(MPolynomialIdeal,'__ge__',IsSuperset)
setattr(MPolynomialIdeal,'__gt__',IsProperSuperset)

With these we now get the expected behavior:

sage: R.<x,y> = PolynomialRing(QQ)
sage: I=(x*y)*R; J=(x,y)*R; I<J
True
sage: I=(y+1)*R; J=(x,y)*R; I<J
False

The patch supplied gives a solution via Groebner bases, and also fixes #12839.

Apply:

  1. attachment: trac_12802_no_whitespace.patch

CC: @malb @nthiery

Component: commutative algebra

Keywords: sd40.5, groebner bases, ideals

Author: John Perry

Reviewer: Andrey Novoseltsev, Simon King

Merged: sage-5.4.rc0

Issue created by migration from https://trac.sagemath.org/ticket/12802

johnperry-math commented 12 years ago

Changed keywords from none to sd40.5, groebner bases, ideals

johnperry-math commented 12 years ago

Description changed:

--- 
+++ 
@@ -48,4 +48,8 @@
 False

+The patch supplied gives a solution via Groebner bases, and also fixes #12839.

+Apply: + +* attachment: trac_12802_and_12839.patch

johnperry-math commented 12 years ago

Author: John Perry

johnperry-math commented 12 years ago
comment:1

Attachment: trac_12802_and_12803.patch.gz

The attached file gives a better check for correctness than even the proposed solution, using Groebner bases rather than generators alone. The examples given by mariah produce a correct result on my machine, though for a doctest I use a slightly different solution which her proposed fix wouldn't detect.

This patch also resolve #12839, as far as I can tell.

novoselt commented 12 years ago

Work Issues: documentation

novoselt commented 12 years ago
comment:2

Functions are missing standard documentation blocks.

johnperry-math commented 12 years ago
comment:3

Replying to @novoselt:

Functions are missing standard documentation blocks.

It's even worse than that. For some reason, I'm now finding regressions that I didn't find earlier. I'm working on an updated patch.

johnperry-math commented 12 years ago
comment:4

Okay, so the problems were real, but were discovered because there's a very weird test of elements of quadratic number fields: basically, if a and b are distinct elements of a quadratic number field, then a>b and b>a are always true:

sage: K.<a> = NumberField(x^2 + 1)
sage: a > a + 1
True

That strikes me as really, really odd behavior.

Anyway, I'm about to upload a new patch that undoes the stupidity I did to sage.rings.ideal.py, adds some documentation to the __cmp__ function in Ideal_generic, then takes care of the things that need taking care of down in sage.rings.polynomial.multi_polynomial_idea.py. I've also added documentation. (Sorry about that -- it didn't occur to me because the stuff I was rewriting wasn't documented, either.) It also removes a boatload of whitespace.

johnperry-math commented 12 years ago

Reviewer: novoselt

johnperry-math commented 12 years ago
comment:6

To help the reviewer(s), here are the non-whitespace changes.

  1. In sage/rings/ideal.py, I added a docstring & doctest to Ideal_generic's __cmp__.
  2. In sage/rings/polynomial/multipolynomial_ideal.py, I a. redefined __cmp__ to rely on __lt__, __gt__, and __eq__; b. changed __eq__ to test whether each ideal is a subset of the other; c. changed __gt__ to test whether other is a subset of self; and d. moved the old code for __eq__ to __lt__, where it now checks whether each polynomial in the Groebner basis of self reduces to 0 modulo the Groebner basis of other. Here, I also had to move a try...except loop around, because of a strange AttributeError related to caching, raised only by the doctests of pbori.pyx.

On my machine all the doctests of sage/rings pass with this patch. I haven't doctested everything, though.

novoselt commented 12 years ago
comment:7

Would be better to have whitespaces separately... Anyway: "Decides whether two ideals are equal. The test is performed on the generators." does not sound right, this method does not decide whether ideals are equal. How about "Compare generators of two ideals." instead?

novoselt commented 12 years ago

Changed reviewer from novoselt to Andrey Novoseltsev

johnperry-math commented 12 years ago
comment:8

Replying to @novoselt:

Would be better to have whitespaces separately...

I can do that, if you think it best. I'd rather not, now that I've removed them, but I can.

Anyway: "Decides whether two ideals are equal. The test is performed on the generators." does not sound right, this method does not decide whether ideals are equal. How about "Compare generators of two ideals." instead?

Done.

novoselt commented 12 years ago
comment:9

Apply trac_12802_and_12839.patch

novoselt commented 12 years ago
comment:10

I don't insist on separating whitespace patch now, it is just inconvenient to dig for changes in 100kb. But please make subsequent changes in a new little patch.

  1. __cmp__ does not return the result of self == other contrary to what is written. I also think that the test should call cmp directly, since == does not always lead to cmp.
  2. In __lt__ the documentation says "We use ideal membership to test whether the generators of each ideal appears in the other." while only one of the containment directions should be checked. Despite of the description of the algorithm in the documentation, the actual code does something else in the beginning and can return numbers or result of cmp instead of True/False.
  3. What will __cmp__ of ideals return if neither of them is contained in another? Do we even need it for ideals if there are lt/gt?
  4. Documentation of __eq__ is wrong.
johnperry-math commented 12 years ago
comment:11

Replying to @novoselt:

I don't insist on separating whitespace patch now, it is just inconvenient to dig for changes in 100kb. But please make subsequent changes in a new little patch.

Okay, new patch coming. It'll be a bit.

  1. __cmp__ does not return the result of self == other contrary to what is written. I also think that the test should call cmp directly, since == does not always lead to cmp.

Got it.

  1. In __lt__ the documentation says "We use ideal membership to test whether the generators of each ideal appears in the other." while only one of the containment directions should be checked. Despite of the description of the algorithm in the documentation, the actual code does something else in the beginning and can return numbers or result of cmp instead of True/False.

This was originally code for __cmp__, and when I first started working on it, I had completely forgotten that __cmp__ doesn't work the same as these other special functions. That was one reason for the regression I encountered earlier.

  1. What will __cmp__ of ideals return if neither of them is contained in another? Do we even need it for ideals if there are lt/gt?

When I first worked on this, the __cmp__ of Ideal_generic was trumping everything else. That's why I redefined __cmp__, and it's also why I originally did some of this in Ideal_generic.

  1. Documentation of __eq__ is wrong.

It took me a while to find what you meant, but I finally saw it.

johnperry-math commented 12 years ago
comment:12

The last line in this if statement that you pointed out:

        if R is not S: # rings are unique
            if type(R) == type(S) and (R.base_ring() == S.base_ring()) and (R.ngens() == S.ngens()):
                other = other.change_ring(R)
            else:
                return cmp((type(R), R.base_ring(), R.ngens()), (type(S), S.base_ring(), S.ngens()))

strikes me as wrong. If the base ring or the number of generators are different, they can't be the same ideal. I'm changing that to return False.

The type is another story. If someone has defined a ring with singular, and wants to compare to a ring defined with cocoa or macaulay, that ought to be possible. Unfortunately, I can't test it: neither Macaulay nor CoCoA build on my Sage 5.0. Are you able to build either?

novoselt commented 12 years ago
comment:13

I think comparison of rings should be done by converting all to a single system where they will be unique. And in general what's the point in ring uniqueness if they may not be unique? If you are not satisfied with R is S check, use R == S and let the meaning of this be handled by ring code, not the ideal comparison.

Note also that there are many failing doctests, the worst ones are with infinite recursion. If it will be necessary to change output in doctests of toric varieties, please do it on top of #13023 which is already positively reviewed.

johnperry-math commented 12 years ago
comment:14

Replying to @novoselt:

I think comparison of rings should be done by converting all to a single system where they will be unique.

This has to be checked, due to different term orders. I've encountered regressions because of this; they may be the cause of some of your failing doctests.

Note also that there are many failing doctests, the worst ones are with infinite recursion.

I think I've fixed this one, too; I think the cause was related to returning numbers. I'm about to upload a new patch.

I wasn't getting any failures in sage/rings, btw, so I guess your failures are elsewhere?

novoselt commented 12 years ago
comment:15

I looked at the patchbot log (accessible through the disk in top right corner of the description).

johnperry-math commented 12 years ago
comment:16

I had plumb forgot about that.

johnperry-math commented 12 years ago
comment:17

At least some of those tests have nothing to do with this patch; I get them even when I pop & rebuild.

Edit: I forgot that there are two patches involved. They may be due to this after all; I'm looking at them now.

The toric_ideals.py regressions are already fixed.

johnperry-math commented 12 years ago
comment:18

Okay, it looks like I have a new patch ready. The problem was that lots of things depended on the old mpolynomial_ideal __cmp__ as it was, and I had inadvertently broken that by tying it to subsets (for less than & greater than) rather than equality and non-equality. Those six tests now pass for me.

Updated patch on its way... I also fixed something in the documentation of ideal's __cmp__.

johnperry-math commented 12 years ago

Description changed:

--- 
+++ 
@@ -1,5 +1,4 @@
-There seems to be no way to test containment of ideals in the class MPolynomialIdeal in `sage.rings.polynomial.multi_polynomial_ideal`.  One might expect the comparison operators (e.g. `I<J` ) to do this, but in fact what they do is to compare the Groebner bases as sequences of polynomials, which is counterintuitive.
-For example:
+There seems to be no way to test containment of ideals in the class MPolynomialIdeal in `sage.rings.polynomial.multi_polynomial_ideal`.  One might expect the comparison operators (e.g. `I<J` ) to do this, but in fact what they do is to compare the Groebner bases as sequences of polynomials, which is counterintuitive. For example:

sage: R.<x,y> = PolynomialRing(QQ) @@ -8,7 +7,6 @@ sage: I=(y+1)R; J=(x,y)R; I<J True

-
 This is implemented in the `__cmp__` method, which is not up to the task of doing subset comparison, since `__cmp__` is only suitable for total orderings.

 To do it right would seem to require implementing Python's "rich comparison" methods, `__lt__`, `__gt__`, etc.
@@ -37,7 +35,6 @@
 setattr(MPolynomialIdeal,'__ge__',IsSuperset)
 setattr(MPolynomialIdeal,'__gt__',IsProperSuperset)

- With these we now get the expected behavior:

@@ -47,9 +44,9 @@
 sage: I=(y+1)*R; J=(x,y)*R; I<J
 False

- The patch supplied gives a solution via Groebner bases, and also fixes #12839.

Apply:

-* attachment: trac_12802_and_12839.patch +1. attachment: trac_12802_and_12839.patch +2. attachment: trac_12802_additional_changes.patch

novoselt commented 12 years ago
comment:20
Rx = PolynomialRing(QQ, 2, "x")
Ix = Rx.ideal(Rx.0)
Ry = PolynomialRing(QQ, 2, "y")
Iy = Ry.ideal(Ry.0)
print Ix, Iy
Ix == Iy

False before and True after. Better not detect equality in some convoluted cases than claim equality for different ideals.

johnperry-math commented 12 years ago
comment:21

Replying to @novoselt:

False before and True after. Better not detect equality in some convoluted cases than claim equality for different ideals.

I agree. This is pretty easily fixed, though; the concern stated is that the orders might differ. I'll put in a narrower test. New patch on its way...

johnperry-math commented 12 years ago
comment:22

I added your example as a test, and another for the orderings.

By the way, I'm sorry the patch is showing so many changes; I don't think there are that many, but somehow it thinks a lot of stuff was removed, then put back in. That actually did happen, but I thought I put things in the same order.

novoselt commented 12 years ago
comment:23

"at this point, the ideals are the same, except perhaps the term order" - you meant that rings are the same, not ideals

There are still two doctest errors according to patchbot.

novoselt commented 12 years ago
comment:24

Oh, and "OUTPUT:" is usually on a separate line separated by a black line from the actual description:

http://sagemath.org/doc/developer/conventions.html#documentation-strings

johnperry-math commented 12 years ago
comment:25

Replying to @novoselt:

There are still two doctest errors according to patchbot.

I caught it too, after I uploaded the patch. New one in a moment.

Replying to @novoselt:

Oh, and "OUTPUT:" is usually on a separate line separated by a black line from the actual description:

http://sagemath.org/doc/developer/conventions.html#documentation-strings

I was following the practice I saw elsewhere in these files (quite a few places), but I'll change it.

johnperry-math commented 12 years ago
comment:26

New patch ready. I've run doctests in sage/rings, as well the ones that had failed earlier, & those all pass. Hopefully the patchbot won't uncover any others.

johnperry-math commented 12 years ago
comment:27

All tests pass on my machine.

simon-king-jena commented 12 years ago
comment:28

Compare #12976: If you work on comparison of ideals, you should perhaps also consider the fact that ideals evaluating as equal may have different hash, because the hash relies on the string representation (hence, on the generators), but the comparison relies on Gröbner bases. So, the hash is broken (and is rather slow, too).

Also, there is one thing you didn't fix: If the Gröbner basis of only one of the two to-be-compared ideals is not in the cache, then a copy of the two ideals is created using degrevlex term order, and then the Gröbner basis is computed in the supposedly easier order. However, the ring change happens even if the ideals already are in degrevlex order, and moreover the result of the Gröbner basis computation is not cached. For that problem, I had already opened #12987.

By the way, good that you do not try to change comparison of polynomial ideals into a generator-wise comparison - the ticket description let the impression arise that you want to drop Gröbner bases.

How should one proceed? Fix the "useless double-computation of Gröbner bases of copies of ideals" here? Then this ticket is "needs work", and #12987 is a duplicate. Or fix it at #12987? Then I have to add this ticket as a new dependency for #12987.

Possible idea: One could create an attribute _gb_by_term_order_ that is a dictionary storing the Gröbner bases of an ideal with respect to different orders. Hence, when doing a comparison and computing the Gröbner basis of a copy of ideal J in "degrevlex" order (while J lives in a ring with different order), then J._gb_by_term_order_["degrevlex"] should store the result.

A related topic: Avoid using comparison and hash when passing argument attributes in singular_function. See #12977 (needs review).

Another thing: This ticket's component should be "commutative algebra", not "algebra". And I guess Martin should be Cc.

novoselt commented 12 years ago

Changed work issues from documentation to cache handling

johnperry-math commented 12 years ago
comment:30

Replying to @simon-king-jena:

Compare #12976: If you work on comparison of ideals, you should perhaps also consider the fact that ideals evaluating as equal may have different hash, because the hash relies on the string representation (hence, on the generators), but the comparison relies on Gröbner bases. So, the hash is broken (and is rather slow, too).

I don't understand the issue with the hash, probably because I don't know anything about how Sage hashes rings, ideals, etc. I can look into this, but: the code ignores the hash altogether, and a Groebner basis is only computed for a copy; the original ideal is untouched. So, how could this patch break the hash?

Also, there is one thing you didn't fix: If the Gröbner basis of only one of the two to-be-compared ideals is not in the cache, then a copy of the two ideals is created using degrevlex term order, and then the Gröbner basis is computed in the supposedly easier order.

When I run the example you provide in #12987, I get this:

sage: %timeit J==J2
5 loops, best of 3: 534 ms per loop
sage: _ = J.groebner_basis()
sage: %timeit J==J2
5 loops, best of 3: 533 ms per loop
sage: _ = J2.groebner_basis()
sage: %timeit J==J2
5 loops, best of 3: 533 ms per loop

So it's even worse than you say; it's not that I didn't fix it; I've actually broken something there. (I don't think this is the hash; correct me if I'm wrong.) That third timeit is definitely a lot slower with this patch installed.

IMHO, rings and ideals ought to be independent of term orderings, which the user ought to be able to switch without creating a new ring & ideal. A Groebner basis of an ideal could be stored in a dictionary for each different term ordering requested. As far as I can tell, this isn't possible now, and I don't know if it's feasible at all, given the reliance on Singular.

But, your proposal:

One could create an attribute _gb_by_term_order_ that is a dictionary storing the Gröbner bases of an ideal with respect to different orders.

...seems similar. Do you think a long-term fix of making rings independent of term orderings would be worth looking at?

Hence, when doing a comparison and computing the Gröbner basis of a copy of ideal J in "degrevlex" order (while J lives in a ring with different order), then J._gb_by_term_order_["degrevlex"] should store the result.

This can, of course, be done now. I will try to work on it over the next few days.

By the way, good that you do not try to change comparison of polynomial ideals into a generator-wise comparison - the ticket description let the impression arise that you want to drop Gröbner bases.

FWIW, I didn't write that part. I wrote the part where I said we would use Groebner bases. ;-)

How should one proceed? Fix the "useless double-computation of Gröbner bases of copies of ideals" here? Then this ticket is "needs work", and #12987 is a duplicate. Or fix it at #12987? Then I have to add this ticket as a new dependency for #12987.

I think it better to mark #12987 as a duplicate in that case, the same way #12839 is a duplicate, and fix this issue here, at least with your proposed approach. I can certainly do that.

Another thing: This ticket's component should be "commutative algebra"...

I agree.

simon-king-jena commented 12 years ago
comment:31

Replying to @johnperry-math:

Replying to @simon-king-jena:

Compare #12976: If you work on comparison of ideals, you should perhaps also consider the fact that ideals evaluating as equal may have different hash, because the hash relies on the string representation (hence, on the generators), but the comparison relies on Gröbner bases. So, the hash is broken (and is rather slow, too).

I don't understand the issue with the hash, probably because I don't know anything about how Sage hashes rings, ideals, etc.

That's how Python uses the hash: Python's dictionaries and sets are hash tables. Hence, whenever you use an ideal J (or any other object) as a key in a dictionary D or put it into a set S, then the hash is computed, and the object is put into a "bucket" that corresponds to the hash value.

If you then have an object G (say, the ideal that is given by the reduced Gröbner basis of J) and want to test whether G is in D or S (say, D[G] or G in S), then the hash of G is computed, and then only the bucket corresponding to the hash of G is searched for an equal object. The buckets typically are very small. Hence, if you have a quick hash function then searching in a set is much faster than searching in a list.

Now, a problem arises if G==J but hash(G)!=hash(J): G and J may end up in different buckets, and thus G in S would return False, even though J in S and J==G.

Summary: It is required by Python that the hash values of equal objects are equal. Otherwise, the hash is called "broken".

I can look into this, but: the code ignores the hash altogether, and a Groebner basis is only computed for a copy; the original ideal is untouched. So, how could this patch break the hash?

I did not claim that the patch breaks the hash. I merely stated that the hash is broken (see #12976), and I asked whether we should extend the patch here so that the hash is fixed, or whether we leave the hash broken for now.

When I run the example you provide in #12987, I get this:

sage: %timeit J==J2
5 loops, best of 3: 534 ms per loop
sage: _ = J.groebner_basis()
sage: %timeit J==J2
5 loops, best of 3: 533 ms per loop
sage: _ = J2.groebner_basis()
sage: %timeit J==J2
5 loops, best of 3: 533 ms per loop

So it's even worse than you say; it's not that I didn't fix it; I've actually broken something there. (I don't think this is the hash; correct me if I'm wrong.)

No, there is no hash involved here.

IMHO, rings and ideals ought to be independent of term orderings, which the user ought to be able to switch without creating a new ring & ideal.

That sounds nice at first glance. However, it would be awkward to ask for providing a term order as a new argument to the groebner_basis() method. And there are many algorithms that rely on Gröbner bases. So, I believe having a default term order for each ring (and thus have different rings for different term orders) is fine, IMHO.

Anyway, ideals that are independent of a default term order could probably be implemented using the "parents with multiple realisations" that have recently been introduced (hence, Cc to Nicolas...). I wonder, though, if that would be possible without a speed regression.

But, your proposal:

One could create an attribute _gb_by_term_order_ that is a dictionary storing the Gröbner bases of an ideal with respect to different orders.

...seems similar. Do you think a long-term fix of making rings independent of term orderings would be worth looking at?

I am not sure, see above.

How should one proceed? Fix the "useless double-computation of Gröbner bases of copies of ideals" here? Then this ticket is "needs work", and #12987 is a duplicate. Or fix it at #12987? Then I have to add this ticket as a new dependency for #12987.

I think it better to mark #12987 as a duplicate in that case, the same way #12839 is a duplicate, and fix this issue here, at least with your proposed approach. I can certainly do that.

OK, I'll comment #12987 accordingly.

johnperry-math commented 12 years ago

Changed reviewer from Andrey Novoseltsev to Andrey Novoseltsev, Simon King

johnperry-math commented 12 years ago
comment:32

Replying to @simon-king-jena:

That's how Python uses the hash: Python's dictionaries and sets are hash tables. Hence, whenever you use an ideal J (or any other object) as a key in a dictionary D or put it into a set S, then the hash is computed, and the object is put into a "bucket" that corresponds to the hash value.

The only part that surprises me is that items with the same hash are put into a bucket. I had learned hashes differently.

Now, a problem arises if G==J but hash(G)!=hash(J): G and J may end up in different buckets, and thus G in S would return False, even though J in S and J==G.

Summary: It is required by Python that the hash values of equal objects are equal. Otherwise, the hash is called "broken".

I see; the hash has been broken a while. I can suggest a quick hash function, actually: compute a Groebner basis of the homogenized ideal up to degree d for some small value of d. (Standard homogenization, if there's any doubt.)

Figuring out a good value of d would be the hard part. If it's fixed at a constant, the worst you'd see is a bunch of ideals in the same bucket, which would force more Groebner basis computations than one would like. Alternately, d could be computed based on the generators, though I'm not sure how to do that quite yet. (max degree + 1? could get very, very expensive for some ideals)

This seems to me the safest way to provide a relatively quick signature for an ideal that ensures equal ideals with different presentations are in the same bucket, at the cost of several distinct ideals sharing the same bucket.

I did not claim that the patch breaks the hash. I merely stated that the hash is broken (see #12976), and I asked whether we should extend the patch here so that the hash is fixed, or whether we leave the hash broken for now.

This strikes me as a significantly different problem, albeit related, deserving of its own ticket.

IMHO, rings and ideals ought to be independent of term orderings, which the user ought to be able to switch without creating a new ring & ideal.

That sounds nice at first glance. However, it would be awkward to ask for providing a term order as a new argument to the groebner_basis() method.

I wasn't thinking of eliminating a a default ordering for a ring that extends automatically to its ideals; that makes sense. I would just like the option to switch the default ordering of a ring or ideal, without having to create a new ring and/or ideal, and retaining the previously-computed Groebner basis, if desired.

This annoyance has actually come up in my work. I suspect a fix would require a serious re-thinking, since so many things rely on Groebner bases, and many have likely been written with the current framework in mind. It may not be feasible in the short term, maybe not even in the long term, but it seems worth discussing.

Anyway, ideals that are independent of a default term order could probably be implemented using the "parents with multiple realisations" that have recently been introduced (hence, Cc to Nicolas...). I wonder, though, if that would be possible without a speed regression.

Can you point me to a thread or ticket?

BTW, I'm adding you as a reviewer.

simon-king-jena commented 12 years ago
comment:33

Replying to @johnperry-math:

The only part that surprises me is that items with the same hash are put into a bucket. I had learned hashes differently.

Isn't that how hash tables work? Sorry, I am not a computer scientist, it could easily be that I misunderstood the internals of dictionaries.

I see; the hash has been broken a while. I can suggest a quick hash function, actually: compute a Groebner basis of the homogenized ideal up to degree d for some small value of d. (Standard homogenization, if there's any doubt.)

Wouldn't that, again, depend on the term order? And how should one choose d? After all, if you compute a Gröbner basis out to degree d but there are further elements in higher degree, then the complete Gröbner basis will have a different hash.

Perhaps one could make ideals non-hashable? After all, what one needs in applications (to my experience) is the tuple of generators of the ideal.

This strikes me as a significantly different problem, albeit related, deserving of its own ticket.

OK.

Anyway, ideals that are independent of a default term order could probably be implemented using the "parents with multiple realisations" that have recently been introduced (hence, Cc to Nicolas...). I wonder, though, if that would be possible without a speed regression.

Can you point me to a thread or ticket?

Sorry, have to catch a bus. I hope I don't forget to answer later. But perhaps Nicolas beats me to it?

johnperry-math commented 12 years ago
comment:34

Replying to @simon-king-jena: I am not a computer scientist, either. I learned that each hash corresponds to one, unique element. When clashes occur, the second element to hit the hash is given a new hash, based on a certain heuristic.

My understanding of what you wrote is that several objects can occupy the same hash value, and when comparisons need to be made, something more computationally expensive is called: __eq__, perhaps. (I wasn't sure exactly which.) That strikes me as what this page is saying, too, though I am not entirely sure.

I see; the hash has been broken a while. I can suggest a quick hash function, actually: compute a Groebner basis of the homogenized ideal up to degree d for some small value of d . (Standard homogenization, if there's any doubt.)

Wouldn't that, again, depend on the term order?

We'd have to pick a "good" term order for the hash, just as we currently pick a "good" term order to test whether ideals are equal, if no Groebner basis currently exists.

And how should one choose d ? After all, if you compute a Gröbner basis out to degree d but there are further elements in higher degree, then the complete Gröbner basis will have a different hash.

Well, yes, I said that, myself. My idea gets pretty bad, pretty quickly: A nice example of how things can go wrong is if I=<x,x5+x+1> and J=<x5+1,x5+x+1>, which are equal (to <1>, in fact). But the homogenized ideals' Groebner bases are different, regardless of degree. If one dehomogenizes, it works out, but this is already getting pretty ugly.

Perhaps one could make ideals non-hashable?

I'm okay with that, too. Are ideals mutable? If so, the answer is easy.

To my thinking, ideals are hashable in principle; just use a Groebner basis wrt to any order at all. It's the computation of the hash that's potentially horrifying. Do you know if Python calls __hash__ when the ideal is created, or when it's actually required? If it's the latter, rather than the former, we could simply compute a hash by computing the degrevlex Groebner basis, and be done with it. No one would encounter the performance penalty until they actually tried to use it.

simon-king-jena commented 12 years ago
comment:35

Replying to @johnperry-math:

My understanding of what you wrote is that several objects can occupy the same hash value, and when comparisons need to be made, something more computationally expensive is called: __eq__, perhaps.

Yes.

Perhaps one could make ideals non-hashable?

I'm okay with that, too. Are ideals mutable? If so, the answer is easy.

I don't think they are mutable.

To my thinking, ideals are hashable in principle; just use a Groebner basis wrt to any order at all. It's the computation of the hash that's potentially horrifying.

Yes, that's exactly the problem. One can compare ideals and can compute a reasonable hash effectively, but probably there is no way to do that efficiently in a way that preserves the mathematical notion of equality of ideals (namely: Equality as a sub-set of a ring).

Do you know if Python calls __hash__ when the ideal is created, or when it's actually required?

No. The hash is only computed when it is needed (hence, when the ideal is put into a set or dictionary). The following toy example should illustrate what is called when:

sage: class Foo:
....:     def __init__(self, n):
....:         self.n = n
....:     def __hash__(self):
....:         print "computing hash for",self.n
....:         return int(self.n%10)
....:     def __cmp__(self, other):
....:         print "comparing",self.n,"with",other.n
....:         return cmp(self.n, other.n)
....:     
sage: a = Foo(5)
sage: b = Foo(6)
sage: c = Foo(15)
sage: a2 = Foo(5)
sage: a == a2
comparing 5 with 5
True
sage: a == b
comparing 5 with 6
False
sage: D = {}
sage: D[a] = 1
computing hash for 5
sage: D[a]
computing hash for 5
1
sage: D[b] = 2
computing hash for 6
sage: D[c] = 5
computing hash for 15
comparing 5 with 15

As you can see in the last line, the hash of c is computed, which I made coincide with the hash of a. So, they are put into the same hash bucket, and thus a comparison happens. But the line before, no comparison happens, because the hash bucket for b is empty when it is added to the dictionary.

In particular, normally the hash is not computed during initialisation. However, due to caching a lot of stuff in the category and coercion framework, it could very well be that the hash is created during __init__. I don't think that is the case for ideals, though.

If it's the latter, rather than the former, we could simply compute a hash by computing the degrevlex Groebner basis, and be done with it. No one would encounter the performance penalty until they actually tried to use it...

... which is not unlikely. Currently, ideals are used in dictionaries, for example in singular_function, when trying to convince Singular that the given generators of some ideal form a Gröbner basis. That's why I try to provide a faster (though syntactically less elegant) way to provide information on the arguments of singular_function (see #12977).

I promised to provide you with a ticket number for multiple realisations: #7980.

johnperry-math commented 12 years ago
comment:36

Replying to @simon-king-jena:

Perhaps one could make ideals non-hashable?

I'm okay with that, too. Are ideals mutable? If so, the answer is easy.

I don't think they are mutable.

I don't think so, either. I don't see any way to add or subtract generators from the ideal. Well, is there a conceivable use for the hash of an ideal? For example, would someone need a set of ideals for something? I'm pretty sure this is a possibility.

(Incidentally, when I wrote, "If so, the answer is easy," I meant, "If not, the answer is easy.")

No. The hash is only computed when it is needed (hence, when the ideal is put into a set or dictionary). The following toy example should illustrate what is called when:

In that case, and if we can agree that someone might want to hash ideals, I suggest we using the degrevlex Groebner basis for the hash, and include a warning in the documentation that using an ideal with anything that requires a hash could slow things down seriously. That way, people have been warned. If this slows down some Sage functions that rely on dictionaries, we could rewrite them to avoid that, as you are doing with #12977.

Personally, I would rather have no hash than an incorrect one, even for immutable objects. That said, it could be a valid long-term project to find a hash for ideals.

I promised to provide you with a ticket number for multiple realisations: #7980.

Thanks. I'll look at it.

johnperry-math commented 12 years ago
comment:37

I'm about to upload a new patch that uses Simon's _gb_by_ordering idea. Here are the timings on the same machine as the other day:

sage: P.<x,y> = QQ[]
<+ 30*y + 25, 1/400*x^4*y^4 - 1/5*x^5*y^2 - 1/20*x^4*y^3 + 4*x^6 + 2*x^5*y + 11/20*x^4*y^2 - 12*x^5 - 3*x^4*y + 9*x^4]*P      
sage: J2 = J.gens()*P
sage: %timeit J==J2
5 loops, best of 3: 376 µs per loop
sage: _ = J.groebner_basis()
sage: %timeit J==J2         
625 loops, best of 3: 372 µs per loop
sage: _ = J2.groebner_basis()
sage: %timeit J==J2
625 loops, best of 3: 362 µs per loop

A more than 1000-fold increase. :-)

All tests in sage/rings pass muster, as do the tests in schemes that had failed the other day.

simon-king-jena commented 12 years ago

Changed work issues from cache handling to fix tests; is the cache pickled?

simon-king-jena commented 12 years ago
comment:38

I can confirm the 1000-fold speed-up. Great!

However, the patch bot finds a few doctest errors. They are clearly related to this patch.

By the way: Could you also provide a test that confirms that '_gb_by_ordering' is preserved when copying/pickling an ideal?

johnperry-math commented 12 years ago
comment:39

Replying to @simon-king-jena:

However, the patch bot finds a few doctest errors. They are clearly related to this patch.

Thanks for pointing them out. They look easy.

By the way: Could you also provide a test that confirms that '_gb_by_ordering' is preserved when copying/pickling an ideal?

I should have an updated patch out by the end of the day

johnperry-math commented 12 years ago
comment:40

New file fixes the previously broken doctests, and adds a pickling test for __gb_by_ordering.

simon-king-jena commented 12 years ago

Changed work issues from fix tests; is the cache pickled? to Patch commit messages

simon-king-jena commented 12 years ago
comment:41

As much as I understand, both patches need proper commit messages (also containing the ticket number).

novoselt commented 12 years ago
comment:42

The second one definitely needs a non-queue message, but there is no need in the ticket number, it will be added automatically during merge.