Faster enumeration of points for toric varieties

vbraun commented 10 years ago

This ticket implements a specialized enumerator for points of a toric variety over a finite field. It uses linear algebra to solve the rescaling relations. It also parallelizes the enumeration of points for subvarieties of toric varieties.

For example:

$ cat quintic.sage 
p = 3
for k in range(1, 5):
    F = GF(p^k, 'a')
    P4.<z0,z1,z2,z3,z4> = toric_varieties.P(4, base_ring=F)
    X = P4.subscheme(z0^5+z1^5+z2^5+z3^5+z4^5-5*z0*z1*z2*z3*z4)
    print k, X.point_set().cardinality()

$ time sage quintic.sage 
1 36
2 816
3 20691
4 688800

real 14m37.378s
user 136m37.989s
sys 0m18.268s

CC: @novoselt @jpflori @sagetrac-ursula

Component: algebraic geometry

Author: Volker Braun

Branch: 49a7241

Reviewer: Ursula Whitcher

Issue created by migration from https://trac.sagemath.org/ticket/16953

vbraun commented 10 years ago

Branch: u/vbraun/faster_enumeration

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

`0a15c30`	`Specialized point enumerator for finite fields`
`6f6951d`	`speed up cardinality over finite field`
`266d921`	`Parallel enumerator for algebraic subvarieties`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Commit: 266d921

vbraun commented 10 years ago

Description changed:

--- 
+++ 
@@ -1 +1 @@
-This ticket implements a specialized enumerator for points of a toric variety over a finite field. It uses linear algebra to solve the rescaling relations.
+This ticket implements a specialized enumerator for points of a toric variety over a finite field. It uses linear algebra to solve the rescaling relations. It also parallelizes the enumeration of points for subvarieties of toric varieties.

vbraun commented 10 years ago

Description changed:

--- 
+++ 
@@ -1 +1,23 @@
-This ticket implements a specialized enumerator for points of a toric variety over a finite field. It uses linear algebra to solve the rescaling relations. It also parallelizes the enumeration of points for subvarieties of toric varieties.
+This ticket implements a specialized enumerator for points of a toric variety over a finite field. It uses linear algebra to solve the rescaling relations. It also parallelizes the enumeration of points for subvarieties of toric varieties. 
+
+For example:
+
+```
+$ cat quintic.sage 
+p = 3
+for k in range(1, 5):
+    F = GF(p^k, 'a')
+    P4.<z0,z1,z2,z3,z4> = toric_varieties.P(4, base_ring=F)
+    X = P4.subscheme(z0^5+z1^5+z2^5+z3^5+z4^5-5*z0*z1*z2*z3*z4)
+    print k, X.point_set().cardinality()
+
+$ time sage quintic.sage 
+1 36
+2 816
+3 20691
+4 688800
+
+real   14m37.378s
+user   136m37.989s
+sys    0m18.268s
+```

cd756ef5-7187-4c75-ba26-b9b3d6a11074 commented 9 years ago

comment:5

Asking for the cardinality of a toric variety raises a Not Implemented Error:

P4.<z0,z1,z2,z3,z4> = toric_varieties.P(4)
11
P4.cardinality()
12
Error in lines 2-2
Traceback (most recent call last):
  File "/projects/fbc7bfa1-2c3e-4abc-b8d5-3a0b959a311d/.sagemathcloud/sage_server.py", line 865, in execute
    exec compile(block+'\n', '', 'single') in namespace, locals
  File "", line 1, in <module>
  File "/projects/fbc7bfa1-2c3e-4abc-b8d5-3a0b959a311d/sage-6.3-x86_64-Linux/local/lib/python2.7/site-packages/sage/categories/sets_cat.py", line 1322, in cardinality
    raise NotImplementedError("unknown cardinality")
NotImplementedError: unknown cardinality

Should P4.cardinality() have the same output as P4.point_set().cardinality()?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 266d921 to fcfc680

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

fcfc680 support the cardinality method from the category stuff

vbraun commented 9 years ago

comment:7

The cardinality method is just inherited from the category framework. I've hooked it up to do the right thing.

cd756ef5-7187-4c75-ba26-b9b3d6a11074 commented 9 years ago

comment:8

When I install this ticket with Sage 6.7, I get the following error on build:

Traceback (most recent call last): File "setup.py", line 559, in run_cythonize() File "setup.py", line 551, in run_cythonize 'profile': profile, File "/projects/fbc7bfa1-2c3e-4abc-b8d5-3a0b959a311d/sage-6.7-x86_64-Linux/local/lib/python2.7/site-packages/Cython-0.22-py2.7-linux-x86_64.egg/Cython/Build/Dependencies.py", line 754, in cytho nize aliases=aliases) File "/projects/fbc7bfa1-2c3e-4abc-b8d5-3a0b959a311d/sage-6.7-x86_64-Linux/local/lib/python2.7/site-packages/Cython-0.22-py2.7-linux-x86_64.egg/Cython/Build/Dependencies.py", line 649, in creat e_extension_list for file in nonempty(extended_iglob(filepattern), "'%s' doesn't match any files" % filepattern): File "/projects/fbc7bfa1-2c3e-4abc-b8d5-3a0b959a311d/sage-6.7-x86_64-Linux/local/lib/python2.7/site-packages/Cython-0.22-py2.7-linux-x86_64.egg/Cython/Build/Dependencies.py", line 103, in nonem pty raise ValueError(error_msg) ValueError: 'sage/tests/parallel.pyx' doesn't match any files

cd756ef5-7187-4c75-ba26-b9b3d6a11074 commented 9 years ago

comment:9

The actual functionality works fine, albeit slowly. I would find it helpful if there were a way to save partial progress when computing a point set: I'm interested in examples that are large enough that the Sage cloud may kill the process before I'm done.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from fcfc680 to 49a7241

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`126240b`	`Merge 6.8.beta0 into #16953`
`49a7241`	`Remove the cardinality doctests`

vbraun commented 9 years ago

comment:11

I fixed the merge conflict. You probably need to run "make" when building with the patch.

videlec commented 9 years ago

comment:12

Hello,

Some quick comments (I will not be able to check most of the mathematical contents of the code).

Some cached method are really not needed. The multiplicative generator of finite field is already cached. So you could remove the FiniteFieldPointEnumerator.multiplicative_generator. And the multiplicative order of the multiplicative generator is q-1 (where q is the cardinality of the field), isn't it?

Instead of using CartesianProduct, use itertools.product (actually this is indirectely used). CartesianProduct is about to disappear.

If you want to copy a list, the fastest is by far zero[:] and not copy(zero).

sage: timeit("copy(zero)")
625 loops, best of 3: 2.59 µs per loop
sage: timeit("zero[:]")
625 loops, best of 3: 248 ns per loop

The fastest way for zero test is X.is_zero() and not X == 0. The latter might involves coercion.

sage: K = GF(53**3, 'a')
sage: b = K('a')+3*K('a')**2+12
sage: timeit("b == 0")
625 loops, best of 3: 2.17 µs per loop
sage: timeit("b.is_zero()")
625 loops, best of 3: 317 ns per loop

Vincent

PS: the hook for modifying the category code was very funny! Hopefully I erase the Sets.ParentMethods.cardinality before this ticket gets in ;-)

vbraun commented 9 years ago

comment:13

IMHO copy(zero) is much more readable that zero[:]. And X.is_zero() doesn't work for all numeric datatypes. Both are premature optimizations for something that isn't even cythonized.

Also, there isn't even a patch that deprecates CartesianProduct. You are getting ahead of yourself.

cd756ef5-7187-4c75-ba26-b9b3d6a11074 commented 9 years ago

comment:14

I got this working with Sage 6.8; it behaves as advertised.

videlec commented 9 years ago

comment:15

Replying to @vbraun:

IMHO copy(zero) is much more readable that zero[:]. And X.is_zero() doesn't work for all numeric datatypes. Both are premature optimizations for something that isn't even cythonized.

And you forgot that copy(X) needs from copy import copy which is very readable. Doing [zero]*n each time would even be faster...

Also, there isn't even a patch that deprecates CartesianProduct. You are getting ahead of yourself.

Really? what about #18411? Citation:

sage.combinat.cartesian_product.CartesianProduct is deprecated.

And these were not my only remarks...

vbraun commented 9 years ago

comment:16

Replying to @videlec:

Replying to @vbraun:

IMHO copy(zero) is much more readable that zero[:].ething that isn't even cythonized.

And you forgot that copy(X) needs from copy import copy which is very readable

I don't see how mixing up function bodies with import statements is ever going to yield a viable argument. In any case, I stand by my opinion that copy(zero) is much more readable that zero[:].

Really? what about #18411?

There is no code on that ticket, and even if there were it depends on unfinished tickets. Its also unclear to me if it is ever going to be merged as some people have previously expressed a strong opinion on sage-devel for something that is named after cartesian product instead of itertools.product. In any case, this ticket long predates #18411 and is ready to be merged right now.

videlec commented 9 years ago

comment:17

Replying to @vbraun:

Replying to @videlec:

Replying to @vbraun:

IMHO copy(zero) is much more readable that zero[:].ething that isn't even cythonized.

And you forgot that copy(X) needs from copy import copy which is very readable

I don't see how mixing up function bodies with import statements is ever going to yield a viable argument. In any case, I stand by my opinion that copy(zero) is much more readable that zero[:].

Really? what about #18411?

There is no code on that ticket, and even if there were it depends on unfinished tickets. Its also unclear to me if it is ever going to be merged as some people have previously expressed a strong opinion on sage-devel for something that is named after cartesian product instead of itertools.product. In any case, this ticket long predates #18411 and is ready to be merged right now.

Then there is cartesian_product_iterator.

What about all your useless cached_method?

vbraun commented 9 years ago

comment:18

Replying to @videlec:

Then there is cartesian_product_iterator.

How about you discuss what to do with cartesian products on #18411?

What about all your useless cached_method?

They give names to quantities. Its true that you can replace the multiplicative order with q-1 everywhere, but it is not true that q-1 in a formula always refers to the order. Giving names to quantities makes code more readable as it explains the "why". And I'm using the general pattern of caching frequently-accessed methods, an extra cached_method has no performance impact but a forgotten cached_method can be very bad. Also, why should you have to read the finite fields implementation just to decide whether to cache something in this module?

videlec commented 9 years ago

comment:20

Replying to @vbraun:

Replying to @videlec:

What about all your useless cached_method?

They give names to quantities. Its true that you can replace the multiplicative order with q-1 everywhere, but it is not true that q-1 in a formula always refers to the order. Giving names to quantities makes code more readable as it explains the "why".

What annoys me is that you add 20 lines of code + documentation + a non documented function for something which can be replaced by self.ring.cardinality() - 1 and self.ring.multiplicative_generator() for which the names are not at all mysterious.

And I'm using the general pattern of caching frequently-accessed methods, an extra cached_method has no performance impact but a forgotten cached_method can be very bad.

cached_method can also have negative impact on performances (for not that frequently-accessed methods). Creating dictionaries is costly, object creation is costly, function call is costly.

Also, why should you have to read the finite fields implementation just to decide whether to cache something in this module?

Hopefully I did it for you. It is clear that multiplicative_generator is something critical in many application and it looks natural to cache this method at the level of finite fields. And as a finite field is mostly defined by its cardinality, you can expect the method cardinality to be fast.

Vincent

vbraun commented 9 years ago

comment:21

Replying to @videlec:

What annoys me is that you add 20 lines of code + documentation + a non documented function for something which can be replaced by self.ring.cardinality() - 1

Let me copy&paste what I sad before: Its true that you can replace the multiplicative order with q-1 everywhere, but it is not true that q-1 in a formula always refers to the order

cached_method can also have negative impact on performances (for not that frequently-accessed methods). Creating dictionaries is costly, object creation is costly, function call is costly.

Benchmark or it is simply not true. There is no point in wasting effort on O(1) things. In fact it saves the .ring dictionary lookup but I didn't want to point that out since I consider that a stupid argument.

Also, why should you have to read the finite fields implementation just to decide whether to cache something in this module?

Hopefully I did it for you.

You are missing the point, it is for the benefit of somebody reading the code in the future.

cd756ef5-7187-4c75-ba26-b9b3d6a11074 commented 9 years ago

comment:22

As the person most interested in the functionality of this ticket, let me reiterate that the ability to save partial progress would be far more important to me than optimization around the edges.

vbraun commented 9 years ago

Reviewer: Ursula Whitcher

vbraun commented 9 years ago

comment:24

I hear you, and I agree that checkpointing is an important feature. But its not really clear how that should be implemented or how the interface would look like. I guess the iterator is (or can be made) pickleable, but really checkpointing needs to be added on the @parallel level. In the short run its probably easier to email William to request a longer timeout on your project...

vbraun commented 9 years ago

Changed branch from u/vbraun/faster_enumeration to 49a7241

fchapoton commented 9 years ago

Description changed:

--- 
+++ 
@@ -17,7 +17,7 @@
 3 20691
 4 688800

-real   14m37.378s
-user   136m37.989s
-sys    0m18.268s
+real 14m37.378s
+user 136m37.989s
+sys 0m18.268s

fchapoton commented 9 years ago

Changed commit from 49a7241 to none

sagemath / sage

Faster enumeration of points for toric varieties #16953