sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.32k stars 451 forks source link

Cache assumptions and only send to Maxima when needed #23138

Open 94a1bc6f-6005-414c-9598-64422d48cc95 opened 7 years ago

94a1bc6f-6005-414c-9598-64422d48cc95 commented 7 years ago

As described in https://groups.google.com/forum/#!topic/sage-devel/jN6inWPyElM, assume() takes more and more time the bigger the assumptions() data base is. This causes a lot of slow-downs when e.g. declaring variables with a domain argument. Nils Bruin suggested that this is due to excessive interactions with the Maxima library and Ralf Stephan suggested that the assumptions could be cached and only sent to Maxima when needed, to speed up the process.

Depends on #23325

CC: @rwst @egourgoulhon

Component: performance

Keywords: Maxima, symbolics, assume

Author: Ralf Stephan

Branch/Commit: u/rws/cache_assumptions_and_only_send_to_maxima_when_needed @ 10c31a1

Issue created by migration from https://trac.sagemath.org/ticket/23138

rwst commented 7 years ago
comment:1

Thanks. The Author field is for the developer, you're the Reporter as you can see top left.

rwst commented 7 years ago

Changed author from schymans to none

rwst commented 7 years ago
comment:3

The "more persistent domains() database" exists already in part as GiNaC/Pynac info flags that are set in parallel to Maxima's assumptions. They can be queried with ex.is_real() etc...What is not saved in Pynac are less elementary assumptions like x>1, y+z==pi. Now instead of caching all assumptions in a database (either Python or C++) and sending to Maxima on demand in bulk, another possibility could be, as you say, to just remove the assume calls on variable creation because they are all elementary assumptions. Then when Maxima needs them for integration, solving etc take the information from Pynac and do assumes for just those variables that are needed. Am I missing something?

rwst commented 7 years ago

Description changed:

--- 
+++ 
@@ -1 +1 @@
-As described in https://groups.google.com/forum/#!topic/sage-devel/jN6inWPyElM, `assume()` takes more and more time the bigger the `assumptions()` data base is. This causes a lot of slow-downs when e.g. declaring variables with a `domain` argument. Nils Bruin suggested that this is due to excessive interactions with the Maxima library and Ralf Stefan suggested that the assumptions could be cached and only sent to Maxima when needed, to speed up the process.
+As described in https://groups.google.com/forum/#!topic/sage-devel/jN6inWPyElM, `assume()` takes more and more time the bigger the `assumptions()` data base is. This causes a lot of slow-downs when e.g. declaring variables with a `domain` argument. Nils Bruin suggested that this is due to excessive interactions with the Maxima library and Ralf Stephan suggested that the assumptions could be cached and only sent to Maxima when needed, to speed up the process.
94a1bc6f-6005-414c-9598-64422d48cc95 commented 7 years ago
comment:4

@rwst, that would be awesome. My current workaround is not to pass the domain information to var() at all, but just save it in a separate assumptions database, in case it is needed at some point. Unfortunately, this also prevents the domain information from being passed to GiNaC/Pynac. Removing the assume() calls during variable creation would be a neater way of going about this. The problem that assume() takes longer and longer the more assumptions have already been passed, could then be approached independently. Should I try to remove the assume() calls, run the doctests and try to create a patch, or could someone else? Sorry about my ignorance regarding the development processes for SageMath.

rwst commented 7 years ago
comment:5

Replying to @sagetrac-schymans:

...Should I try to remove the assume() calls, run the doctests and try to create a patch, or could someone else?

Only removing the calls will make all doctests fail that rely on different variable domains than complex with operations that use Maxima, like integration and solving equations. So additional code is needed. I'll look into it. Of course you can change it in your local copy if you don't need these operation. However it *is possible that other things break.

94a1bc6f-6005-414c-9598-64422d48cc95 commented 7 years ago
comment:6

I thought that in most doctests relying on other variable domains in maxima those would be passed as assume() anyway. I haven't really seen the var(..., domain=...) in use. Is there a way to search all doctests for domain=?

rwst commented 7 years ago
comment:7

Replying to @sagetrac-schymans:

I thought that in most doctests relying on other variable domains in maxima those would be passed as assume() anyway. I haven't really seen the var(..., domain=...) in use. Is there a way to search all doctests for domain=?

ralf@ark:~/sage> grep --recursive -l 'sage: .*var(.*domain=' src/sage/ |grep 'py$\|pyx$'
src/sage/misc/functional.py
src/sage/symbolic/ring.pyx
src/sage/symbolic/assumptions.py
src/sage/symbolic/expression.pyx
src/sage/tensor/modules/free_module_tensor.py
src/sage/tensor/modules/free_module_alt_form.py
src/sage/tensor/modules/comp.py
src/sage/geometry/riemannian_manifolds/parametrized_surface3d.py
src/sage/functions/other.py
src/sage/functions/hyperbolic.py
src/sage/functions/log.py
src/sage/functions/trig.py
src/sage/calculus/wester.py
src/sage/calculus/var.pyx
94a1bc6f-6005-414c-9598-64422d48cc95 commented 7 years ago
comment:8

OK, the doctests do indeed use domain= in var() instead of assume() on many occasions, so this would not work. It is a bit strange, though, to populate the assumptions() data base through the var() call, as assumptions() can be modified and deleted at any time, whereas the GiNaC domain setting is persistent. I think it would be cleaner if all code passed to maxima was accompanied by its own assumptions, which would be partly derived from the GiNaC variable properties and partly user-defined. Would this require changing every single method that calls maxima?

rwst commented 7 years ago
comment:9

Replying to @rwst:

Replying to @sagetrac-schymans:

...Should I try to remove the assume() calls, run the doctests and try to create a patch, or could someone else?

Okay, it's not as simple as that.

rwst commented 7 years ago

Branch: u/rws/cache_assumptions_and_only_send_to_maxima_when_needed

rwst commented 7 years ago

Dependencies: pynac-0.7.9

rwst commented 7 years ago

Author: Ralf Stephan

rwst commented 7 years ago
comment:11

The first commit prevents calls to Maxima, so it should result in a speedup. There is however a bug in Pynac (fixed in 0.7.9) that prevents it from working correctly. With the fix a few doctests fail, so this needs the planned injection of variable domains (EDITED).


New commits:

10c31a123138: don't call Maxima with new symbols
rwst commented 7 years ago

Commit: 10c31a1

94a1bc6f-6005-414c-9598-64422d48cc95 commented 7 years ago
comment:12

Following on from https://groups.google.com/forum/#!topic/sage-devel/-A8ZzSKvYsA, assume() sends assumptions both to maxima and Pynac. In addition, assume() tests if the assumptions are consistent, which is probably not needed when we define a new variable. So, instead of calling assume() in var() maybe we should just replace it by the same as is done in assume() (https://github.com/sagemath/sage-prod/blob/master/src/sage/symbolic/expression.pyx#L1785), i.e. maxima.assume().

Therefore, instead of removing send_sage_domain_to_maxima as in the patch, I would propose to substitute maxima.assume() for all assume() calls in send_sage_domain_to_maxima() (https://github.com/sagemath/sage-prod/blob/master/src/sage/symbolic/ring.pyx#1017). Did I miss something?

Replying to @rwst:

The first commit prevents calls to Maxima, so it should result in a speedup. There is however a bug in Pynac (fixed in 0.7.9) that prevents it from working correctly. With the fix a few doctests fail, so this needs the planned injection of variable domains (EDITED).


New commits:

10c31a123138: don't call Maxima with new symbols
rwst commented 7 years ago
comment:13

Replying to @sagetrac-schymans:

Therefore, instead of removing send_sage_domain_to_maxima as in the patch, I would propose to substitute maxima.assume() for all assume() calls in send_sage_domain_to_maxima() (https://github.com/sagemath/sage-prod/blob/master/src/sage/symbolic/ring.pyx#1017). Did I miss something?

What takes the time with inconsistency checking is maxima.assume() so there would be no difference.

egourgoulhon commented 7 years ago
comment:14

Replying to @rwst:

Replying to @sagetrac-schymans:

I thought that in most doctests relying on other variable domains in maxima those would be passed as assume() anyway. I haven't really seen the var(..., domain=...) in use. Is there a way to search all doctests for domain=?

Just to mention that var(..., domain='real', ...) is used in the code (not doctest part) for manifolds (specifically real manifolds); see line 1449 of src/sage/manifolds/chart.py.

rwst commented 7 years ago

Changed dependencies from pynac-0.7.9 to #23325

rwst commented 7 years ago
comment:16

Setting to review in order to get a patchbot assessment, now that the abovementioned Pynac fix is in develop. Please set back afterwards.

rwst commented 7 years ago
comment:17

See patchbot log.