sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.32k stars 451 forks source link

Implement multidimensional numerical integration #15492

Open 06bbf81d-f646-4d20-904a-9cca4deefa37 opened 10 years ago

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago

The patch coming with this ticket is a multidimensional numerical integrator, which uses calls to the Cuba library (http://www.feynarts.de/cuba/), or in case of 1-dim integrals, it defaults to the GSL numerical_integral.

The attached code also has an option to try to reduce the dimensionality of the integral by first trying a (multithreaded) symbolic evaluation.

To make the integral evaluation fast, the code first makes an honest effort to compile the integrand (either using _fast_float_ or cython, depending on the input).

The code can output verbose information on all intermediate steps (including the resulting cython code to be compiled; results from the symbolic preprocessing, etc.), which tries to make the evaluation as transparent as possible. This should allow both for debugging, as well as for optimizing the input integrand in slow-to-converge cases.

For most cases I tested it, this implementation is faster than Mathematica's NIntegrate thanks to Cuba and the included symbolic preprocessor.

Note that the patch requires the Cuba library to be installed for Sage. So, I am attaching Cuba as a spkg package with this ticket as well. That spkg package needs to be installed first before building sage with the patch.

CC: @novoselt

Component: numerical

Keywords: numerical integration, cuba

Branch/Commit: u/rws/implement_multidimensional_numerical_integration @ a9b78d7

Issue created by migration from https://trac.sagemath.org/ticket/15492

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago

Description changed:

--- 
+++ 
@@ -4,10 +4,10 @@

 To make the integral evaluation fast, the code first makes an honest effort to compile the integrand (either using `_fast_float_` or cython, depending on the input).

-The code can output verbose information on all intermediate steps (including the resulting cython code to be compiled; results from the symbolic preprocessing, etc.), which tries to make the evaluation as transparent as possible. This should allow both for debugging, as well as for optimizing the input integrand is slow-to-converge cases.
+The code can output verbose information on all intermediate steps (including the resulting cython code to be compiled; results from the symbolic preprocessing, etc.), which tries to make the evaluation as transparent as possible. This should allow both for debugging, as well as for optimizing the input integrand in slow-to-converge cases.

 For most cases I tested it, this implementation is faster than Mathematica's NIntegrate thanks to Cuba and the included symbolic preprocessor.

-Note that the patch requires the Cuba library to be installed for Sage. So, I am sending Cuba as a spkg package in a separate ticket. That spkg package needs to be installed first before building sage with this patch.
+Note that the patch requires the Cuba library to be installed for Sage. So, I am attaching Cuba as a spkg package with this ticket as well. That spkg package needs to be installed first before building sage with the patch.
kcrisman commented 10 years ago
comment:2

Wow, this is interesting - and under active development. Here are annoying things that have nothing to do with the correctness of your code that nonetheless should be brought up.

vbraun commented 10 years ago
comment:3

LGPL is fine.

If you want to touch cuba's build system at all, it would be nice to transition to libtool / automake. There shouldn't be any hand-crafted makefile.in, it should be autogenerated from Makefile.am. Its basically impossible to generate shared libaries by hand. I'm pretty sure your makefile.in won't work on OSX and Solaris. But for starters, you tould just use the cuba static library (static libraries suck, but we could leave enabling shared libraries for the future...)

The code should go into sage.libs.cuba, either a separate directory or just sage/libs/cuba.pyx. Don't add a copy of the cuba header to the Sage library, include the installed header (see make install).

The Sage library code is supposed to make at least an attempt to keep lines at <= 80 chars. Its acceptable to have longer lines, but only if it is necessary to improve readability. E.g. comments and docstrings very rarely have any excuse to be longer than that. In emacs reflowing a paragraph of text is Meta-q (Alt + 'q'), e.g.

Also, PEP8 spacing would be nice.

Line breaks in the doctest should use the new syntax ....::

    sage: foo(123,
    ....:     456)

The multidim_integration function is 1000+ lines long. It really needs to be refactored into subroutines of reasonable size.

Use SR.symbol() for temporary variables whose name you don't care about.

The whole functionality to compile a symbolic expression into C code is nice but unrelated to cuba. It should be moved elsewhere (probably somewhere in sage.ext). It is also sufficiently independed as a task that it could be a separate ticket, you could first implement cuba with fast_float and later add the C/cython callable as an optimization.

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago

updated to install cuba.h

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:4

Attachment: cuba-3.2p0.spkg.gz

Hello all,

Thank you so much for your prompt comments! I hope I address them at least so-so below. I updated the attached patch and spkg files accordingly.

As sort of an excuse, I wanted to say that actually this is my first python (and Sage) code ever, so bear with me ... Here it goes point by point.

Replying to kcrisman's comments:

-- All I can say is that Wikipedia says it is, and a cursory look at the text confirms it ... for me.

Replying to vbraun's comments:

  1. Now, about modularity. Yes, it would be nice if compiling into C can be made automatic for any task (for people who do not want to bother with cython), but I don't think I am up for that task. Note that the cython compilation part of the patch specifically invokes multidim_integration_unit_cube. So as written, it is in fact very Cuba specific. In principle, that call can be replaced to a call to any other functional (in a similar way I did the function substitution using a dummy global function), but it already starts getting too contrived. Maybe at some point I'd think about extending that part of the code into a separate utility, but it will not happen now. However, I agree that that piece of the code can be moved to a separate function within the cuba lib. So, I did that, but again, only within the cuba lib. Also, note that of the original 1000+ lines of multidim_integration, 600 are in fact documentation; and the majority of the remaining 400 lines are simply sanity checks and printing debugging information, which should be pretty self-explanatory ... I'd hope... But ok, I still split the code :)

  2. Nathann Cohen suggested that I should add .rst documentation. I tried adding that in doc/en/reference, but hg says those changes are ignored. Suggestions?

Cheers!

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:6

I found ways to fix the two problems I had before:

  1. After line-folding an example (using ....: ) which need keywords such as 'rel tol', those keywords must be sitting on the first line. Otherwise they have no effect. So, I folded (I hope) all lines neatly that end up in the documentation.

  2. About the rst documentation problem -- I was modifying the directory structure of doc/en/reference/calculus/ instead of just the file index.rst. Now, adding documentation works for me. So, I added it.

Also I expanded the text of the documentation to make clearer why those examples matter. And I moved all imports to top level, so that function calls are not slowed down by those. For some reason %time registers that change (in Wall time), while timeit() does not...

I attach the updated patch.

vbraun commented 10 years ago
comment:7

As an interface that is presentable to the end-user I would propose

sage: integral.symbolic(...)
sage: integral.numerical(...)
sage: integral.multidim(...)
sage: integral.multidim_numerical(...)
sage: integral(...)   # auto-guess

The cuba library interface shouldn't do any symbolic integration. That is not only unrelated, most numerical options also don't make sense for symbolic integration.

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:8

Let me for now address only your first comment, since the rest are secondary issues.

I completely agree with you that such a split of the interface must eventually happen. However, at the practical level, I do not think it is within the range of my current sage abilities or that I have enough motivation to do it. Here is why:

I've had way too many issues with integrate() to even start thinking about a multidimensional symbolic integrator. Here is a laundry list of problems I've faced in case anybody cares to take up the problem:

  1. If there are parameters in the integral then there is no easy way to pass answers to maxima, when it starts asking "Is positive, negative or zero?", when is some mess (which usually implies that using assume() to fix it won't work), which for multidimensional integrals is what you usually get in my experience. If there are no parameters, then all you care about is getting some number -- so, there is not much point in the analytical answer itself (in my opinion of course).

  2. In many cases I tried (after transforming to the unit cube), maxima gives back an answer only once I avoid the boundaries of the interval and use the principal value of the integral. But to avoid them I need to introduce an epsilon,i.e. a parameter -- which brings me back to point 1 above.

  3. sympy.integrate is actually in many cases much nicer (in my opinion), giving back piecewise answers, having a somewhat better handle on special functions such as Ci(x), etc. But sage doesn't support the piecewise answers; and SR(sympy.Ci(x)) gives an error, even though sage has Ci(x). E.g. try this: integrate(cos(x)/x,x,algorithm='sympy')

  4. If the analytical integral is not doable, then the question remains what should the symbolic result be. Just the first function multidim_analytical_integration_unit_cube returns? Or all of its results as a list? And handling anything but constant intervals (as over the unit cube in the current implementation) becomes somewhat of a mess, unless you transform your intervals to the unit cube. But in that latter case, simple integrands with simple (but non-numeric) integration intervals very often become horrible messes when transformed to the unit cube, which even Mathematica can't handle.

So, really, the current multidim_analytical_integration_unit_cube is the best I can hope for for now. And as a comparison, Mathematica also has such a SymbolicProcessing option for its NIntegrate, which (at least at the user level) has nothing to do with doing multidimensional integrals with Integrate.

So, maybe a compromise would be simply to hide multidim_analytical_integration_unit_cube with an underscore and not advertise it at all to the end-user? Let me know what you think.

vbraun commented 10 years ago
comment:9

I don't have a good answer to symbolic integration, as you say symbolic multi-dimensional integration is usually a mess unless there is some symmetry to exploit.

As for 4), you can have un-evaluated symbolic expressions:

sage: x.add(x, hold=True)
x + x
sage: sage.symbolic.integration.integral.indefinite_integral(x, x, hold=True)
integrate(x, x)

So my suggestion is (and thats what I meant to say initially) to just focus on the numerical integration and interfacing with cuba here. Anything else can go to a separate ticket. We don't even have to think too much about a user-interface yet, which is what you have done anyways in the current version where you have to explicitly import from sage.libs.cuba. The integral() command can then later be enhanced to make it easy to use.

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:10

So, I guess the issue is immediate usability vs. code cleanliness. I myself clearly vote for the former -- I'd still like to include the symbolic preprocessing option with this patch. The option is really useful I think. Without it many integrals that I happen to stumble upon in my work are completely undoable numerically (which was exactly the reason I introduced it in the first place). That option is not used by default, so if anybody uses the integrator, the symbolic part would never kick in unless they actually enable it on purpose, which would mean they were desperate (as I was ...).

I do realize the symbolic option is unrelated to cuba, and I have no problem if at some point that part of the code is moved to a separate lib, once an actual symbolic multidim integrator is put in place. It is well separated in the code, so transitioning to a separate lib should not be an issue in terms of code maintainance. So, even if you favor code cleanliness, I don't think that the presence of that option would break long-term development/support.

Now, of course I'm just a first time contributor, and only a month-long user of sage, which will soon be a decade old. So, I'll understand if you still decide to kill the option.

vbraun commented 10 years ago
comment:11

Ok, it wasn't clear to me that the symbolic preprocessing is that important. It is still unrelated to cuba, so how about moving it to sage.libs.cuba.symbolic_preprocessing.py or something like that? Otherwise it makes it just harder to debug and read the cuba.pyx file. Similar for C/Cython source code generation.

It also would be nice to split up multidim_integration some more. For example, the entire if (symbolic) block would be much better as a separate function: Makes multidim_integration shorter and easier to read if you don't have to juggle with temporary variables funcS, newvars. E.g.

   if symbolic:
       try:
           return multidim_integration_symbolic(...)
       except SymbolicTimeout:
           pass
06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:12

Thank you for all the useful comments, Volker!

Below are my modifications in response to your comments.

Best, Svetlin

-- Done.

-- moved to new file: cython_compilation.pyx and tested.

-- There are no dynamically loaded modules in the code (at least as far as I understand the definition). I moved all of them to top level so that loading modules does not slow down the code.

-- switched to eval

-- fixed. Removed most try statements, which I realized I only needed in the code development.

-- fixed.

-- Thanks! Moved to new file: symbolic_preprocessing.pyx. Added underscore in front of _multidim_analytical_integration_unit_cube.

-- Moved to new file: cython_compilation.pyx

-- Cleaned that part of the code a lot. Temp vars were actually left over cruft, which I removed. Now that piece of the code just checks whether integrand is constant and returns; or not, and continues.

Also, made additional minor tweaks and fixed a bug I had introduced at some point for Python function integrands, and for which I was missing an important test.

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago

updating patch according to Volker Braun's comments

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago
comment:13

Attachment: trac_15492_multidim_numerical_integrator.patch.gz

-- Updating patch against sage 5.13.

-- Removing left over unnecessary imports.

06bbf81d-f646-4d20-904a-9cca4deefa37 commented 10 years ago

Attachment: trac_15492_multidim_numerical_integrator.2.patch.gz

patch against sage 5.13

rwst commented 10 years ago

Branch: u/rws/implement_multidimensional_numerical_integration

rwst commented 10 years ago
comment:17

I have made a branch so that people can work better with your code. It however depends on the existence of the cuba package, which isn't even optional at the moment. So this needs work in that both the package needs to be included in Sage (please open a separate ticket) and this code should fail gracefully if the package is not install. This is all detailed in the developer manual.


New commits:

a9b78d7Trac 15492: add multidim numerical integration
rwst commented 10 years ago

Commit: a9b78d7

c22b6800-ec0b-4cbf-94c4-0a74aecc2093 commented 7 years ago
comment:19

+1 for having Monte-Carlo integration in Sage.

I recently needed this functionality, and resorted to using the Cubature package; Python bindings exist, and I found the interface quite nice. I guess the difficult part would be to rewrite cubature.py for Sage..

stassev commented 1 year ago

After this many years, I updated the code. I refactored it to drop the CUBA dependency and instead to use the internal SageMath integrator.

The code can do symbolic and numerical integration on non-trivial domains.

I posted the code here:

https://github.com/stassev/sagemath-multidimensional-integration

If people are still interested, feel free to incorporate it into SageMath. I can try to help as time permits.