drhagen / scipy_ode

Collaboration repository for redesigning scipy ODE solvers
BSD 3-Clause "New" or "Revised" License

Handle multistep methods #14

Closed: drhagen closed this issue 7 years ago

drhagen commented 8 years ago

I realized one possible limitation of the current design of the step solvers: they don't make it clear how to write a multistep method. Right now, solve_ivp assumes that the only state needed to evaluate the solution at t is the state immediately before t and the state immediately after t. Those are the only states that the integrator keeps and sends to the interpolator in order to do event detection.

Possible solutions:

  1. Keep things the way they are. Each state of a multistep method should keep all the past state that it needs. It may seem that this is wasteful in terms of memory, but it is not actually that bad because the past state will only be pointers to the same shared arrays.
  2. Each state keeps a pointer to the immediately previous state object, like a linked list. Multistep methods can easily go back in time to get the state that they need and no changes are needed to the integrators. However, this makes it impossible to keep only a part of a solution. Memory would no longer be freed by dropping the first 90% of states because the remaining states would always keep a pointer all the way back to the beginning.
  3. Add a field called step_memory or similar to the solver. This would be 1 for one-step methods and so forth. It would be the responsibility of the integrator to read this field and send enough extra state to satisfy the first state that is needed for the requested time.
  4. Add the step_memory field to the state, because the number of past steps required might vary from step to step. There is little to be gained by this method over the previous method because step_memory on the solver could be set to the maximum number ever needed by that method, and taking a few extra states is harmless.
  5. Remove interpolator and have step return an interpolator for the previous step. The solver needs to keep enough past state anyway in order to evaluate the next point. When doing event handling, these small interpolators all need to be built anyway at each step. The integrator would be responsible for evaluating the correct time window. This method has the additional benefit of simplifying the interface by allowing the OdeState class to be removed.
  6. Change interpolator to take no arguments and return an interpolator only for the previous step. This is exactly the same as the previous method except that it allows the integrator to avoid the cost of building the interpolator if it is not needed (this will happen in a solve_ivp_discrete method). This will require that the solver store the previous state as well as the current state, but I don't perceive that to be a problem. This also allows OdeState to be removed.
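
To make option (6) concrete, here is a minimal sketch of what that interface could look like, using a toy explicit Euler method with linear interpolation; the class and method names are illustrative, not the actual repository API:

```python
import numpy as np

class EulerSolver:
    """Toy illustration of option (6): step() advances the solution while
    keeping the previous state, and interpolator() builds a callable for
    the step just taken only when it is requested."""

    def __init__(self, fun, t0, y0, h):
        self.fun, self.h = fun, h
        self.t, self.y = t0, np.asarray(y0, dtype=float)
        self.t_old = self.y_old = None

    def step(self):
        # Keep the previous state so the last-step interpolant can be built.
        self.t_old, self.y_old = self.t, self.y
        self.y = self.y + self.h * self.fun(self.t, self.y)
        self.t = self.t + self.h

    def interpolator(self):
        # Linear interpolation over [t_old, t]; a real method would return
        # its own dense-output polynomial instead.
        t0, t1, y0, y1 = self.t_old, self.t, self.y_old, self.y
        return lambda t: y0 + (t - t0) / (t1 - t0) * (y1 - y0)
```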
aarchiba commented 8 years ago

My vote is for option 4; I don't think there's any essential global structure to an interpolator, so constructing one per step shouldn't cost any extra. This would be different if the interpolators were like cubic splines, where continuity of derivatives is imposed as a global constraint, but all dense-output methods I know of obtain their continuity locally, from the solver state. That means ODE solution objects need to decide which step the input point falls into and forward the evaluation to the appropriate interpolator; this is what happens inside PPoly objects, but they lack the generality to handle all the dense-output interpolators.

drhagen commented 8 years ago

I personally like (6), (4), and (1), in that order. My biggest concern with (5) and (6) is performance, even though there is something elegant about hiding the concept of state entirely.

nmayorov commented 8 years ago

I have now studied the BDF multistep method well enough and want to share my opinion.

The state of a practical BDF solver (as well as Adams-type solvers) contains k + 1 parameters (k is the number of steps) which define the interpolating polynomial (there are several different implementations), so the state alone is enough to compute the interpolating polynomial. It means that the concept of an interpolator that interpolates between states is not very helpful in practice.

As for the general approach for interpolating, I think:

  1. A solver should return the most recent interpolator.
  2. A solver should build the global interpolator (by stacking coefficients). This is necessary to make an efficient interpolator object (PPoly, see below).

Both are optional. I believe that our best bet is PPoly, because it provides efficient evaluation, and I would argue that conceptually it is the universal approach, even though representing the interpolator in the local power basis may be suboptimal for numerical evaluation. I mean that PPoly is fast, tested, and already available, whereas writing a new "local interpolator stacker" object will take time and won't be efficient if we are going to assume that the local interpolators are arbitrary functions.
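
For concreteness, a minimal sketch of what "stacking coefficients" into a global PPoly could look like; the step boundaries and per-step coefficients below are made up for illustration:

```python
import numpy as np
from scipy.interpolate import PPoly

# Hypothetical per-step data: each step produced local polynomial
# coefficients (highest degree first) in the power basis relative to
# the left end of its interval.
breakpoints = np.array([0.0, 0.1, 0.25, 0.5])   # step boundaries
step_coeffs = [
    np.array([1.0, 2.0, 0.5]),    # coefficients on [0.0, 0.1]
    np.array([0.9, 2.1, 0.7]),    # coefficients on [0.1, 0.25]
    np.array([1.1, 1.9, 0.6]),    # coefficients on [0.25, 0.5]
]

# "Stacking" just means placing the local coefficients side by side:
# c has shape (poly order + 1, number of steps).
c = np.column_stack(step_coeffs)

global_interp = PPoly(c, breakpoints)
print(global_interp(np.array([0.05, 0.3])))  # vectorized evaluation
```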

I know I didn't express it all very well, but I just wanted to finish this message now.

drhagen commented 8 years ago

The state of a practical BDF solver (as well as Adams-type solvers) contains k + 1 parameters (k is the number of steps) which define the interpolating polynomial (there are several different implementations), so the state alone is enough to compute the interpolating polynomial. It means that the concept of an interpolator that interpolates between states is not very helpful in practice.

So I understand that this is a solver whose value between two time points t1 and t2 depends on states at points before t1. Does it depend on all the points before that, or just a finite number of them? Basically, is k the total number of completed steps or a parameter of the method? If it is all of them

Both are optional. I believe that our best bet is PPoly, because it provides efficient evaluation, and I would argue that conceptually it is the universal approach, even though representing the interpolator in the local power basis may be suboptimal for numerical evaluation.

Isn't the required interpolator defined by the method? We can't require a specific interpolator be used without eliminating a bunch of solvers, right? Or are you talking just about the BDF family?

It means that the concept of an interpolator that interpolates between states is not very helpful in practice.

You have to have some ability to make an interpolator between the last two states in order to do event detection. So every solver must support this capability somehow. And once you have this capability, you can trivially save the solution at all time points by saving those interpolators after each step. Now, there may be performance considerations in these two cases: (1) you don't have events and building the interpolators one at a time is more expensive than saving all the states and building the interpolator once, (2) even if you have events, evaluating the saved step interpolators is more expensive than also saving all the states and rebuilding the global interpolator and evaluating that instead. I think (1) could be true, which suggests that option 4 is best; I don't think (2) could be true.
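
As an aside, event detection with a per-step interpolant amounts to root finding on that callable; a minimal sketch under the assumption that the solver provides a dense callable for the last step (the function name and signature are illustrative):

```python
from scipy.optimize import brentq

def detect_event(event, dense, t_old, t_new):
    # event(t, y) is the user's event function; dense(t) is the interpolant
    # for the step [t_old, t_new].  Return the event time, or None if the
    # event function does not change sign within the step.
    g_old = event(t_old, dense(t_old))
    g_new = event(t_new, dense(t_new))
    if g_old * g_new > 0:
        return None
    return brentq(lambda t: event(t, dense(t)), t_old, t_new)
```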

nmayorov commented 8 years ago

So I understand that this is a solver whose value between two time points t1 and t2 depends on states at points before t1. Does it depend on all the points before that, or just a finite number of them? Basically, is k the total number of completed steps or a parameter of the method? If it is all of them

Theoretically it depends on k past values of y(x). My point was that the solver doesn't store these points explicitly; instead it uses another, more convenient parametrization (divided differences or the "Nordsieck array"). And the main point is that we should not think of the state of a multistep method as a list of "states of a one-step method". Is that clearer now?
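
To illustrate (a minimal sketch, assuming a Nordsieck-style parametrization; the helper function is hypothetical), the interpolating polynomial can be evaluated from the solver state alone, with no list of past states:

```python
def nordsieck_interpolate(z, t0, h, t):
    # z is a Nordsieck array of shape (k + 1, n), with z[j] ~ h**j * y^(j)(t0) / j!.
    # The interpolating polynomial at a scalar time t is sum_j z[j] * s**j,
    # where s = (t - t0) / h, so the array alone defines the interpolant.
    s = (t - t0) / h
    return sum(z[j] * s**j for j in range(len(z)))
```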

Isn't the required interpolator defined by the method? We can't require a specific interpolator be used without eliminating a bunch of solvers, right? Or are you talking just about the BDF family?

Yes, we can assume that a solver provides some general interpolator. But I suggest relying on the PPoly interpolator in our solvers and constructing the global interpolator inside the solver (by stacking coefficients). This way we stay general, but can easily implement an efficient interpolator.

You have to have some ability to make an interpolator between the last two states in order to do event detection.

I'm arguing that interpolation between states doesn't always make sense in practice. As I said before --- the state of BDF is already an interpolator, so we don't need two states to interpolate. It leads me to believe that the interpolator itself should be provided by the solver, not the states, because the interpolation mechanism differs for different solvers.

And once you have this capability, you can trivially save the solution at all time points by saving those interpolators after each step.

My main arguments against that are efficiency and the cost of implementing it. You see, PPoly already implements the mechanism of finding the necessary intervals for vectorized evaluation, and it is all efficiently implemented in Cython. Storing, say, 1000 interpolator callables in a list and then somehow navigating through them, evaluating them one by one, and composing the final result --- that looks like quite a poor and inefficient approach.

So I think both (1) and (2) from your message are very real and need to be avoided.

aarchiba commented 8 years ago

You have to have some ability to make an interpolator between the last two states in order to do event detection.

I think it may help to think of the solver state as associated with a (successful) step. So the solver state should contain the starting and ending values of t and y, any working state needed for the solver to continue, and any information required to interpolate within the current step.

This is also helpful because, unlike splines, dense-output objects need no global information (e.g. continuity constraint solving); they are constructed from the solver's current step.
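
In code, a per-step state along these lines might look like the following minimal sketch (the field names are illustrative):

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class StepState:
    # Everything belongs to one (successful) step.
    t_old: float                                  # start of the step
    t: float                                      # end of the step
    y: np.ndarray                                 # solution at t
    dense_output: Callable[[float], np.ndarray]   # interpolant valid on [t_old, t]
```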

I believe that our best bet is PPoly, because it provides efficient evaluation and I would argue that conceptually it is the universal approach, even though representing interpolator in the local power basis may be suboptimal for numerical evaluation. I mean that PPoly is fast, tested and already available, whereas writing new "local interpolator stacker" object will take time and won't be efficient if we are going to assume that local interpolators are some arbitrary functions.

I believe that I originally wrote PPoly. There is no magic in it. It operates in two steps: it uses np.searchsorted (or its own compiled binary search) to determine which piece each input point is in, then it evaluates the relevant polynomials there. If we were to write a replacement (I have), it would use the same first step to locate the right solver step, then use the solver-provided dense output object to evaluate the solution where needed. This is an easy piece of code.
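
A minimal sketch of such a replacement, under the assumption that each step provides a dense-output callable (the class is illustrative, not an existing scipy API):

```python
import numpy as np

class PiecewiseSolution:
    def __init__(self, ts, interpolants):
        # ts: step boundaries t_0 < t_1 < ... < t_m (length m + 1)
        # interpolants: one callable per step, valid on [t_i, t_{i+1}]
        self.ts = np.asarray(ts)
        self.interpolants = interpolants

    def __call__(self, t):
        # Binary search for the step containing t, then delegate to it.
        i = np.searchsorted(self.ts, t, side="right") - 1
        i = min(max(i, 0), len(self.interpolants) - 1)
        return self.interpolants[i](t)
```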

The power basis will generally destroy all numerical accuracy by degree about 20. Bulirsch-Stoer integrators routinely produce dense-output polynomials of order up to about 30, specified by values and derivatives at the endpoints, and values and many derivatives at the middle. This can be evaluated directly to sufficient accuracy using KroghInterpolator, but conversion to the power basis will wreck any kind of error guarantee. So please, let's not build that restriction into our interface.

drhagen commented 8 years ago

I believe that I originally wrote PPoly.

Well, this is convenient.

There is no magic in it. It operates in two steps: it uses np.searchsorted (or its own compiled binary search) to determine which piece each input point is in, then it evaluates the relevant polynomials there. If we were to write a replacement (I have), it would use the same first step to locate the right solver step, then use the solver-provided dense output object to evaluate the solution where needed. This is an easy piece of code.

This is what I suspected. If we had a list of step interpolators (each of which would be a simple object with a __call__ method), we would use searchsorted to find the right piecewise section and then __call__ it to evaluate the polynomial at the given time. This is the same thing that PPoly does internally, so the algorithmic cost is identical. The questions that remain are: what is the overhead of constructing a Python object for each step, and how would Cython handle the search-and-call code? I don't know whether that would be slow or fast.

I think we should move forward with the cleanest solution and come back to the drawing board if it turns out to be slow. I am not 100% convinced that option (6) is the most elegant solution, even if it turns out to be fast. There is something I don't like about the solver having to save past states. It's like I want the current state of the solver to be the only state that's part of the solver.

You have to have some ability to make an interpolator between the last two states in order to do event detection.

I'm arguing that interpolation between states doesn't always make sense in practice. As I said before --- the state of BDF is already an interpolator, so we don't need two states to interpolate. It leads me to believe that the interpolator itself should be provided by the solver, not the states, because the interpolation mechanism differs for different solvers.

In my head anyway, the interpolator is always defined by the solver. No matter what, the interpolator will have to be a separate class from the solver class. For the solvers now, that separate class is PPoly. Whether an instance of this class is returned at each step or the states are all collected and passed to a method together to get a single instance of this class is, I think, what we are debating.

aarchiba commented 8 years ago

Maybe it would help to have a survey of how dense output objects are constructed by solver algorithms. I'll list the ones I know about:

Are there other major approaches to dense output?

Computing values and derivatives of the Hermite interpolating polynomial with the best possible numerical stability is done directly from a tableau of function and derivative values, as is done in KroghInterpolator. (Which is why I wrote it.)
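
For instance (a minimal sketch with made-up step data), a cubic Hermite dense-output polynomial for a single step can be built directly from endpoint values and derivatives, without ever converting to the power basis:

```python
import numpy as np
from scipy.interpolate import KroghInterpolator

# Hypothetical data from one accepted step of a solver for y' = -y:
t0, t1 = 0.0, 0.1
y0, y1 = np.array([1.0]), np.array([np.exp(-0.1)])
f0, f1 = -y0, -y1                      # y' = f(t, y) at the endpoints

# Repeating an abscissa tells KroghInterpolator that the next row is a
# derivative at that point, which yields the cubic Hermite interpolant.
dense = KroghInterpolator([t0, t0, t1, t1], np.vstack([y0, f0, y1, f1]))
print(dense(0.05))
```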

Collocation methods necessarily build the polynomial construction into the solver algorithm, so the polynomial representation to use is determined by that algorithm. Evaluating polynomials and their derivatives in the power basis is pretty straightforward, while for orthogonal polynomials I think you'd do it via Clenshaw's method.

The point of this is that in no case is there a need for state from multiple steps; there is always a natural collection of information from the current step that is sufficient to construct the dense output object.

nmayorov commented 7 years ago

@aarchiba --- I think that covers it really well.

Regarding the Krogh interpolator --- is it known to be more accurate / stable for numerical evaluation than converting to the PPoly basis and evaluating? I didn't find any mention of it in the paper, and the docstring notes that enhanced numerical stability is not guaranteed.

So all cases are in fact polynomials determined by values and derivatives, which means that using the Krogh interpolator or converting to PPoly are the only options I can think of. I would say that Krogh looks very promising, especially if it is more numerically stable.

aarchiba commented 7 years ago

Regarding the Krogh interpolator --- is it known to be more accurate / stable for numerical evaluation than converting to the PPoly basis and evaluating? I didn't find any mention of it in the paper, and the docstring notes that enhanced numerical stability is not guaranteed.

Yes, absolutely KroghInterpolator is more stable than converting to the power basis. Converting bases of polynomials is always a numerically unstable process, and the instability manifests shockingly fast. I put that comment in the docstring because the algorithm isn't as stable as that used in BarycentricInterpolator; unfortunately BarycentricInterpolator can neither handle derivative constraints nor evaluate derivatives.

I agree that KroghInterpolator is the answer in many cases, but collocation methods work the polynomial construction intimately into their algorithm, so they might produce polynomials in the power basis just because that's how the implementation works. (I still think representing them in a basis of orthogonal polynomials makes sense but I've never seen a reference to anyone doing that.) That said, it's not clear to me that collocation methods necessarily produce a polynomial approximation of the right order - the Gauss symplectic integrator I was looking at is of order 2s but the collocation polynomial is only of order s, so it's quite inefficient to achieve accuracy requirements on the dense output that way.

All that said, I still think it's important to be able to support existing codes as much as we can, and while these may in fact be producing polynomials as dense output, the only access to these polynomials may be through the code's dense output routine. So I think it's valuable to permit "black box" dense-output routines; as long as they support the same interface, the objects that link many together to form a solution object can treat them interchangeably.

Anyway, the point of doing this here is to argue that if we think of a Solver's state as associated with a particular step, then there's not really any need to use multiple states to construct dense output objects; even for multistep methods, there isn't really any possibility to reuse data from previous steps.

drhagen commented 7 years ago

So that we have a point of reference, I went ahead and implemented option (6); it is available in the main branch. If we decide that we want to compare it with other designs, we can start making branches. It required quite a bit of change, but I think the design is good. It is a very clean interface. Also, I think it will be fast, even faster than the previous design, which had to rebuild the interpolators both for events and for the final solution. This design builds the interpolator only once. There is still quite a bit of room for improvement, but I didn't want to spend too much time nitpicking until we were confident that providing an interpolator for the last step was the solution that we wanted.

Let me know what you think.

nmayorov commented 7 years ago

Hey David, I actually updated the code in the scipy PR. Surprisingly (or not), I chose the same approach, without any abstraction for an ODE state. And generally I feel I followed most of the things we decided here, but not all of them: for example, I didn't find a good reason to separate discrete / continuous solving, and I also didn't add support for evaluation at extra points yet. I'm really not sure about those.

How do you feel about continuing the work in the scipy PR? I think it will lead to the final code faster, and more people will see it. You can commit to my branch if you want to; quite a lot of technical work remains. It would be good if you could review the updated state.

drhagen commented 7 years ago

Wow, we did end up with a very similar solution. I like your design of moving the common code to step etc. and having it call another abstract method. There are a bunch of tests that need to be merged in, as well as the corner cases that they test. There are also some other things, but I'll file those as separate issues when I get a better understanding of your code.

In any case, it seems that we agree that having a method that steps and a method that returns a callable for the last step is the solution we should pursue for the common interface of the solvers. I'll mark this issue as closed.
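
For reference, a minimal sketch of how an integrator could drive such an interface (the attribute and method names mirror the hypothetical sketches above, not the final API):

```python
import numpy as np

def integrate(solver, t_end):
    # solver is any object exposing .t, .step(), and .interpolator() for the
    # step just taken.  Collect the per-step callables; they can then be
    # stitched into a solution object as in the PiecewiseSolution sketch.
    ts = [solver.t]
    interpolants = []
    while solver.t < t_end:
        solver.step()
        ts.append(solver.t)
        interpolants.append(solver.interpolator())
    return np.asarray(ts), interpolants
```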