Proposal: don't re-compute the vector field and map twice when using ForwardDiff

JuliaDynamics / DynamicalSystemsBase.jl

Definition of dynamical systems and integrators for DynamicalSystems.jl

Other

53 stars 27 forks source link

Proposal: don't re-compute the vector field and map twice when using ForwardDiff #17

Closed tkf closed 6 years ago

tkf commented 6 years ago

Reading how ContinuousDS(prob::ODEProblem) w/o Jacobian:

# Constructors without Jacobian:
function ContinuousDS(prob::ODEProblem)
    state = prob.u0
    eom! = prob.f

    D = length(state); T = eltype(state)
    du = copy(state)
    J = zeros(T, D, D)

    jeom! = (du, u) -> eom!(0, u, du)
    jcf = ForwardDiff.JacobianConfig(jeom!, du, state)
    ForwardDiff_jacob! = (t, u, J) -> ForwardDiff.jacobian!(
    J, jeom!, du, u, jcf)
    ForwardDiff_jacob!(0, state, J)

    return ContinuousDS(prob, ForwardDiff_jacob!, J)
end

and variational_integrator:

function variational_integrator(ds::ContinuousDS, k::Int, T,
    S::AbstractMatrix; diff_eq_kwargs = DEFAULT_DIFFEQ_KWARGS)

    f! = ds.prob.f
    jac! = ds.jacob!
    J = ds.J
    # ... chop ....
    veom! = (t, u, du) -> begin
        us = view(u, :, 1)
        f!(t, us, view(du, :, 1))
        jac!(t, us, J)
        A_mul_B!(view(du, :, 2:k+1), J, view(u, :, 2:k+1))
    end
    # ... chop ....
    return vintegrator
end

are defined, I started worry that this implementation may be inefficient, since you call eom! (prob.f) twice. Discrete system also has the same problem. I haven't done any benchmark on this, though.

Here's an implementation for not calling eom! twice (as you know :wink: ): https://gist.github.com/tkf/014787f386e99b7b8b42683cfbd8da01#file-lyapunovexponentswithforwarddiff-jl-L27

The implementation itself is not hard. The main problem is that you need to change the API. That is to say, you need to store the pair eom! and (say) "eom_and_jacob!" rather than eom! and jacob!. Of course, it is easy to make a backward-compatible layer by calling eom! and jacob! from the default "eom_and_jacob!". This also helps user to provide a highly efficient implementation.

(A bit tangential notes: I know that the name eom_and_jacob! is not cool. Not sure what's the best name. tangent_dynamics!? coeom! (co-eom)? teom! (tangent-eom)? veom! (variational-)? peom! (paired-)?)

I can do the fix but I thought I'd ask you first, since this is going to change the API (and there is a chance that you have done benchmarks and know that the performance benefit is negligible).

Datseris commented 6 years ago

Hello.

I must say i did not immediatelly "get it" ...

Where is the eom computed twice? In the variational integrator, where it is computed once for the f!(t, us, view(du, :, 1)) and once inside the Jacobian (if it is forward-differentiated?).

Alright, regardless, this is of course a great suggestion. I would go with tangent_dynamics!, to "err on the side of clarity". The Jacobian function itself should be stored regardless, because it must be part of the system so people can call it for other reasons. We can just pass around the extra function in e.g. lyapunovs, after it is created, no need to store it in the ContinuousDS type. Or am I missing something?

(This means that there is no API change, only internal changes)

Once again, please wait until https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/18 is over to not do extra effort :)

tkf commented 6 years ago

I mean, veom! in variational_integrator calls:

f! (== ds.prob.f == eom!) (first time)
jac! (== ds.jacob! == ForwardDiff_jacob!) which calls jeom! which, in turn, calls eom! (second time)

i.e., eom! is called twice. Is it correct?

If so, it's better to skip the first call and use the result from ForwardDiff.jacobian!. You are allocating the output of eom! as du in ContinuousDS but not referencing it from anywhere else. If you can somehow use it in veom! (my proposal here), you don't need to call the original phase space eom!.

Datseris commented 6 years ago

Yes you are absolutely correct!

If you did a PR about this I would gladly merge. But its best to wait to do it after the DiffEq interface changes , probably next weekend. Because this part will also be affected .

On Jan 22, 2018 10:37 AM, "Takafumi Arakaki" notifications@github.com wrote:

I mean, veom! in variational_integrator calls:

f! (== ds.prob.f == eom!) (first time)

jac! (== ds.jacob! == ForwardDiff_jacob!) which calls jeom! which, in turn, calls eom! (second time)

i.e., eom! is called twice. Is it correct?

If so, it's better to skip the first call and use the result from ForwardDiff.jacobian!. You are allocating the output of eom! as du in ContinuousDS but not referencing it from anywhere else. If you can somehow use it in veom! (my proposal here), you don't need to call the original phase space eom!.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359370307, or mute the thread https://github.com/notifications/unsubscribe-auth/ASwgYQn1I2afa9NvZi0_QBFs6JIpTwDYks5tNFbjgaJpZM4RkS3U .

tkf commented 6 years ago

Actually, not sure I want to do a PR anymore for this, since I've already made up my Lyapunov exponents library. My initial motivation was to use DynamicalSystemsBase.jl if it provides a nice abstraction and efficient implementation. But at this point I thought independent evolution of the libraries would be easier.

Of course, I can comment on the design process if you want. I also think designing a new tangent space API and #18 better be done together, since both would cause breaking changes. (And maybe the coding better be done at once?)

As just a quick comment, I can think of those design axes in the API design:

In the DS constructor, accept only jacob!? Or only tangent_dynamics!? Or both? (I guess having both makes sense, if you want a full generality, because LE calculation only needs tangent_dynamics! but some ODE integrator wants jacob!.) BTW, why not call it jacobian! to be compatible with {Forward,Reverse}Diff.jl?
Different DS struct type for having or not having jacob! and/or tangent_dynamics!? Or use the same DS struct and check if the field exist?
Store generated jacob! and/or tangent_dynamics!? Or make them only accessible via methods? (The latter may have a performance benefit but not sure if recent Julia still has performance penalty for closures.)

tkf commented 6 years ago

Another axis is whether or not to store the tangent/phase dynamics as the ODEProblem and DiscreteProblem. In some sense, having only tangent_dynamics! function is asymmetric, because you don't have the initial value of the tangent state.

Datseris commented 6 years ago

Yes definitely developing 2 libraries independently is easier, but only for you , the developer. For the user it is much harder to have many libraries that do the same / similar thing.

In my eyes, as a user, this is one of the biggest problems of the Julia package system at the moment. 10s of packages that do the same or very similar things. Of course you can say arguments like "I did it because I liked it" or "for practice" etc. Which is absolutely fine and understandable. But, why doesn't anyone try to work together and make 1 big and awesome thing instead of small similar thungs? It always makes me sceptical...

This also the reason I made my packages an organization; so that they are not mine anymore and people would be more inclined to contribute and improve to something existing already.

Aaaanyway: here is my take on it to reuse code and make everything better: you can put (if you want to) this package in the organization. Then , continuous systems are bound to the ODE problem so we change the lyapunkv functions to use yours and make an interface between. This will be trivial. We can keep my small implementation as an example or as the case without extra options.

About the discrete systems, what is the benefit on using the DiffEq ones? I did not know that they existed when I started these packages so I made my own discrete systems which I guess are a just as fast (if not fatser due to less fields and structs?)

On Jan 23, 2018 3:26 AM, "Takafumi Arakaki" notifications@github.com wrote:

Another axis is whether or not to store the tangent/phase dynamics as the ODEProblem and DiscreteProblem. In some sense, having only tangent_dynamics! function is asymmetric, because you don't have the initial value of the tangent state.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359652668, or mute the thread https://github.com/notifications/unsubscribe-auth/ASwgYQO6VvfFcfC04GVotfsS2Bffjm3lks5tNUNlgaJpZM4RkS3U .

tkf commented 6 years ago

In my eyes, as a user, this is one of the biggest problems of the Julia package system at the moment. 10s of packages that do the same or very similar things.

I understand this point and I thought so a while ago too. But involving in/observing other OSS communities (Python, Emacs Lisp, etc.), I think a bit of diversity of packages always exist. As you said, OSS is mostly developer-driven and there are many for-fun projects. This maybe is a good sign that the OSS community is healthy.

Aaaanyway: here is my take on it to reuse code and make everything better: you can put (if you want to) this package in the organization. Then , continuous systems are bound to the ODE problem so we change the lyapunkv functions to use yours and make an interface between. This will be trivial.

I don't think we need to rush. My implementation at this point doesn't provide more algorithm than yours. There's just one Lyapunov spectrum calculation. I also am not prefect sure about my API. It's at very early stage and I may change it at some point. But I'm not stopping you to do that (I'm just not recommending it).

As for adding LyapunovExponents.jl to JuliaDynamics, I'm not sure. I'll be OK with it as long as using DynamicalSystemsBase.jl is not the hard requirement of being in JuliaDynamics. I'm fine with making DynamicalSystemsBase.jl a soft requirement though. It's a good idea that the famous systems provided by DynamicalSystemsBase are usable everywhere.

The main reason why I don't want to use DynamicalSystemsBase.jl, at least at this point, is that because DifferentialEquations.jl already provides a very flexible container type for dynamical system definitions. I need DifferentialEquations.jl anyway so why do I want to increase dependencies? Also, DifferentialEquations.jl is heavily developed and settled in a very flexible API.

Datseris commented 6 years ago

The main reason why I don't want to use DynamicalSystemsBase.jl, at least at this point, is that because DifferentialEquations.jl already provides a very flexible container type for dynamical system definitions. I need DifferentialEquations.jl anyway so why do I want to increase dependencies?

Alright, there is a limit for how much reinventing of the wheel can be done and I can say that it is "fine" and I think I have already wasted enough time, so this will be my last post on the conversation.

Well, here is the thing: the only dependencies of DynamicalSystemsBase that DO NOT exist in DifferentialEquations is NearestNeighbors. So the "extra" dependencies argument is completely unreasonable, at least not in a logical conversation about actually helping the community.

Secondly, what do you think a ContinuousDS is? It is just a bundle of the ODEProblem (that you already use) and the jacobian of the vector field (that you already use). This is exactly what you need to calculate lyapunov exponents, nothing more (and since I also know how to compute lyapunov exponents, nobody can convince me otherwise). So the ContinuousDS is actually what you already use, nothing more, nothing new.

But here is the extra part: ContinuousDS is not only extremely well tested and well documented, not only has 10s (if not more I don't even remember how many they are) famous systems, but integrates perfectly with DifferentialEquations, ChaosTools, and all future additions to this ecosystem. In addition, it has a different (and much more efficient method) on computing the single largest lyapunov exponent (I hope you dont want to re-write this as well?).

How amazing is it that you do not have to find/create/search for a different package to calculate lyapunov exponents and attractor dimensions, which are so closely related? As a scientist, this makes me as happy as anything else in the world, which is related to nonlinear dynamics. So why not use it this DynamicalSystemsBase code? When there is something that already provides the core needs of your implementation and also offers so many great benefits? You have made all these complex types and structures when you could write so much less by using something that already exists! You could still do this in a separate package with your name, and there would be no difference for me, since I care about helping science.

I am sorry but at the moment I cannot find any argument on why to re-write almost everything from scratch and not re-use nothing, even basic stuff. The only reason I can think is personal benefit instead of community benefit (having fun also counts as just personal benefit). This is of course acceptable, I do not judge it as a bad thing. But it is not the dream I had in mind when starting to work on this open source project ( -> amazing companion for all scientists of the field).

Therefore you can continue working on your package, but there is no reason to discuss anything further here. Please only reply to the comments if you think you have suggestions on how to solve issues from now on, and I promise I will also do the same. If you want to discuss this further please use our gitter channel, which is more fitting for these kind of discussions.

Finally, let me make this perfectly clear for users that may want to contribute on the present issue:

This issue only affects systems where the Jacobian function is computed automatically using ForwardDiff. It has no bearing whatsoever on systems where users give the Jacobian function

Of course, if you care about performance (which this issue is all about), you give your own Jacobian anyways, making this issue an extremely very low priority.

tkf commented 6 years ago

I was talking about why I'm not using DynamicalSystemsBase as internal data representation in my package. I'm not opposed to the idea of DynamicalSystemsBase.jl and perfectly fine for providing API to accept those types. If you read my comment above, I said:

I'm fine with making DynamicalSystemsBase.jl a soft requirement though. It's a good idea that the famous systems provided by DynamicalSystemsBase are usable everywhere.

So, end-users won't notice and they don't care. Aren't you happy with that? I also said I'm not stopping you to use LyapunovExponents.jl as an external dependency. I was just saying the API is not stable at this point and I didn't want you to do extra work because of my API changes. Anyway, if you care about UX design and think that providing a meta package for single entry point would help it, just go for it. Having multiple libraries as backends is not incompatible with that.

Re: your comments

I am sorry but at the moment I cannot find any argument on why to re-write almost everything from scratch and not re-use nothing, even basic stuff.

Because the basic stuff (= DynamicalSystemsBase.jl) was not good enough. You said:

Of course, if you care about performance (which this issue is all about), you give your own Jacobian anyways, making this issue an extremely very low priority.

Well, if you care maximizing the performance, you should provide an API to compute phase space and tangent space dynamics altogether in one function. I think, for example, for a moderately large system (namely, if its parameter exceeds the CPU cache), computing them separately can slower than doing it together (in theory). Actually I should do benchmarks to say anything about performance. But with DynamicalSystemsBase.jl API I can't try all possibilities and compare them.

If you are throwing away opportunities for easily earnable performance benefit (like computing eom! twice, which is the main topic here), that's bad for end-users too. Don't you want to provide blazingly-fast default implementation that no one care to optimize?

I already wrote some comments to improve DynamicalSystemsBase.jl API because I do care about integrating people's effort. Why don't you respond to those technical comments (and stop doing politics :wink: ), if you care about community benefit?

Datseris commented 6 years ago

I am sorry if this was not clear from my comments: the suggestion to not compute the eom! twice is absolutely brilliant and a great benefit. There is no doubt about that nor do I doubt it.

If a user provides their own Jacobian, there is no 2-times computation of the Jacobians. This the fastest implementation, as it doesn't use forward diff and everything is computed once. Clearly, no matter how magic ForwardDiff is, not using automatic differentiation is faster. If you think otherwise, please show me where this "twice" computation happens in source code without ForwardDiff and I will immediately open an issue for it.

Well, if you care maximizing the performance, you should provide an API to compute phase space and tangent space dynamics altogether in one function.

This makes absolutely no difference in performance if you do not use ForwardDiff. If you do use FowardDiff then you are perfectly right. This is what the current issue is about so let's not repeat it many more times, I already know it :P

Don't you want to provide blazingly-fast default implementation that no one care to optimize?

Of course, which is why this issue is still open and not closed. This is a welcomed addition, noone said otherwise. I am sorry if this was deduced from my comments, it is certainly not the case.

I already wrote some comments to improve DynamicalSystemsBase.jl API because I do care about integrating people's effort. Why don't you respond to those technical comments (and stop doing politics 😉 ), if you care about community benefit?

I did not see any such comments, I only saw a link that showed where this is done: https://gist.github.com/tkf/014787f386e99b7b8b42683cfbd8da01#file-lyapunovexponentswithforwarddiff-jl-L27

there isn't anything to comment there. it is just that somebody has to do it. That is why I did not comment. Of course the link is appreciated. Maybe I should have commented "thanks for the link", to make it clear that it can be used, I realize that now. For all other suggestions I have commented in great detail.

tkf commented 6 years ago

Don't you think cache locality problem matters here? Have you ever heard of matrix multiplication being faster/slower when you do it row/column wise? (It depends on language actually. Optimal access is different in Julia/Fortran/etc. and C/Python/etc.) This is one instance of the cache locality problem. I suspect it might happen in the phase/tangent-space evolution.

Because of the cache locality (and also the reason in the next paragraph), I have some faith in ForwardDiff.jl. It may beat naive user's code if we use it wisely. But I need some more time to try things out.

(Actually, I should have mentioned that you don't even need to compute Jacobian. This is much more important. It should be possible to directly propagate the tangent vector. It would be a big win since you don't need to store the Jacobian. I'll try demonstrating that somewhere.)

BTW, the second half of this was another comment on API design I was talking about: https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359648873

Datseris commented 6 years ago

Don't you think cache locality problem matters here? Have you ever heard of matrix multiplication being faster/slower when you do it row/column wise? (It depends on language actually. Optimal access is different in Julia/Fortran/etc. and C/Python/etc.) This is one instance of the cache locality problem. I suspect it might happen in the phase/tangent-space evolution.

I have heard the problem and you are right. Unfortunately my knowledge is mainly on nonlinear dynamics and chaos, not computer optimization, and thus I cannot do something about it :( I can only hope that somebody else wants to contribute, which is the only way that something truly great can happen.

We do use the "correct" way and use A_mul_B!. And we also do "explicit multiplications" columns first, which is the correct way for Julia. Something deeper than that goes beyond my knowledge level.

Actually, I should have mentioned that you don't even need to compute Jacobian. This is much more important. It should be possible to directly propagate the tangent vector.

Yes you are correct about this. I do not know how much actual benefit it will have though, because it may be really small. But it also may be really big. You are welcome to open an issue about this point, which I also feel like it is a great addition, and a separate suggestion from the current issue, which at many points has gone way off-topic.

The Jacobian function by itself is still useful as the present ecosystem is about chaotic dynamics in general not just lyapunov exponents. But regardless, the implementation of propagation of deviation vectors can be made much better, and this would also help the gali method!

Alright, regarding replies to suggestions you linked, number 2 is a no-go and number 1 and 3 seem to be the same in my eyes, seems like I may not understand the difference. As for why the name is jacob! instead of jacobian!, well because unfortunately I have developed everything by myself so far so there wasn't anyone to help with smarter name choices, which made me just not really care about names.

There is no reason to not store the bundle tangent_dynamics! in the structs. I think this is the way to go as I cannot find an argument against it.

(PS: I Though I have already said in https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359159409 that this was the way to go. Maybe it was not clear enough?)

tkf commented 6 years ago

number 2 is a no-go and number 1 and 3 seem to be the same in my eyes

Each number was meant to be one axis for multiple choices. I had hard time understanding the statement "number 2 is a no-go" because your first comment (https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359159409) is one of the choice presented in the axis 2 (kind of). I thought it's better to list out other possibilities, consider their pros and cons, and decide. That's why I tried to continue the discussion. If you want to make DynamicalSystemsBase.jl a core framework of dynamical systems research using Julia, I think such detailed considerations are necessary. I know it sounds a lot like bikeshedding, but it has to be done IMHO.

(BTW, if you want to continue API review including tangent_dynamics! but also including other stuff (I do have other nit-pickings :smile:), maybe it's a good idea to do that before #18.)

Anyway, let me clarify/restate the point about those axes. I was thinking that you might want to consider the following sets of choices:

Axis 1: DS constructor

Choice 1a: Optionally accept only Jacobian function
Choice 1b: Optionally accept only tangent dynamics function
Choice 1c: Optionally accept both Jacobian and tangent dynamics functions

To me, Choice 1a is a no-go because of the performance reason. I like Choice 1c, because of it is the most flexible interface.

Note that user can call Jacobian function even for Choice 2b if you choose Choice 3b. But I don't see any reason why to choose it except that coding is tedious.

OK. But you already said that

The Jacobian function itself should be stored regardless, because it must be part of the system so people can call it for other reasons.

in https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/17#issuecomment-359159409

I forgot that part. My bad. So Choice 1c, then!

Axis 2: DS struct design

(If Choice 3b below is taken, it's about how to represent missing Jacobian and tangent dynamics functions. If Choice 3a is taken, there is only one choice: 2a. I guess that's what you mean by "no-go"; i.e., 3b was no-go.)

Choice 2a: Use the same struct. Let them be undefined and use isdefined(ds, :jacob!) etc.
Choice 2b: Use the same struct. Let them be undefined but store the availability information in type parameter.
Choice 2c: Define different struct types.

I prefer Choice 2b or 2c. Choice 2b could buy us some performance benefit over 2a since we can eliminate isdefined calls.

Axis 3: Interface for calling Jacobian and tangent dynamics functions

Choice 3a: Store generated jacob! and tangent_dynamics! (current implementation).
Choice 3b: Provide methods to call them; e.g., something like tangent_dynamics!(..., ds) calls ds.tangent_dynamics! if it exists and otherwise generated version.

I prefer Choice 3b with combination of 2b or 2c, since it looks to me more idiomatic Julia solution.

Also, 3a may have some performance penalty if Julia has performance penalty for closures. I haven't looked it up this issue carefully though.

(Looking back, I should probably mention only Axis 3 first. But I was brainstorming in the order of execution.)

Datseris commented 6 years ago

Alright, thanks so much for all these detailed descriptions. Because a careful consideration is required for a worthy reply, you will have to wait until the weekend for that. I am sorry but at the moment I am extremely busy with other pressing matters, unrelated to Julia...

Two quick comments:

18 is already underway, for better or worse :P any change discussed here will happen after it. It does not matter though, since #18 truly is: "change (t, u, du) to (du,u,p,t) and make sure every single thing works".
There is no performance penalty for closures in Julia 0.6 (and later versions as well I guess).

tkf commented 6 years ago

I guess "before #18" was not I wanted to say. I meant "before releasing changes from #18". If you are breaking some API, why not break more :volcano:

And thanks for the info on closure! Let me know if you remember the URL of the performance info.

Thinking about this a bit more, maybe user-provided Jacobian function should have the calling signature jacobian_mul_B!(Y, B) which does Y = Jacobian * B. For tangent dynamics evolution, and probably in many other cases, you actually don't need to store the Jacobian in memory, as we discussed above; you just have to right-multiply it with some vector or matrix. With this API, if you need to store Jacobian in memory, just call it with B being an identity matrix. Of course, if there are many cases in which you actually need to store the Jacobian or if you need to left-multiply it (required for back-propagation in machine learning, aka reverse-mode auto-differentiation), this is not the ideal API. There is no such thing as a free lunch, I guess.

Datseris commented 6 years ago

And thanks for the info on closure! Let me know if you remember the URL of the performance info.

I think it is this one: https://github.com/JuliaLang/julia/blob/master/HISTORY.md#language-changes-1 and see the PRs that lead to it.

Once again, I am only quick-replying. Full replies need careful consideration over the weekend!

tkf commented 6 years ago

Sure. Please take your time. And thanks for the link!

I just found this issue about Jacobian in DifferentialEquations.jl. I think it's related:

This will let the user make use of a BandedMatrix from BandedMatrices.jl, or a sparse matrix, or even allows for matrix-free representations of the Jacobian. --- https://github.com/JuliaDiffEq/DifferentialEquations.jl/issues/220

(bold by me)

ChrisRackauckas commented 6 years ago

There's a lot going on here. What exactly do you need from your Jacobian?

That issue is for allowing the type of J to be chosen, so that it doesn't have to be a matrix. For example, it can be a lazy DiffEqOperator s.t. J*x is actually just a function call, etc. Is that what you need, or more?

tkf commented 6 years ago

Hi @ChrisRackauckas, thanks for joining in to the discussion! Yes, DiffEqOperators.jl sounds like the best option here. Probably accepting a function and passing it to update_func of the DiffEqArrayOperator constructor is good enough (for some UX gain). More advanced users can define their AbstractDiffEqOperators subtype.

Slightly related question: Is it possible for DifferentialEquations.jl to do some performance magic for linear ODE coupled with nonlinear ODE? Digging into this has been on my todo list ever since I've found out DiffEqOperators.jl. For Lyapunov exponents, currently we store the vector field of the nonlinear system in du[:, 1] (phase-space dynamics) and the linear system to du[:, 2:end] (tangent-space dynamics; i.e., du[:, 2:end] = Jacobian * u[:, 2:end]). The size of the linear system length(du[:, 2:end]) can be much bigger than the nonlinear one size(du)[1]. So, I wonder if there are some performance gain if we can tell DifferentialEquations.jl that du[:, 2:end] is linear and if it provides such interface.

ChrisRackauckas commented 6 years ago

Is it possible for DifferentialEquations.jl to do some performance magic for linear ODE coupled with nonlinear ODE?

You mean special integrators for those kinds of equations? Yes.

http://docs.juliadiffeq.org/latest/solvers/split_ode_solve.html#Semilinear-ODE-1

They were created for discretizations of PDEs, but should still apply here. Honestly, I don't think anyone has ever benchmarked these kinds of algorithms for this usage so it would be interesting. I'm not sure how helpful they would be though since they were created as an alternative to implicit solvers to handle the CFL stiffness of a PDE. But, who knows.

Also, the implementations are quite new. I can guarantee they work and converge with the right order, but until I do a blog post I have planned on pseudospectral PDE solvers, I'm not sure how they currently compare in efficiency. Also, without making the exp(dt*A) lazy they pretty much have to be fixed time step, and lazy exp(dt*A) only makes computational sense for large sparse A. So there's definitely a lot to play around with here and I'm not sure how it will turn out for non-stiff linear terms.

tkf commented 6 years ago

Cooool! Large ODEs constructed by coupling ODEs on some kind of lattice would have sparse Jacobian. So yes, having the performance boost even just for sparse linear systems would be great.

But, the document you sent me says:

The Semilinear ODE is a split ODEProblem with one linear operator and one function: du/dt = A u + f(t, u) where the first function is a constant (not time dependent) [...]

Can it work with the system where the first function is state and time-dependent? Like solving the system du/dt = A(t, u) u + f(t, u) or even more directly:

dx/dt = f(t, x)
dM/dt = A(t, x) M

ChrisRackauckas commented 6 years ago

The methods for semilinear ODEs have to assume the linear part is constant. I don't know of one that doesn't make that assumption, since otherwise it would have to be replaced with a quadrature rule that would break the assumptions of the method.

But you could make use of the more general IMEX form:

http://docs.juliadiffeq.org/latest/solvers/split_ode_solve.html#Implicit-Explicit-(IMEX)-ODE-1

where the Jacobian would then be A if that's the stiff (implicit) side.

Datseris commented 6 years ago

Oh man this is too much !

Disclaimer: I cannot do anything about the current issue, due to time constrains from my PhD.

(also, the current setup works perfectly fine and doing all this quite humongous changes brings small benefits)

Okay, here we go: @tkf

I guess "before #18" was not I wanted to say. I meant "before releasing changes from #18". If you are breaking some API, why not break more 🌋

Version 1 is not released yet so one can always introduce breaking changes. In addition, I feel like the discussion present here will be a month worth of work, and quite difficult as well, since this and ChaosTools are already becoming libraries of significant size.

Axis 1: DS constructor

Choice 1a: Optionally accept only Jacobian function
Choice 1b: Optionally accept only tangent dynamics function
Choice 1c: Optionally accept both Jacobian and tangent dynamics functions

Axis 3: Interface for calling Jacobian and tangent dynamics functions

Choice 3a: Store generated jacob! and tangent_dynamics! (current implementation).
Choice 3b: Provide methods to call them; e.g., something like tangent_dynamics!(..., ds) calls ds.tangent_dynamics! if it exists and otherwise generated version.

Option 3b seems reasonable, plus there is no limitation in adding more fields to the fundamental types. But this Axis depends strongly on if and how can somebody use all these methods that Chris mentioned.

Axis 2: DS struct design

(If Choice 3b below is taken, it's about how to represent missing Jacobian and tangent dynamics functions. If Choice 3a is taken, there is only one choice: 2a. I guess that's what you mean by "no-go"; i.e., 3b was no-go.)
Choice 2a: Use the same struct. Let them be undefined and use isdefined(ds, :jacob!) etc.
Choice 2b: Use the same struct. Let them be undefined but store the availability information in type parameter.
Choice 2c: Define different struct types.
I prefer Choice 2b or 2c. Choice 2b could buy us some performance benefit over 2a since we can eliminate isdefined calls.

I could not agree less with 2c. More structs is much, much, much harder to maintain. 2b seems the best option in my eyes. In fact, 2b already happens in some sense since the systems are parameterized on the function type.

@ChrisRackauckas

I think we are all a bit confused? (definitely me! :P )

But you could make use of the more general IMEX form: http://docs.juliadiffeq.org/latest/solvers/split_ode_solve.html#Implicit-Explicit-(IMEX)-ODE-1 where the Jacobian would then be A if that's the stiff (implicit) side.

As takafumi mentioned, the problem we are trying to solve is:

dx/dt = f(t, x)
dM/dt = A(t, x) * M

where the second equation is linear in the sense that A is a matrix.

In short, there are two equations that we want to evolve in parallel. The page you linked adds them using the plus operation, something that we cannot do here.

But maybe I can't see how to use it.

In short here is my full reply:

The current set-up we have is stable, works well, it is intuitive from a user perspective, it is intuitive from the maintainer's perspective. Also, it is already done, noone has to code something about it, it already exists :P
You cannot ask from the user to give something more than the equations of the Jacobian (so if tangent_dynamics the way it is discussed here happens, it must happen completely internally). EDIT: You can give the users the possibility to give tangent_dynamics, this is of course fine. But you must not have it as a requirement!
(Please correct me if I am wrong here) The only benefits coming from the current proposals are minor performance benefits. The high-level API wont be affected since, once again, you cannot ask the user to start creating tangent dynamics without the possibility of just giving a Jacobian. From mathematics and physics perspective you only need the vector field and the Jacobian.

For the above reasons, I cannot agree that the above is a good idea, with the currently presented benefits.

If however, e.g. @tkf still wants to try it I will gladly review and give helpful comments. This is in the the end an open sourced project!

Disclaimer: If someone tries it, then you have to do it with functions, not structs. It makes the code more readable, more maintanable, easy to be developed (functions can be re-written, structs not) and more extendable, as functions can have methods added to them via multiple dispatch. Maybe a single struct will be necessary, but more than that are absolutely unnecessary.

Disclaimer n2 : this is one of the very few proposals that I cannot guarantee 100% that will get merged.

The thing to really consider though before doing it is: "Is it actually worth it? Which are the benefits?" In my eyes this seems like an crazy amount of work for minuscule benefits. I may not have understood completely the benefits so @tkf you are welcome to write a short reply with the benefits you see.

Notice that both lyapunovs and gali are already seriously fast, and for "typical" values they take around a milisecond. For really long times they take around a second.

ChrisRackauckas commented 6 years ago

In short, there are two equations that we want to evolve in parallel. The page you linked adds them using the plus operation, something that we cannot do here.

Oh, that's a different kind of splitting that I haven't looked at in detail. But you can do it yourself with two separate integrators that are time step linked. I am looking into this more for the kind "stiff part vs non-stiff part" integration splitting, but hey if it works here too that's cool.

We do have some methods we are producing specifically for linear systems though. Basically you can re-write some Runge-Kutta methods onto matrix exponentials and you get something great for time/state-dependent linear systems. Tests for the lowest order ones already exist

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/test/linear_method_tests.jl

They aren't done yet. We need "better" ones before it's release worthy.

tkf commented 6 years ago

Thanks @Datseris, for clarifying what we need for ODE solver. I opened an issue elsewhere: https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/issues/250

... going back to the original discussion.

The thing to really consider though before doing it is: "Is it actually worth it? Which are the benefits?"

Well, you said you'd support tangent_space so at least you need to add a field to existing structs or add other structs to hold it, right?

If someone tries it, then you have to do it with functions, not structs.

Isn't current API contradicting to it? You are exposing fields as the public API: https://juliadynamics.github.io/DynamicalSystems.jl/latest/definition/continuous/

What you are saying is more compatible with my suggestion 3b and the reason why I preferred it is pretty much explained by you in the same paragraph. So, about this:

(Please correct me if I am wrong here) The only benefits coming from the current proposals are minor performance benefits. The high-level API wont be affected since, [...]

No. It also includes high-level API improvements. For example, with Choice 3b, you can provide methods (not fields) eom and eom! for both BigDiscreteDS and DiscreteDS. This let users to write single simpler code calling eom/eom! that works for any dynamical system type.

It also improves maintainability, as you pointed out. In fact, if you choose Choice 3b, Axis 2 is hidden from users and simply become the implementation details. This means that you can change one implementation to another without breaking users code (provided that they only use documented public APIs).

[...] once again, you cannot ask the user to start creating tangent dynamics without the possibility of just giving a Jacobian. From mathematics and physics perspective you only need the vector field and the Jacobian.

I never suggest to make tangent_space a mandatory argument to the dynamical system type constructor. In fact, Axis 3 was all about hiding that details from the users and provide a unified interface. Also, for researchers working on high-dimensional systems, I believe that efficient Jacobian-free API is mandatory.

Anyway, probably I should stop rambling about it. I was about to write down my thoughts on other perspectives of this library in another issue but it may not be very constructive. It seems that you are confident about the current design and probably this is not the best time if you are busy...

Datseris commented 6 years ago

Isn't current API contradicting to it? You are exposing fields as the public API: https://juliadynamics.github.io/DynamicalSystems.jl/latest/definition/continuous/

Absolutely not. Users define one, single struct that can then be passed to functions via the power of multiple dispatch. All internal code is done through functions, and there is a single struct for each system type, since it is not possible to have one magic struct for all system types.

I was referring to the internals, that they should be functions and not structs. For example, having 10 structs simply to do the transient evolution of a system is simply not maintainable.

No. It also includes high-level API improvements. For example, with Choice 3b, you can provide methods (not fields) eom and eom! for both BigDiscreteDS and DiscreteDS. This let users to write single simpler code calling eom/eom! that works for any dynamical system type.

This comes out of nowhere... Also, how is it related with the current topic? The current topic is about having a combined structure for the Jacobian and equations of motion, so that the Jacobian is not recomputed in the variation-vector methods like lyapunovs. Merging DiscreteDS and BigDiscreteDS really comes out of nowhere for me!

Then again, how can this actually happen? I would really appreciate an example, because I cannot just off the top of my head imagine how this works... A user has to give you the equations of motion. How are you suggesting that we can provide a method eom instead? I think I am completely lost :D

In fact, Axis 3 was all about hiding that details from the users and provide a unified interface. Also, for researchers working on high-dimensional systems, I believe that efficient Jacobian-free API is mandatory.

True, once again we agree on this point. Both statements are really nice, to a "hide" details using well defined functions and to also provide performant auto-Jacobian. As I said many times, the reason this issue is open is because I agree that solving it will be an improvement.

Anyway, probably I should stop rambling about it. I was about to write down my thoughts on other perspectives of this library in another issue but it may not be very constructive.

Woooo hold on there buddy! As you clearly saw from your recent PR, if you do something constructive that improves this ecosystem, it firstly gets as good as possible feedback and it gets merged smoothly, having support from the developers. In many cases other people would say "do this first" instead of just co-committing with you.

There is a problem with just saying stuff though: the person reading them has no idea how these could be implemented and how they could affect the rest of the code. Take for example the current issue. For 90% of your suggestions I have no clue how they will affect things, since there is no clarity on what do they mean code-wise. In addition, I have no clue how to actually implement them.

That is why I urge to give a code example. Just open a small PR?

It seems that you are confident about the current design and probably this is not the best time if you are busy...

Of course I am confident, and the reason is that I am backed with the /test folder of each repo. Why shouldn't I be? The claims I made, namely:

The current set-up we have is stable, works well, it is intuitive from a user perspective, it is intuitive from the maintainer's perspective. Also, it is already done, noone has to code something about it, it already exists :P

are all true. Can you disprove any of them? Please go ahead and try. You seem to constantly be misjudging my words, so let's hope this can make it clear to you: The reason the current issue is open is because I absolutely agree that it is an improvement.

What I stated clearly is that I think this issue is not worth my time. Not yours, mine. I think I was 100% clear about it. If you think it is worth it for you to solve, please go ahead, it will be appreciated! As I mentioned above, since you believe all this changes are so good, why not do a small PR that starts showing them off? How can I believe you if I don't even know how what you say can be done in code? A PR is just 1000 times more helpful than just writing essay long comments. They only invoke essay long replies like this one. If the changes are truly good (and truly understandable, because remember: clear source code) then they will get merged without a doubt. Your first PR already proved this.

If you want to discuss stuff, and say stuff, please use gitter, it is much, much, much more suited. These essay long comments, both yours and mine, are leading nowhere.

tkf commented 6 years ago

Users define one, single struct that can then be passed to functions via the power of multiple dispatch.

I think you are confused. By struct defined by (end-) user, you are talking about the parameter of the dynamical system, like the p parameter to ODEProblem, right? I'm not talking about that.

All internal code is done through functions, and there is a single struct for each system type, since it is not possible to have one magic struct for all system types.

I think you are including ChaosTools.jl in "internal". I guess that's another confusion. We are in the issue of DynamicalSystemsBase.jl. Any packages whose REQUIRE includes DynamicalSystemsBase are also the users (in my terminology anyway; we can call them downstream libraries or something, if that's too confusing for you). For example, here is one function from ChaosTools.jl:

function lyapunovs(ds::BigDiscreteDS, N::Real; Ttr::Real = 100)
    # ... snip ...
    for i in 1:N
        # ... snip ...
        ds.eom!(u, ds.dummystate, ds.p)
        ds.jacob!(J, u, ds.p)
        # ... snip ...
    end
    λ./N
end

I said "You are exposing fields as the public API" because the user (= ChaosTools.jl) is accessing the fields .eom! and .jacob!. If your public API is via function (as in Choice 3b), the code will be like this:

function lyapunovs(ds::BigDiscreteDS, N::Real; Ttr::Real = 100)
    # ... snip ...
    for i in 1:N
        # ... snip ...
        eom!(u, ds.dummystate, ds)
        jacob!(J, u, ds)
        # ... snip ...
    end
    λ./N
end

This API is much more robust, since you can modify the struct's internal structure while providing the old API (and maybe adding deprecated warning if you want to get rid of the old ones). (OK, now that getproperty and setproperty! are coming, this may not a big the issue in the future. But still, it's tricky to keep backward compatibility and performance.)

The above argument is about .eom! and .jacob! but the same is true for tangent_dynamics!.

(OK. To be complete, I also should point out that ds.dummystate is not going into the same direction as "don't use field as public API". Actually there are other more important reasons why having the temporary variable ds.dummystate as dynamical system struct is bad. But this is not the topic of this issue.)

I hope now it's clear why I started talk about eom! and eom. If you have function definitions like this:

eom!(u_new, u, ds::BigDiscreteDS) = ds.eom!(u_new, u, ds.p)
eom!(u_new, u, ds::DiscreteDS) = (u_new[:] = ds.eom(u_new, u, ds.p))

and expose them as public API, then the users of DynamicalSystemsBase.jl can define function like this:

function make_noisy_ts(ds, u0, steps)
    us = zeros(eltype(u0), length(u0), steps)
    us[:, 1] = u0
    @views for t in 2:steps
        eom!(us[:, t], us[:, t-1] + make_noise(), ds)
    end
    return us
end

which works for all dynamical system types. Of course, those eom variant can be useful too:

eom(u, ds::BigDiscreteDS) = ds.eom!(similar(u_new), u, ds.p)
eom(u, ds::DiscreteDS) = ds.eom(u_new, u, ds.p)

You can also do everything with defining all possible functions as closures and set them to the field (or use getproperty) but it does not scale. It's not very idiomatic Julia style because it can be done by multiple dispatch (as I'm arguing here). Also, I don't think setting closure to the property work well with help system (i.e., ?ds.jacob! does not show the help) at the moment, although maybe it'll be fixed at some point.

It seems that you are confident about the current design and probably this is not the best time if you are busy...

Of course I am confident, and the reason is that I am backed with the /test folder of each repo.

This also tells me that our communication was lossy. I was talking about API design and you argue back with a comment on implementation. I was never trying to say there are bugs.

I also don't understand you are recommending a PR. I'm trying to help you coming up with a good API design. Before API is settled, writing code can be a complete waste of time. But I agree that code examples are useful so I added some Julia(-ish) code.

I'll check out gitter.

Datseris commented 6 years ago

Oh man, this comment made soooooo many things much more clearer. That's why gitter is much more suited for these kind of discussions :D because conversation is much easier when there is interactive discussion!

I have to say that I find almost everything you said very nice ideas!

I hope now it's clear why I started talk about eom! and eom. If you have function definitions like this:
eom!(u_new, u, ds::BigDiscreteDS) = ds.eom!(u_new, u, ds.p)
eom!(u_new, u, ds::DiscreteDS) = (u_new[:] = ds.eom(u_new, u, ds.p))
and expose them as public API, then the users of DynamicalSystemsBase.jl can define function like this:
function make_noisy_ts(ds, u0, steps)
    us = zeros(eltype(u0), length(u0), steps)
    us[:, 1] = u0
    @views for t in 2:steps
        eom!(us[:, t], us[:, t-1] + make_noise(), ds)
    end
    return us
end

yeap seems valid. Of course when the decision is made for a change like this we have to just put a bit more thought to make sure everything will be very fine in the end, but I think all of the suggestions are good.

Well to my defence, I have coded everything by myself so far, so I did not had the advantage of a second perspective, which as is evident now can be very advantageous.

tkf commented 6 years ago

I'm glad that you get it. I should have posted examples earlier!

I'll open gitter when discussing complicated stuff but I live in the US west coast so I guess our timezone don't overlap much.

Datseris commented 6 years ago

Should we break this into smaller and simpler issues, and tackle one at a time?

Datseris commented 6 years ago

(OK. To be complete, I also should point out that ds.dummystate is not going into the same direction as "don't use field as public API". Actually there are other more important reasons why having the temporary variable ds.dummystate as dynamical system struct is bad. But this is not the topic of this issue.)

Yeah I also think it is bad, but I also cannot think of another way at the moment. If you can, you should open an issue and suggest a way. I don't like it but I can't do better either.

Datseris commented 6 years ago

I have managed to implement this in #23 commit 1d9af0e9e815b115ba30338d66adc0e64bfa1952 for discrete systems!!!!! Hooray.

@tkf I do not know how to do it for Continuous though... Care to share code that does it?

tkf commented 6 years ago

I define PhaseTangentDynamics here which works for both ODEProblem and DiscreteProblem: https://github.com/tkf/LyapunovExponents.jl/blob/master/src/coevolve.jl

Then I just do something like:

phase_prob :: DEProblem
phase_dynamics! = phase_prob.f
tangent_dynamics! = PhaseTangentDynamics(phase_dynamics!, phase_prob.u0)
u0 = ...
tangent_prob = ODEProblem(  # or DiscreteProblem
    get_tangent_dynamics(prob),
    u0,
    phase_prob.tspan,
    phase_prob.p,
)

(Some detail: phase_prob.u0 passed to PhaseTangentDynamics is used for just extracting type information for pre-allocate memory to store Jacobian matrix.)

PhaseTangentDynamics works only for out-of-place at the moment.

Feel free to copy coevolve.jl, if you need :) It's probably a good idea to maintain it (or similar utility) in DynamicalSystemsBase.jl so that people (including me) can just import it to their library.

Datseris commented 6 years ago

Ah alright, then it is identical to how I did it with the discrete system.

(are you aware that the way you define PhaseTangentDynamics has type instability issues?)

Datseris commented 6 years ago

The implementation that solves this is described here: https://github.com/JuliaDynamics/DynamicalSystemsBase.jl/issues/20

tkf commented 6 years ago

PhaseTangentDynamics has type instability

Ah, you mean because PhaseTangentDynamics.jacobian is Any? I forgot to put a type signature there.

JuliaDynamics / DynamicalSystemsBase.jl

Proposal: don't re-compute the vector field and map twice when using ForwardDiff #17

Axis 1: DS constructor

Axis 2: DS struct design

Axis 3: Interface for calling Jacobian and tangent dynamics functions

18 is already underway, for better or worse :P any change discussed here will happen after it. It does not matter though, since #18 truly is: "change (t, u, du) to (du,u,p,t) and make sure every single thing works".

18 is already underway, for better or worse :P any change discussed here will happen after it. It does not matter though, since #18 truly is: "change `(t, u, du)` to `(du,u,p,t)` and make sure every single thing works".