SciML / DiffEqParamEstim.jl

Easy scientific machine learning (SciML) parameter estimation with pre-built loss functions
https://docs.sciml.ai/DiffEqParamEstim/stable/

Making DiffEqParamEstim.jl more friendly for first time users #110

Closed (freemin7 closed this issue 1 year ago)

freemin7 commented 5 years ago

Hello

I was looking for something like parameter estimation and was misled by DiffEqFlux into thinking it was the primary way to do parameter fitting for ODEs. Frustrated by the lack of multiple-shooting support, I tried to convince DiffEqFlux to let me differentiate with respect to its initial conditions. I failed to do so even with excellent support (https://discourse.julialang.org/t/tracking-initial-condition-to-optimize-starting-value-too-diffeqflux/26596). DiffEqParamEstim seems to have built-in support for multiple shooting and none of the neural-ODE-specific machinery.

Would it be possible to provide a small self contained example how to use DiffEqParamEstim.jl?

It does not need to be a whole blog post like https://julialang.org/blog/2019/01/fluxdiffeq; in fact, I would find a lightly commented example more helpful, as I am unable to grasp the current documentation of DiffEqParamEstim.jl.

Thank you very much.

atrophiedbrain commented 5 years ago

I had good luck getting started by thoroughly reading this page.

I found this script very helpful as well, as it provides example calls to many local and global optimizers.

If you give an idea of the dimensions of your problem, or maybe a function definition, I'd be happy to help get you started.

freemin7 commented 5 years ago

My problem with that documentation was that I got stuck trying to understand how to build my own loss function. In the problem I was working on, I had good success with this core:

```julia
for i in 1:59
    c += (solution(i)[1] / real_data[i] - 1)^2  # L2 norm of the relative error
end
```

I used relative error because my solution spanned multiple orders of magnitude, and I worried about the disproportionate influence the larger values would have on an absolute-error loss. The documentation of the loss-function argument did not help, leaving me unsure whether I had to call an existing function or define my own `build_loss_objective` that satisfies all these criteria.
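For concreteness, here is a minimal sketch of what I would have expected to write, assuming the loss function receives the ODE solution object (as the docs' examples hint); the toy model, `real_data`, and all the numbers are placeholders:

```julia
using DiffEqParamEstim, OrdinaryDiffEq, Optim

# Placeholder observations at times 1..59; only the first state is observed.
real_data = [2.0 * exp(-0.05i) for i in 1:59]

# Custom loss: L2 norm of the relative error, as in the loop above.
# The solution object interpolates, so sol(i) evaluates at time t = i.
relative_l2(sol) = sum(i -> (sol(i)[1] / real_data[i] - 1)^2, 1:59)

decay(u, p, t) = -p[1] * u                  # toy model: du/dt = -a*u
prob = ODEProblem(decay, [2.0], (0.0, 60.0), [0.1])

obj = build_loss_objective(prob, Tsit5(), relative_l2; verbose = false)
res = Optim.optimize(p -> obj(p), [0.1], BFGS())  # should recover a ≈ 0.05
```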

Other points of further confusion were:

- the repeated restarting of the enumeration in the Multiple Shooting objective section
- how the differencing is implemented (symmetric vs. right-sided finite difference), and how it handles data at irregular intervals
- whether there are higher-order differencing methods
- what a regularization function does in this context (half a sentence would have helped)
- whether `prob_generator` makes it possible to fit both parameters and initial conditions
- "Here we used VectorOfArray from RecursiveArrayTools.jl to turn the result of an ODE into a matrix." -> Why?

On a positive note, the part about local vs. global solutions of the Lotka-Volterra problem was well written. I ran into local-vs-global issues while playing with the initial fluxdiffeq example. That motivated me to change the problem, which led to the thread above, once I saw that DiffEqFlux had no tooling for getting stuck in a local minimum and I am not familiar with global optimization.

ChrisRackauckas commented 5 years ago

Yes, these docs need a rewrite.

> the repeated restarting of the enumeration in the Multiple Shooting objective section

I don't get the question here.

> how the differencing is implemented (symmetric vs. right-sided finite difference), and how it handles data at irregular intervals

Just forward. It's just there to get a simple estimate of the first derivatives from the data.
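A hand-rolled illustration of that forward difference (not the library's internal code, just the idea; the names are placeholders):

```julia
# Estimate du/dt at each sample by a right-sided (forward) difference.
# `t` is the vector of sample times, `data[:, i]` the state at t[i];
# irregular spacing is handled by dividing by the actual step t[i+1] - t[i].
function forward_diff_estimate(t, data)
    n = length(t)
    du = similar(data, size(data, 1), n - 1)
    for i in 1:n-1
        du[:, i] = (data[:, i+1] .- data[:, i]) ./ (t[i+1] - t[i])
    end
    return du
end
```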

> whether there are higher-order differencing methods

No

> what a regularization function does in this context (half a sentence would have helped)

It's just a regularization term in the loss function, e.g. adding the L2 norm of the current parameters.
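Concretely, something like this (a sketch, not the package's built-in regularization helpers; it assumes a matrix `data` aligned with the solution and reads the current parameters off `sol.prob.p`):

```julia
λ = 1e-3  # regularization strength (illustrative)

# Data-fit term plus an L2 penalty on the current parameters,
# which are available on the solution object as sol.prob.p.
function regularized_loss(sol)
    fit = sum(abs2, Array(sol) .- data)   # plain L2 data misfit
    fit + λ * sum(abs2, sol.prob.p)       # discourage extreme parameter values
end
```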

> whether `prob_generator` makes it possible to fit both parameters and initial conditions

Yes: `prob_generator(prob, p) = remake(prob, u0 = p[1:10], p = p[11:15])`
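Spelled out a bit more (a sketch; the 10 states, 5 parameters, and `t`/`data` are placeholders): the optimizer's vector is split so its first block becomes `u0` and the rest the model parameters:

```julia
using DiffEqParamEstim, OrdinaryDiffEq

# Optimize over [u0; params]: the first 10 entries are initial conditions,
# the last 5 are the ODE parameters.
prob_generator(prob, p) = remake(prob, u0 = p[1:10], p = p[11:15])

obj = build_loss_objective(prob, Tsit5(), L2Loss(t, data);
                           prob_generator = prob_generator)
# The optimizer now searches a 15-dimensional space; each candidate vector
# is turned into a fresh problem by prob_generator before solving.
```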

> "Here we used VectorOfArray from RecursiveArrayTools.jl to turn the result of an ODE into a matrix." -> Why?

It serves as training data for which we know the true solution.
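That is, something along these lines (a sketch of generating synthetic training data, in the spirit of the docs' Lotka-Volterra example):

```julia
using OrdinaryDiffEq, RecursiveArrayTools

function lotka!(du, u, p, t)
    du[1] = p[1] * u[1] - p[2] * u[1] * u[2]
    du[2] = -p[3] * u[2] + p[4] * u[1] * u[2]
end

prob = ODEProblem(lotka!, [1.0, 1.0], (0.0, 10.0), [1.5, 1.0, 3.0, 1.0])
sol = solve(prob, Tsit5())

t = collect(range(0.0, 10.0, length = 200))
# Sample the true solution (plus noise) at the training times and
# stack the snapshots into a 2×200 matrix via VectorOfArray.
randomized = VectorOfArray([sol(ti) + 0.01randn(2) for ti in t])
data = convert(Array, randomized)
```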

freemin7 commented 5 years ago

> Yes, these docs need a rewrite.

That is good to hear.

> I don't get the question here.

> Multiple Shooting is generally used in Boundary Value Problems (BVP) and is more robust than the regular objective function used in these problems. It proceeds as follows:
>
> 1. Divide up the time span into short time periods and solve the equation with the current parameters, which here consist of both the parameters of the differential equations and also the initial values for the short time periods.
>
> 1. This objective additionally involves a discontinuity error term that imposes higher cost if the end of the solution of one time period doesn't match the beginning of the next one.
>
> 1. Merge the solutions from the shorter intervals and then calculate the loss.
>
> [...]

The enumeration starting over from 1. each time makes this section very confusing for me.
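The three steps are mechanical enough that a hand-rolled sketch may be clearer than the prose. This is not DiffEqParamEstim's multiple-shooting objective, just an illustration of the idea; all names are placeholders:

```julia
using OrdinaryDiffEq

# θ packs the ODE parameters first, then one initial state per segment.
# `segments` is a vector of (t0, t1) tuples covering the data span;
# `t` and `data` are the sample times and measurements; `w` weights the
# discontinuity penalty.
function multiple_shooting_loss(θ, prob, segments, t, data; nstates = 2, w = 100.0)
    nseg = length(segments)
    p    = θ[1:end - nseg * nstates]
    u0s  = reshape(θ[end - nseg * nstates + 1:end], nstates, nseg)

    loss = 0.0
    sols = Vector{Any}(undef, nseg)
    for (k, (t0, t1)) in enumerate(segments)
        # 1. solve each short time period from its own candidate initial state
        sols[k] = solve(remake(prob; u0 = u0s[:, k], p = p, tspan = (t0, t1)), Tsit5())
        # 3. accumulate the data misfit over the samples inside this segment
        for (i, ti) in enumerate(t)
            if t0 <= ti <= t1
                loss += sum(abs2, sols[k](ti) .- data[:, i])
            end
        end
    end
    # 2. discontinuity term: the end of one segment should meet the start of the next
    for k in 1:nseg-1
        loss += w * sum(abs2, sols[k].u[end] .- u0s[:, k+1])
    end
    return loss
end
```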

> Just forward. It's just there to get a simple estimate of the first derivatives from the data.

That should be a straightforward change to make. I will look into it.

> what a regularization function does in this context (half a sentence would have helped)

> It's just a regularization term in the loss function, e.g. adding the L2 norm of the current parameters.

Ah, and its motivation is to prevent "extreme values" in the parameter space. A sentence like that would help people who have not encountered this concept in optimization.

ChrisRackauckas commented 1 year ago

The library was revamped to use Optimization.jl, which makes it a lot more sane. New docs are coming. If there are any issues after the new docs, please file a new issue that's more specific.
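For anyone landing here now, the revamped workflow looks roughly like this (a sketch following the pattern in the current docs; keyword names and signatures may differ slightly between versions, and the toy model is a placeholder):

```julia
using DiffEqParamEstim, OrdinaryDiffEq, Optimization, OptimizationOptimJL

growth(u, p, t) = p[1] * u                 # toy model: du/dt = a*u, fit a
prob = ODEProblem(growth, [1.0], (0.0, 1.0), [1.0])

t = collect(range(0.0, 1.0, length = 20))
data = [exp(1.5ti) for ti in t]'           # 1×20 synthetic measurements (true a = 1.5)

# build_loss_objective now returns an OptimizationFunction for Optimization.jl.
obj = build_loss_objective(prob, Tsit5(), L2Loss(t, data),
                           Optimization.AutoForwardDiff();
                           maxiters = 10_000, verbose = false)
optprob = OptimizationProblem(obj, [1.0])  # initial guess a = 1.0
result = solve(optprob, BFGS())            # result.u should be ≈ [1.5]
```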