nansencenter / DAPPER

Data Assimilation with Python: a Package for Experimental Research
https://nansencenter.github.io/DAPPER
MIT License
341 stars 119 forks

Initial SDE Code Pull Request #76

Closed cgrudz closed 3 years ago

cgrudz commented 3 years ago

Hey Y'all,

Re #70, apologies for the delays on this; I am finally getting back around to merging some changes that have been sitting for a while. In this request, I have added the general Runge-Kutta SDE scheme as a fairly simple change to the existing version of the integration schemes in mods. This includes adding an "s" parameter for the instantaneous standard deviation of the noise (the diffusion coefficient), assuming additive noise, and rewriting the "order" parameter as a "stages" parameter. This has to do with the discrepancy between the number of stages and the order of the method for SDE models. I ran this against the test cases and found only an issue with a special case in bocquet19; otherwise the renaming seems to have propagated successfully.
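
For concreteness, here is a minimal sketch (not the actual DAPPER integrator; the function name and signature are illustrative) of the stages=1 case described above, i.e. an Euler-Maruyama step with additive-noise amplitude "s":

```python
import numpy as np

def em_step(f, x, t, dt, s=0.0, rng=None):
    """One Euler-Maruyama step for dx = f(t, x) dt + s dW (additive noise).

    With s == 0 this reduces to the deterministic Euler step, mirroring
    the stages=1 behaviour described above. Illustrative sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    xi = rng.standard_normal(x.shape)  # standard-normal Wiener increment / sqrt(dt)
    return x + dt * f(t, x) + s * np.sqrt(dt) * xi
```

With s=0 the noise term vanishes, so the same code path serves both the deterministic and the stochastic case.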

I am still working on the full setup of the L96s model, on which these changes depend to a certain extent, but I should have that model integrated in the coming weeks as the next stage. This will help me get more into DAPPER development now that I'm preparing my course materials consistently.

Cheers, Colin

cgrudz commented 3 years ago

Definitely want to add the example; the trick is that this requires having two distinct time-steppers for the ensemble and the truth twin, respectively. Is there an example where this is done elsewhere in DAPPER? I'll incorporate the additional suggestions below later today, along with a rough draft of the example and the edits I've been working on, and add it all to the pull request.

Otherwise: strong / weak convergence refers to convergence in path versus convergence in distribution, respectively.
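
To make the distinction concrete, here is a rough, hypothetical sketch (not part of DAPPER) that estimates the strong (pathwise) error of Euler-Maruyama on a scalar Ornstein-Uhlenbeck SDE; a weak-convergence check would instead compare statistics such as E[X_T]:

```python
import numpy as np

def strong_error(dt, T=1.0, s=0.5, n_paths=400, seed=0):
    """Monte-Carlo estimate of the strong (pathwise) error E|X_T - X_ref|
    for Euler-Maruyama on the OU SDE  dX = -X dt + s dW,  X_0 = 1.

    A finer Euler-Maruyama run (step dt/16) on the *same* Brownian path
    serves as the reference solution.
    """
    rng = np.random.default_rng(seed)
    refine = 16
    n = int(round(T / dt))
    # Brownian increments on the fine grid
    dW_fine = rng.standard_normal((n_paths, n * refine)) * np.sqrt(dt / refine)

    # Reference solution on the fine grid
    x_ref = np.ones(n_paths)
    for k in range(n * refine):
        x_ref = x_ref - x_ref * (dt / refine) + s * dW_fine[:, k]

    # Coarse Euler-Maruyama, driven by the summed increments of the same path
    dW = dW_fine.reshape(n_paths, n, refine).sum(axis=2)
    x = np.ones(n_paths)
    for k in range(n):
        x = x - x * dt + s * dW[:, k]

    return np.abs(x - x_ref).mean()
```

For additive noise the strong order is 1.0, so halving dt should roughly halve this error; for weak convergence one would compare |E[phi(X_T)] - E[phi(X_ref)]| for test functions phi instead of the pathwise difference.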

On 7/15/21 2:02 AM, Patrick N. Raanes wrote:

@.**** requested changes on this pull request.

Looks good, although I have marked/suggested/discussed some changes.

One thing that's missing is a script demonstrating these features, ideally reproducing something. Is this something you are thinking to add?


In .gitignore https://github.com/nansencenter/DAPPER/pull/76#discussion_r670241180:

+##############################
+### Vim ###
+##############################
+*.swp
+
+##############################

IMHO this belongs in `$HOME/.gitignore`


In dapper/mods/integration.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670255666:

  The rule has strong / weak convergence order 1.0 for generic SDEs and order 4.0
  convergence for ODEs when stages=4. For stages=1, this becomes the Euler-Maruyama
  scheme for SDEs (s > 0.0) with strong / weak convergence order 1.0 for SDEs with
  additive noise as defined in the below. See bib.grudzien2020numerical.

⬇️ Suggested change:

  This scheme has order-4 convergence for ODEs when stages=4.
  For SDEs with additive noise as defined below, it has strong / weak convergence
  of order 1.0 (for any value of stages). See bib.grudzien2020numerical.

Also, when is it strong, and when is it weak ?

Please also add a note saying that the stochastic integrator should not be used if the DA method has its own way of dealing with the noise, such as for Gaussian mixtures, particle filters, and the classic/extended KF.


In dapper/mods/L96s/__init__.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670258103:

+########################################################################################
+########################################################################################

Sorry, I want more style conformity. Please use a box-like approach

#######################
# This is a heading   #
#######################

For smaller sub-headings, use `# -------------------------`


In dapper/mods/L96s/__init__.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670265512:

+def dxdt_autonomous(x):
+    return (shift(x, 1)-shift(x, -2))*shift(x, -1) - x
+
+def dxdt(x):
+    return dxdt_autonomous(x) + Force
+
+def step(x0, t, dt):
+    return rk4(lambda t, x: dxdt(x), x0, np.nan, dt)
+
+########################################################################################
+# 2nd order strong taylor SDE step
+
+def l96s_tay2_step(x, t, dt):

Actually, maybe you do not need to create `dapper/mods/L96s` but rather just add a file to `dapper/mods/Lorenz96` that is called `stoch_integrator.py` (for example) containing `l96s_tay2_step` and nothing more?

Or am I missing something? Please discuss hereunder.


In dapper/mods/integration.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670268144:

  The diffusion coeffient for models with additive noise. Default: 0 for
  deterministic integration.

⬇️ Suggested change:

  The diffusion coefficient for models with additive noise.
  Default: 0, yielding deterministic integration.

In dapper/mods/integration.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670268917:

  The number of stages of the RK method. Default: 4. When stages=1, this becomes
  Euler / Euler-Maruyama.

⬇️ Suggested change:

  The number of stages of the RK method.
  When stages=1, this becomes the Euler (-Maruyama) scheme.
  Default: 4.

In dapper/mods/integration.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670269335:

Returns
-------
ndarray
    State vector at the new time, t+dt
"""

-    if order >=1: k1 = dt * f(t     , x)                   # noqa
-    if order >=2: k2 = dt * f(t+dt/2, x+k1/2)              # noqa
-    if order ==3: k3 = dt * f(t+dt  , x+k2*2-k1)           # noqa
-    if order ==4:                                          # noqa
-        k3 = dt * f(t+dt/2, x+k2/2)                        # noqa
-        k4 = dt * f(t+dt  , x+k3)                          # noqa
-    if   order ==1: return x + k1                          # noqa
-    elif order ==2: return x + k2                          # noqa
-    elif order ==3: return x + (k1 + 4*k2 + k3)/6          # noqa
-    elif order ==4: return x + (k1 + 2*(k2 + k3) + k4)/6   # noqa
-    else: raise NotImplementedError                        # noqa
+    if s > 0.0:

Please put the simplest case (`s == 0`) up top.
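
As a rough illustration of the requested reordering (a sketch only, not the actual DAPPER code), the deterministic `s == 0` branch could come first, with the stochastic update below it; the Euler-Maruyama line here stands in for the full stochastic Runge-Kutta update:

```python
import numpy as np

def rk4_sketch(f, x, t, dt, s=0.0, rng=np.random):
    """Sketch of the suggested restructuring (illustrative only, not the
    actual DAPPER rk4): the simplest, deterministic case comes first."""
    if s == 0.0:
        # Plain deterministic RK4 (the stages=4 case, for brevity)
        k1 = dt * f(t,        x)
        k2 = dt * f(t + dt/2, x + k1/2)
        k3 = dt * f(t + dt/2, x + k2/2)
        k4 = dt * f(t + dt,   x + k3)
        return x + (k1 + 2*(k2 + k3) + k4) / 6
    # Stochastic branch: Euler-Maruyama stands in here for the full
    # stochastic Runge-Kutta update with additive noise of amplitude s.
    xi = rng.standard_normal(np.shape(x))
    return x + dt * f(t, x) + s * np.sqrt(dt) * xi
```

Putting the `s == 0` early-return first keeps the common deterministic path easy to read, at the cost of a little duplication between the branches.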

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/nansencenter/DAPPER/pull/76#pullrequestreview-707060988, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABXIPFTFWWCLWFWJEU4CRDTTX2P4BANCNFSM5AMNRCYQ.

patnr commented 3 years ago

the trick is that this requires having two distinct time-steppers for the ensemble and the truth twin respectively. Is there any example where this is performed elsewhere in DAPPER?

I think this will do it (in the case of examples/basic_1.py):

@@ -15,6 +15,8 @@ import dapper.da_methods as da
 # #### Load experiment setup: the hidden Markov model (HMM)

 from dapper.mods.Lorenz63.sakov2012 import HMM
+HMM2 = HMM.copy()
+HMM2.step = my_custom_step_function

 # #### Generate the same random numbers each time this script is run

@@ -35,7 +37,7 @@ xp = da.EnKF('Sqrt', N=10, infl=1.02, rot=True)

 # #### Assimilate yy, knowing the HMM; xx is used to assess the performance

-xp.assimilate(HMM, xx, yy, liveplots=not nb)
+xp.assimilate(HMM2, xx, yy, liveplots=not nb)

 # #### Average the time series of various statistics
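
`my_custom_step_function` in the snippet above is left unspecified; one hypothetical stand-in, matching the `step(x0, t, dt)` signature used elsewhere in this thread, might be an Euler-Maruyama step (both the dynamics and the diffusion amplitude below are placeholders):

```python
import numpy as np

def my_custom_step_function(x0, t, dt):
    """Hypothetical stand-in for the custom stepper above: one
    Euler-Maruyama step with additive noise (the dynamics and the
    diffusion amplitude s are placeholders, not DAPPER defaults)."""
    s = 0.1                          # placeholder diffusion amplitude
    def dxdt(x):
        return -x                    # placeholder dynamics
    xi = np.random.standard_normal(np.shape(x0))
    return x0 + dt * dxdt(x0) + s * np.sqrt(dt) * xi
```

Any function with this signature should slot into `HMM2.step` in the diff above, while the original `HMM` keeps its own stepper for the truth twin.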

patnr commented 3 years ago

BTW, I suggest reviewing the changes on GitHub rather than by email. You get pretty coloring, and my up-to-date comments (often edited for tone :rofl: )

cgrudz commented 3 years ago

Ahahahaha sounds good, I'm just working from home at the moment while I take care of some things, once I get to the office later I'll get back into serious dev mode.

On 7/15/21 8:40 AM, Patrick N. Raanes wrote:

BTW, I suggest reviewing the changes on GitHub rather than by email. You get pretty coloring, and my up-to-date comments (often edited for tone 🤣 )


cgrudz commented 3 years ago

Good point, I'll have a look at this to see what redundancy can be eliminated.

On 7/15/21 12:22 PM, Patrick N. Raanes wrote:

@.**** commented on this pull request.


In dapper/mods/L96s/__init__.py https://github.com/nansencenter/DAPPER/pull/76#discussion_r670744966:

+def dxdt_autonomous(x):
+    return (shift(x, 1)-shift(x, -2))*shift(x, -1) - x
+
+def dxdt(x):
+    return dxdt_autonomous(x) + Force
+
+def step(x0, t, dt):
+    return rk4(lambda t, x: dxdt(x), x0, np.nan, dt)
+
+########################################################################################
+# 2nd order strong taylor SDE step
+
+def l96s_tay2_step(x, t, dt):

Got you, I think. You still might be able to avoid duplicating `Lorenz96.py`/`extras.py` and large parts of `__init__.py` by importing them from `Lorenz96` into `L96s`, I believe.


cgrudz commented 3 years ago

the trick is that this requires having two distinct time-steppers for the ensemble and the truth twin respectively. Is there any example where this is performed elsewhere in DAPPER?

I think this will do it (in the case of examples/basic_1.py):

@@ -15,6 +15,8 @@ import dapper.da_methods as da
 # #### Load experiment setup: the hidden Markov model (HMM)

 from dapper.mods.Lorenz63.sakov2012 import HMM
+HMM2 = HMM.copy()
+HMM2.step = my_custom_step_function

 # #### Generate the same random numbers each time this script is run

@@ -35,7 +37,7 @@ xp = da.EnKF('Sqrt', N=10, infl=1.02, rot=True)

 # #### Assimilate yy, knowing the HMM; xx is used to assess the performance

-xp.assimilate(HMM, xx, yy, liveplots=not nb)
+xp.assimilate(HMM2, xx, yy, liveplots=not nb)

 # #### Average the time series of various statistics

Ah ha, it seems to be working so far, but I now notice a possible issue with the next step. Part of the configuration I'm trying to build requires that the model twin actually use a coarser discretization in time than the truth twin.

In this case, the two grids align at the same absolute times when observations are given, but they partition the time interval differently for the dynamic propagation.

I'll look into whether there's a good way to implement this with minimal invasiveness to the code; it can probably be made to work within the example script itself if I just end up needing to subset the finer discretization.
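
For what it's worth, the subsetting idea can be sketched with hypothetical numbers: choose the coarse step as an integer multiple of the fine step, so that both partitions contain every observation time:

```python
import numpy as np

# Hypothetical numbers: a fine grid for the truth twin, a coarser grid for
# the model twin, with observations every dko coarse steps.
dt_fine, dt_coarse, dko = 0.005, 0.01, 10

fine   = np.arange(0, 1 + 1e-12, dt_fine)
coarse = np.arange(0, 1 + 1e-12, dt_coarse)

# The coarse grid is a strided subset of the fine grid ...
stride = int(round(dt_coarse / dt_fine))
assert np.allclose(coarse, fine[::stride])

# ... so every observation time lies on both partitions.
obs_times = coarse[::dko]
```

As long as dt_coarse is an exact multiple of dt_fine and the observation interval is a multiple of dt_coarse, the two steppers can run on their own grids and still meet at every observation time.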

patnr commented 3 years ago

Closed in favour of #77