CliMA / OceanParameterizations.jl

Machine learning (and uncertainty quantification?) of climate model parameterizations using differentiable (and probabilistic?) programming.
MIT License
21 stars 2 forks source link

Useful modeling tools: Field abstraction, staggered grid functionality #38

Open glwagner opened 3 years ago

glwagner commented 3 years ago

I think OceanParameterizations.jl probably has the potential to supercede OceanTurb.jl, since in a sense its requirements are a subset of what OceanTurb provides. I think it'd be amazing if the package offered practical, useful tools for column modeling, parameterization development and differentiation-based optimization and UQ.

To become a useful modeling package, I think OceanParameterizations needs something similar to OceanTurb / Oceananigans's Field abstraction. We also want composable operators for differentiating / interpolating fields on a staggered grid, and a template for implementing PDEs that hooks into DifferentialEquations for time integration. Some aspect of these abstractions must already exist given the results I've seen, but I'm not sure how mature the functionality is.

We could even just import Oceananigans functions if we are ok hard-coding three-dimensionality. Since the ocean is 3D, I think this is ok, if slightly inconvenient for column modeling. One may want to optimize 3D parameterizations eventually (eg mesoscale closures, etc...)

Along the way we want to ensure that models remain differentiable and machine-learnable.

If we can build a simple Field abstraction and PDE template, then we can probably port both KPP and OceanTurb's TKE-based closure into OceanParameterizations without too much pain.

I'm interested in people's thoughts --- would this be welcome, unwelcome? What is already present in the package that could be leveraged? What might have to be re-written?

@navidcy might be interested in this too, and @kburns if he wants to train NN's for two-dimensional turbulent flows during time-integration (and can stomach using FV).

Some notes

Fields and staggered grid operators are implemented in OceanTurb, but because the implementation pre-dates Oceananigans, it is a bit quirky and slightly less usable than Oceananigans Fields and operators. OceanTurb also suffers because it doesn't use DifferentialEquations. It's PDE template is probably approximately what we want, however (a diffusion term, an advection term, and a source term, with tools for treating diffusion + advection implicitly during time-integration).

glwagner commented 3 years ago

@ali-ramadhan what are your thoughts?

ali-ramadhan commented 3 years ago

I generally agree that OceanParameterizations.jl can be a home/playground for developing ocean parameterizations in Julia and I'd be interested in working together to make it happen.

Right now I feel the repository is kinda undergoing growing pains with the OceanParameterizations module not actually containing all that much except a few useful shared utils. Most development is happening in separate environments, e.g. wind_mixing and free_convection where most of the code lives. I think @adelinehillier, @xkykai, and I kinda settled on this workflow for now as we figure out which shared utils should live in OceanParameterizations (#26). There's also the practical issue that scientific machine learning workflows have pretty heavy dependencies (#24) so we're trying to keep the OceanParameterizations module as lightweight as possible for now.

All this to say I think things are still being figured out but if the goal is to develop a useful playground for ocean parameterization development then this can help guide the development of the OceanParameterizations module.


I think the post-processing frameworks of Oceananigans and OceanTurb are at odds with the current AD capabilities of Julia because Zygote doesn't support mutating arrays (unless we wanted to go down the rabbit hole of defining custom rules for Oceananigans and OceanTurb with ChainRules.jl). There is a new exciting AD package coming out soon(?) called Diffractor.jl but I don't think support for mutation is coming anytime soon so we might be somewhat stuck with two different frameworks:

  1. a fast and efficient non-allocating post-processing framework for Oceananigans.jl and OceanTurb.jl, and
  2. an allocating post-processing framework that is AD friendly.

An example of how we've implemented a simple operation: instead of using ∂z or ∂zᵃᵃᶜ, we construct a differentiation matrix that operates on a vector v to return ∂z(v): https://github.com/CliMA/OceanParameterizations.jl/blob/fdb5b9519fa8d2012b82cb4a5d01208b8bd41d71/src/differentiation_operators.jl#L1-L14

This is a very barebones use, i.e. no halos or boundary condition options. We may be able to link the two frameworks a bit more, e.g. construct Dᶜ using Oceananigans.Operators.

I guess to me it feels like full AD support right now requires an entirely different slow and memory-allocating framework, which is completely at odds to the post-processing frameworks of Oceananigans and OceanTurb.

Do we think full AD support is worth the sacrifice?

An unfortunate consequence is that you might need to implement a parameterization twice, one AD friendly one for training then another fast one for actual use (once parameters have been inferred following training). For parameterizations like KPP this might be a huge implementation and maintenance burden.

Training neural differential equations requires AD and once it's trained it's easy to embed in e.g. Oceananigans so the sacrifice seems worth it here, but I'm less sure about e.g. KPP.

ali-ramadhan commented 3 years ago

I should also say that a "memory-allocating implementation" may be slower but may not be too slow for development purposes, i.e. not too slow for non-AD approaches like Ensemble Kalman Inversion.

The "very fast and efficient non-allocating version" is probably only needed once the parameterization is embedded into a GCM where performance is crucial. But perhaps this fast implementation only needs to happen once lots of training has been done and it's been confirmed that the parameterization is worth implementing and embedding into a GCM?

glwagner commented 3 years ago

I think there are a few applications that require efficient implementations (MCMC being the only one I can think of now) but they are in the minority. I think slow is ok, especially in 1D. Certainly for most forward modeling applications speed isn't that important in 1D if its the difference between integration finishing in a 1/10th of a second or 2 seconds.

Can we come up with a design that separates the specification of a parameterization from whether or not it is implemented / time-integrated with a mutating / non-allocating algorithm or an allocating algorithm? With such a design we can worry first about AD-able methods and then implement an optimized non-allocating algorithm later if we need to.

ali-ramadhan commented 3 years ago

I think there are a few applications that require efficient implementations (MCMC being the only one I can think of now).

True. Although the nice thing about MCMC methods is you can generally run N >> 1 chains in parallel (trivial since chains are independent) then merge the samples accepted by all the chains.

Can we come up with a design that separates the specification of a parameterization from whether or not it is implemented / time-integrated with a mutating / non-allocating algorithm or an allocating algorithm?

I think so. One concrete example I can think of is that if the parameterization is time-stepped with DifferentialEquations.jl, then we just need to define a function for the right-hand-side like f(u, p, t) and the non-allocating version would just be f!(du, u, p, t). Both can be passed to ODEProblem so the rest of the interface (parameterization set up, parameter specification, analysis/visualization of the solution) all remains the same no matter whether you're using f or f!.

glwagner commented 3 years ago

I think so. One concrete example I can think of is that if the parameterization is time-stepped with DifferentialEquations.jl, then we just need to define a function for the right-hand-side like f(u, p, t) and the non-allocating version would just be f!(du, u, p, t). Both can be passed to ODEProblem so the rest of the interface (parameterization set up, parameter specification, analysis/visualization of the solution) all remains the same no matter whether you're using f or f!.

Hmm yeah.

We can prototype the design on the diffusion equation with a variable diffusivity / convective adjustment problem.