thiery-lab / data-assimilation

Python code for data assimilation methods
MIT License

Data assimilation in Python

Python code for data assimilation inference methods and test models.

The models implemented include

The inference methods implemented include

Example usages of the models and inference methods are provided in the Jupyter notebooks in the notebooks directory.

Dependencies

The dapy package is intended for use with Python 3.8+. We recommend using a tool such as Conda to create an isolated Python 3 environment in which to install the package and its dependencies. The minimal requirements for using the inference methods and model classes implemented in the dapy package are NumPy, SciPy and Numba. Appropriate versions of these packages will be installed automatically when installing the package using pip.

To install the dependencies manually in a Conda environment run

conda install numpy scipy numba

or using pip

pip install numpy scipy numba
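
Before building the package it can be useful to confirm that the minimal requirements listed above are importable in the active environment. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical and is not part of dapy.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names of the minimal requirements: NumPy, SciPy and Numba.
    missing = missing_packages(["numpy", "scipy", "numba"])
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only indicates that the modules can be located; it does not check that the installed versions are compatible.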

The ensemble transport particle filter inference methods require solving optimal transport problems. A C++ extension module (written in Cython) wrapping a network simplex linear programming based exact solver from the C++ graph library LEMON is included in the dapy.ot sub-package. Alternatively, solvers from the Python Optimal Transport (POT) library can be used if it is available. To install POT in the current Conda environment run

conda install -c conda-forge pot

or using pip with

pip install POT
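
For orientation, the optimal transport problem solved in each ensemble transport particle filter update can be written as the linear program below. This is a sketch following the formulation in Reich (2013) [14]; the notation (N particles x_i with normalised importance weights w_i, coupling matrix T) is chosen here for illustration and may differ from the conventions used in the dapy source.

```latex
\begin{aligned}
T^{*} = \operatorname*{arg\,min}_{T \in \mathbb{R}^{N \times N}}
  \quad & \sum_{i=1}^{N} \sum_{j=1}^{N} T_{ij} \, \lVert x_i - x_j \rVert^2 \\
\text{subject to} \quad
  & T_{ij} \ge 0, \qquad
    \sum_{i=1}^{N} T_{ij} = \frac{1}{N} \;\; \text{for all } j, \qquad
    \sum_{j=1}^{N} T_{ij} = w_i \;\; \text{for all } i, \\
\text{with updated particles} \quad
  & x_j^{\mathrm{a}} = N \sum_{i=1}^{N} T^{*}_{ij} \, x_i .
\end{aligned}
```

The constraints make each updated particle a convex combination of the original particles, and the mean of the updated ensemble equals the weighted mean of the original one. An exact solver such as network simplex returns a minimiser of this linear program.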

The pyFFTW package may also optionally be used for more efficient fast Fourier transform computations in models using spectral expansions. To install it with Conda run

conda install -c conda-forge pyfftw

or using pip with

pip install pyfftw
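
An optional dependency of this kind is commonly handled with an import fallback. The sketch below illustrates the general pattern only and is not necessarily how dapy selects its FFT backend; the function name load_fft_backend is hypothetical. It relies on pyfftw.interfaces.numpy_fft being a drop-in replacement for numpy.fft.

```python
import importlib


def load_fft_backend():
    """Return (name, module) for the first importable FFT backend.

    pyFFTW's NumPy-compatible interface is preferred; numpy.fft is the
    fallback. Returns (None, None) if neither can be imported.
    """
    for name in ("pyfftw.interfaces.numpy_fft", "numpy.fft"):
        try:
            return name, importlib.import_module(name)
        except ImportError:
            # Backend not installed in this environment; try the next one.
            continue
    return None, None


if __name__ == "__main__":
    backend_name, fft = load_fft_backend()
    print("FFT backend:", backend_name)
```

Because both modules expose the same function names (for example rfft and irfft), code written against the returned module does not need to know which backend was loaded.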

The example Jupyter notebooks include plots produced using Matplotlib. To be able to run the notebooks locally, the following additional packages should be installed with conda using

conda install jupyter matplotlib

or using pip with

pip install jupyter matplotlib

Installing the dapy package

The package includes several Cython extension modules which are provided as both Cython and C / C++ source. To build the extensions directly from the C / C++ source files (which does not require Cython to be installed) run

python setup.py build_ext

To build the extensions using Cython (install with conda install cython or pip install cython) run

python setup.py build_ext --use-cython

This builds the extension modules directly from the Cython source files with Cython optimisations enabled, which give performance improvements at the cost of less safe array access (these can be disabled with the optional argument --no-cython-opts).
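
To illustrate what trading safety of array access for speed typically means in Cython, the sketch below builds a dictionary of compiler directives of the kind accepted by the compiler_directives argument of Cython.Build.cythonize. The specific directives shown are an assumption for illustration; the set actually toggled by --no-cython-opts in this package's setup.py is not stated here, and the function name cython_directives is hypothetical.

```python
def cython_directives(use_opts=True):
    """Return a Cython compiler-directive dictionary.

    With use_opts=True the directives disable run-time checks on array
    indexing, which is faster but can read or write out of bounds if the
    indexing code is wrong.
    """
    directives = {"language_level": 3}
    if use_opts:
        directives.update(
            boundscheck=False,  # do not check indices against array bounds
            wraparound=False,   # do not support negative (wrap-around) indices
        )
    return directives


if __name__ == "__main__":
    # Would be passed on as:
    #   cythonize(extensions, compiler_directives=cython_directives())
    print(cython_directives())
    print(cython_directives(use_opts=False))
```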

The dapy package can then be installed into the current environment by running

pip install .

or to install in editable mode

pip install -e .

References

  1. Netto, M. A., Gimeno, L. and Mendes, M. J. (1978). A new spline algorithm for non-linear filtering of discrete time systems. IFAC Proceedings Volumes, 11(1), 2123-2130.
  2. Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the atmospheric sciences, 20(2), 130-141.
  3. Lorenz, E. N. (1996). Predictability - A problem partly solved. In Proceedings of Seminar on Predictability (1). European Centre for Medium-Range Weather Forecasts.
  4. Majda, A. J. and Harlim, J. (2012). Filtering complex turbulent systems. Cambridge University Press.
  5. Kuramoto, Y. and Tsuzuki, T. (1976). Persistent propagation of concentration waves in dissipative media far from thermal equilibrium. Progress of theoretical physics (55), pp. 356--369.
  6. Sivashinsky, G. (1977). Nonlinear analysis of hydrodynamic instability in laminar flames -- I. Derivation of basic equations. Acta Astronautica (4), pp. 1177--1206.
  7. Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Transactions of the ASME -- Journal of Basic Engineering, Series D, 82, pp. 35--45.
  8. Evensen, G. (1994). Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research, 99 (C5), pp. 10143--10162.
  9. Burgers, G., van Leeuwen, P. J. and Evensen, G. (1998). Analysis scheme in the ensemble Kalman filter. Monthly Weather Review, (126) pp. 1719--1724.
  10. Doucet, A., S. Godsill, and C. Andrieu (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197-208.
  11. Tippett, M. K., Anderson, J. L., Bishop, C. H., Hamill, T. M. and Whitaker, J. S. (2003). Ensemble square root filters. Monthly Weather Review, 131, pp. 1485--1490.
  12. Hunt, B. R., Kostelich, E. J. and Szunyogh, I. (2007). Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D: Nonlinear Phenomena, 230(1), 112-126.
  13. Gordon, N.J., Salmond, D.J. and Smith, A.F.M. (1993). Novel approach to nonlinear / non-Gaussian Bayesian state estimation. Radar and Signal Processing, IEE Proceedings F. 140 (2): 107--113.
  14. Reich, S. (2013). A nonparametric ensemble transform method for Bayesian inference. SIAM Journal on Scientific Computing, 35(4), A2013-A2024.
  15. Cheng, Y. and Reich, S. (2015). Assimilating data into scientific models: An optimal coupling perspective. In Nonlinear Data Assimilation, pp 75--118. Springer.
  16. Graham, M. M. and Thiery, A. H. (2019). A scalable optimal transport based particle filter. arXiv preprint 1906.00507.