redmod-team / profit

Probabilistic Response mOdel Fitting with Interactive Tools
https://profit.readthedocs.io
MIT License


Probabilistic Response Model Fitting with Interactive Tools

proFit is a collection of tools for studying parametric dependencies of black-box simulation codes or experiments and for constructing reduced-order response models over the input parameter space.

proFit is fed a set of data points, each consisting of an input parameter combination and the resulting output of the simulation under investigation. It then fits a response surface through the point cloud using Gaussian process regression (GPR) models. This probabilistic response model predicts ("interpolates") the output at as yet unexplored parameter combinations, including uncertainty estimates. It can also suggest where to place additional training points to gain the most new information (experimental design), and it can automatically generate and start new simulation runs locally or on a cluster. Results can be explored and checked visually in a web frontend.
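To make the idea of a probabilistic response model concrete, here is a minimal sketch of Gaussian process regression written directly in NumPy. It is not proFit's implementation or API: the function names `rbf_kernel`, `gp_predict` and `simulation`, the squared-exponential kernel, and the fixed length scale are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_new, noise=1e-6, length_scale=1.0):
    """Return the GP posterior mean and standard deviation at x_new."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_new, length_scale)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y, computed via the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    prior_var = rbf_kernel(x_new, x_new, length_scale).diagonal()
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def simulation(x):
    """Hypothetical stand-in for an expensive black-box code."""
    return np.sin(3.0 * x)

# Training data: input parameter values and the simulation output at each
x_train = np.linspace(0.0, 2.0, 8)
y_train = simulation(x_train)

# Predict inside the sampled range (0.5, 1.3) and far outside it (3.5)
x_new = np.array([0.5, 1.3, 3.5])
mean, std = gp_predict(x_train, y_train, x_new, length_scale=0.5)
for x, m, s in zip(x_new, mean, std):
    print(f"x={x:.2f}  prediction={m:+.3f} +/- {s:.3f}")
```

The standard deviation stays small between training points and grows toward the prior value far away from them, which is the uncertainty estimate the paragraph above refers to.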

Telling proFit how to interact with your existing simulations is easy and requires no changes to your existing code. Current functionality covers starting simulations locally or on a cluster via Slurm and subsequent surrogate modelling using GPy or scikit-learn, as well as an active learning algorithm that iteratively samples at interesting points and a Markov chain Monte Carlo (MCMC) algorithm. The web frontend for interactively exploring the point cloud and the surrogate is based on plotly/dash.
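One simple way to picture iterative sampling "at interesting points" is to always add the candidate where the surrogate is least certain. The sketch below does this with a NumPy Gaussian process. The maximum-variance criterion, the names `kernel` and `posterior_std`, and the candidate grid are assumptions for illustration; the source does not say which acquisition criterion proFit uses.

```python
import numpy as np

def kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel with unit prior variance."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)

def posterior_std(x_train, candidates, noise=1e-6):
    """GP posterior standard deviation at the candidate points."""
    K = kernel(x_train, x_train) + noise * np.eye(len(x_train))
    v = np.linalg.solve(np.linalg.cholesky(K), kernel(x_train, candidates))
    return np.sqrt(np.clip(1.0 - np.sum(v**2, axis=0), 0.0, None))

candidates = np.linspace(0.0, 1.0, 101)   # allowed parameter values
x_train = np.array([0.0, 0.1])            # deliberately clustered start
initial_max_std = posterior_std(x_train, candidates).max()

for step in range(5):
    std = posterior_std(x_train, candidates)
    x_next = candidates[np.argmax(std)]   # most uncertain candidate
    # A real study would run the simulation at x_next here.
    x_train = np.append(x_train, x_next)
    print(f"step {step}: new point at x={x_next:.2f} (std there was {std.max():.3f})")

final_max_std = posterior_std(x_train, candidates).max()
print(f"max std: {initial_max_std:.3f} -> {final_max_std:.3f}")
```

With fixed kernel hyperparameters the posterior variance depends only on where the training inputs are, so this loop needs no simulation output to choose points. Criteria that also use the predicted mean would require running the simulation at each step.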

Features

Installation

Currently, the code is under heavy development, so it should be cloned from GitHub via Git and pulled regularly.

Requirements

sudo apt install python3-dev build-essential

To enable compilation of the Fortran modules, the following is also needed:

sudo apt install gfortran

Dependencies

All dependencies are configured in setup.cfg and should be installed automatically when using pip.

Automatic tests use pytest.

Windows 10

To install proFit under Windows 10 we recommend using Windows Subsystem for Linux (WSL2) with the Ubuntu 20.04 LTS distribution (install guide).

After installing WSL2, execute the following steps in your Linux terminal (press y to continue when asked):

Make sure you have the right version of Python installed and the basic developer toolset available:

   sudo apt update
   sudo apt install python3 python3-pip python3-dev build-essential

To install proFit from Git (see below), make sure that the project is located in the Linux file system, not the Windows file system.

To configure the Python interpreter available in your Linux distribution in PyCharm (tested with the Professional edition), follow this guide.

Installation from PyPI

To install the latest stable version of proFit, use

pip install profit

For the latest pre-release, use

pip install --pre profit

Installation from Git

To install proFit for the current user (--user) in development mode (-e), use:

git clone https://github.com/redmod-team/profit.git
cd profit
pip install -e . --user

Fortran

Certain surrogates require a compiled Fortran backend. To enable compilation of the Fortran modules during installation:

USE_FORTRAN=1 pip install .

Troubleshooting installation problems

  1. Make sure you have all the requirements mentioned above installed.

  2. If pip is not recognized, try the following:

    python3 -m pip install -e . --user

  3. If pip warns you about PATH or proFit is not found, close and reopen the terminal and type profit --help to check whether the installation was successful.

Documentation using Sphinx

Install the requirements for building the documentation with Sphinx:

pip install .[docs]

Additionally, pandoc is required at the system level:

sudo apt install pandoc

HowTo

Examples for different model codes are available under examples/.

The integration tests under tests/integration_tests/ may also serve as informative examples.

Steps

  1. Create and enter a directory (e.g. study) containing profit.yaml for your run. If your code reads text configuration files for each run, copy the corresponding directory to template and replace the values of the parameters to be varied in the UQ/surrogate model with placeholders {param}.

  2. Run the simulations:

    profit run

    This starts simulations at all the points. By default, the generated input variables are written to input.txt and the output data is collected in output.txt.

    For each run of the simulation, proFit creates a run directory, fills the templates with the generated input data and collects the results. Each step can be customized in the configuration file.

  3. Fit the model:

    profit fit

    The fit can again be customized in profit.yaml.

  4. Explore data graphically:

    profit ui

    This starts a Dash-based browser UI.
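The placeholder mechanism from step 1 can be pictured with a few lines of Python. This is only a sketch of the general idea, not proFit's template code: the function `fill_template`, the parameter names `u` and `v`, and the file content are hypothetical.

```python
import re

def fill_template(text, params):
    """Replace each {name} that is a key of params; leave other braces alone."""
    def substitute(match):
        name = match.group(1)
        return str(params[name]) if name in params else match.group(0)
    return re.sub(r"\{(\w+)\}", substitute, text)

# Hypothetical configuration file of a simulation, with two varied parameters
template = "velocity = {u}\nviscosity = {v}\nlabel = {unchanged}\n"

# One generated input parameter combination
filled = fill_template(template, {"u": 4.9, "v": 0.57})
print(filled)
```

Substituting only known names, rather than calling `str.format` on the whole file, keeps any unrelated braces in the configuration file intact.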

The figure below gives a graphical representation of the typical proFit workflow described above. Red boxes denote user actions, while blue boxes denote steps carried out by proFit.

Cluster

proFit supports scheduling the runs on a cluster using Slurm. This is set up entirely in the configuration files; the commands stay the same.

profit ui starts a Dash server, which can also be reached remotely (e.g. via SSH port forwarding).

User-supplied files

Example directory structure: