Motivation and background
Work began around 2020 envisioning an improved and increasingly automated way to
calibrate dvmdostem. This effort began with code that could perform a
rudimentary sensitivity analysis. Efforts over the next several years saw
additions that could combine a sensitivity analysis with a subsequent
optimization step that calculates optimized (calibrated) parameters. These two
components ("sensitivity" and "calibration") were assembled into a set of
steps that team members used to calibrate dvmdostem for new sites and
conditions. The code for this was developed with a research and development
mindset that resulted in a solid working prototype. This prototype is the state
of the calib branch as of commit 27e3d41bfaa34cde9bf5cb1c9ac4736dc623bc3a
(Sept 21, 2023).
This pull request is the refactoring that was inspired by feedback from and
observations of the folks using the prototype. Hopefully the changes here are
more intuitive for users trying to calibrate dvm-dos-tem and make the code
easier to expand upon.
Goals of refactoring
re-use common code, remove duplicated code
streamline naming conventions, make process more intuitive
make naming across code and documentation more consistent
consolidate and update documentation
leave room for extensibility (e.g. using an alternate optimization algorithm,
adding more analysis and visualization, etc.)
High level code changes
Built Python class hierarchy for "driver" objects so that similar code
between the sensitivity analysis and the Mads optimization step could be
shared.
Added command line interface to docker-build-wrapper.sh with control over
which images are built.
Improved aspects of developer experience in Docker containers relating to
default PATH variables and ease of importing scripts.
Reorganized parts of the documentation to better describe the
sensitivity analysis and optimization (MADS) tools and workflow.
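To illustrate the idea behind the "driver" class hierarchy listed above, here is a minimal sketch of how a shared base class can hold code common to the sensitivity analysis and the optimization step. All names here (`BaseDriver`, `SensitivityDriver`, `CalibrationDriver`, `update_params`, the example parameter names) are hypothetical and chosen for illustration; they are not necessarily the classes or methods in this pull request, and the model run is replaced by a placeholder.

```python
from abc import ABC, abstractmethod


class BaseDriver(ABC):
    """Hypothetical base class holding state shared by SA and calibration."""

    def __init__(self, work_dir, params):
        self.work_dir = work_dir
        self.params = dict(params)  # parameter name -> current value

    def update_params(self, new_values):
        """Shared helper: overwrite known parameters by name."""
        unknown = set(new_values) - set(self.params)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        self.params.update(new_values)

    @abstractmethod
    def run(self):
        """Each concrete driver decides how the model is run."""


class SensitivityDriver(BaseDriver):
    """Runs once per parameter sample (placeholder instead of a model run)."""

    def __init__(self, work_dir, params, samples):
        super().__init__(work_dir, params)
        self.samples = samples  # list of {name: value} dicts

    def run(self):
        results = []
        for sample in self.samples:
            self.update_params(sample)
            # A real driver would launch the model here; this sketch only
            # records the parameter set that would have been used.
            results.append(dict(self.params))
        return results


class CalibrationDriver(BaseDriver):
    """Runs once with the current parameters, e.g. per optimizer iteration."""

    def run(self):
        return dict(self.params)
```

In this sketch, parameter handling lives only in `BaseDriver`, so the two subclasses differ only in how `run` is carried out.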
Additions
Added some testing data and several more tests. This pattern is not entirely
settled yet, but the rough idea is that data required for tests should be kept in
a testing-data/ folder at the root of the project. This folder will be
tracked with git-lfs to avoid bloating the repo for uses that don't
require running the test suite. The paths need to be updated in many of the
existing tests and some of the data still need to be added to the
testing-data/ directory and committed.
util/metrics.py Python script with stubs for various plotting and
measurement functions (e.g. RMSE, MAPE, etc.)
Stubs for doing an "equilibrium analysis" as part of the Sensitivity Analysis.
Plotting functions and analysis for doing a Nitrogen limitation check.
Concept of "seed path" for parameter and targets in SA and CA
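For readers unfamiliar with the measurement functions mentioned for util/metrics.py, the standard definitions of RMSE and MAPE can be sketched as follows. This is an illustrative sketch only: the function names, signatures, and error handling are assumptions and may differ from the stubs in this pull request.

```python
import math


def rmse(modeled, observed):
    """Root mean square error between modeled and observed values."""
    if len(modeled) != len(observed) or len(modeled) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - o) ** 2 for m, o in zip(modeled, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mape(modeled, observed):
    """Mean absolute percentage error, in percent, relative to observations."""
    if len(modeled) != len(observed) or len(modeled) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        # MAPE is undefined when an observed value is zero.
        raise ValueError("observed values must be non-zero")
    ratios = [abs((o - m) / o) for m, o in zip(modeled, observed)]
    return 100.0 * sum(ratios) / len(ratios)
```

Both functions compare model outputs against targets (observations); MAPE is scale-free but cannot be used where a target value is zero.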
Term/phrase/concept migration
run_mads_sensitivity.py --> replaced by -->
SA_setup_and_run.py
SA_post_hoc_analysis.py --> attempts to collect -->
mads_calibration/utils.py, various utils circulated by email
"workflow" and "steps" --> clarified by -->
recommended directory layout and attempt at making documentation more
consistent
Sensitivity Analysis --> referred to as --> SA
Calibration --> referred to as --> CA, ca, optimization, Mads Assisted Calibration
AC --> synonym for --> "assisted calibration" or "Mads Assisted Calibration"
Auto-Calibration --> replaced by --> Assisted Calibration
Targets --> synonym for --> Observations
Known Issues
The core functionality of a Sensitivity Analysis being used to feed a Mads
optimization (calibration) has been maintained from the calib branch
prototype (commit 27e3d41bfaa), but not all code and options have been
carried over from the various scripts still in the repo and the various
email/Slack chains that were circulated.
The documentation has not been finished for all functions.
The documentation and examples have not been updated (or removed, or
migrated to the Sphinx documentation).
Testing
With the help of @Benjamin-Maglio we have extensively tested the sensitivity
analysis tools and process and the Mads optimization process, and are confident
that the behavior matches or exceeds the behavior of the pre-refactored
code (commit 27e3d41b). Neither the C++ model core nor the parameters were
modified, so the model's scientific behavior has not changed.