OpenWaterAnalytics / EPANET

The Water Distribution System Hydraulic and Water Quality Analysis Toolkit
MIT License

Methodology for Validation #169

Open samhatchett opened 6 years ago

samhatchett commented 6 years ago

In addition to regression and unit testing, I believe the project could benefit from integrating a statistical validation procedure into the mix.

There are some references from other fields, such as Hassan and Bennett.

A basic strategy would be to have a set of inp files that describe experimental setups, paired with real measurement data from field evaluations. We would then test against them using peer-reviewed methods like the wonderfully named "GLUE" (Generalized Likelihood Uncertainty Estimation) or the perhaps easier-to-parse Stochastic Validation Approach in Luis and McLaughlin.
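As a rough illustration of what a GLUE-style check involves, here is a minimal sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for an EPANET run, the "observations" are synthetic rather than field data, and the Nash-Sutcliffe score with a 0.7 behavioral threshold is just one common choice of informal likelihood. It samples a parameter, keeps the "behavioral" runs, and reports how many observations fall inside the likelihood-weighted 5%-95% prediction bounds.

```python
import random


def toy_model(k, demands):
    # Hypothetical stand-in for a hydraulic run: pressure drops with demand squared.
    return [60.0 - k * d ** 2 for d in demands]


def nash_sutcliffe(sim, obs):
    # Informal likelihood measure: 1.0 is a perfect fit, lower is worse.
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def weighted_quantile(values, weights, q):
    # Smallest value whose cumulative weight reaches the fraction q of the total.
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


def glue(demands, observed, n_samples=2000, threshold=0.7, seed=1):
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_samples):
        k = rng.uniform(0.1, 1.0)          # sample from an assumed uniform prior
        sim = toy_model(k, demands)
        score = nash_sutcliffe(sim, observed)
        if score > threshold:              # keep only "behavioral" parameter sets
            behavioral.append((score, sim))
    weights = [score for score, _ in behavioral]
    bounds = []
    for i in range(len(observed)):
        values = [sim[i] for _, sim in behavioral]
        bounds.append((weighted_quantile(values, weights, 0.05),
                       weighted_quantile(values, weights, 0.95)))
    inside = sum(lo <= o <= hi for o, (lo, hi) in zip(observed, bounds))
    return bounds, inside / len(observed), len(behavioral)


# Synthetic "measurements": the toy model at k = 0.5 plus Gaussian noise.
demands = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = random.Random(0)
observed = [p + noise.gauss(0.0, 0.5) for p in toy_model(0.5, demands)]

bounds, coverage, n_behavioral = glue(demands, observed)
print(f"{n_behavioral} behavioral runs, coverage = {coverage:.2f}")
```

In a real setup the toy model would be replaced by a simulation of each inp file and `observed` by the corresponding field measurements; the coverage figure is then the quantity to track across code versions.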

Overall, this would go some way toward addressing the reproducibility question: regression testing constrains you to reproduce errors present in past versions of the code, rather than working toward better statistical validity, which is the real point.

A special note - I have to credit Rakesh Bahadur at Leidos for directing me to this research and encouraging us to pursue this validation approach. My hat is off to him for seeing the bigger picture.

LRossman commented 6 years ago

Please don't conflate the notions of model verification and model validation. They are two different concepts. See link.

samhatchett commented 6 years ago

Thanks @LRossman - I suppose this deserves a fuller conversation, but I totally agree. Verification can be done through unit testing, but validation must be done statistically, right? Do you have any historical perspective on either question as it relates to EPANET?

LRossman commented 6 years ago

With regard to EPANET's hydraulic solver, verification can be built into the code itself by having it report the error in meeting the conservation of energy and flow equations it is solving (although I suppose a regression test of new versions is still needed to confirm that these errors are being computed correctly). With water quality, an internal verification check would be the reporting of mass balance errors.
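To make the hydraulic check concrete, here is a minimal sketch of computing those two residuals for a solved network. The `Link` structure, the head-loss form h = r·q·|q|^(n-1), and the three-node example are assumptions for illustration only; this is not EPANET's internal code or its toolkit API.

```python
from dataclasses import dataclass


@dataclass
class Link:
    start: str
    end: str
    flow: float        # positive when flowing from start to end
    resistance: float  # hypothetical coefficient r in h = r * q * |q|**(n - 1)


def continuity_errors(links, demands):
    # Net inflow minus demand at every node that has a demand entry.
    residual = {node: -demand for node, demand in demands.items()}
    for link in links:
        if link.start in residual:
            residual[link.start] -= link.flow
        if link.end in residual:
            residual[link.end] += link.flow
    return residual


def energy_errors(links, heads, exponent=2.0):
    # Head difference across each link minus the head loss implied by its flow.
    errors = []
    for link in links:
        loss = link.resistance * link.flow * abs(link.flow) ** (exponent - 1)
        errors.append(heads[link.start] - heads[link.end] - loss)
    return errors


# Reservoir R feeds junction A, which feeds junction B.
links = [Link("R", "A", flow=3.0, resistance=1.0),
         Link("A", "B", flow=1.0, resistance=2.0)]
demands = {"A": 2.0, "B": 1.0}             # the reservoir has no demand entry
heads = {"R": 100.0, "A": 91.0, "B": 89.0}

max_flow_error = max(abs(e) for e in continuity_errors(links, demands).values())
max_head_error = max(abs(e) for e in energy_errors(links, heads))
print(max_flow_error, max_head_error)
```

A converged solution should drive both maxima toward zero; reporting them alongside the results gives the kind of built-in verification described above.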

Model validation is a tougher nut to crack since it depends on so many uncontrollable factors. Plus there is a philosophical belief that models of open systems can never be fully validated (see the classic paper by Oreskes et al.; don't be confused by their switching the definitions of verification and validation). Instead of trying to show that a model is valid, statistically or otherwise, it might be more productive to provide empirical evidence that it provides useful information for its intended use.

dickinsonre commented 6 years ago

Hello Lew, is that the right link for the Oreskes et al. paper?

LRossman commented 6 years ago

Re: the Oreskes et al. paper - if left-clicking on the link doesn't work, try right-clicking it and selecting 'Search Google for "..."'.

dickinsonre commented 6 years ago

Thanks @LRossman. I am happy that your suggestion worked for me, and I now have the paper.