Expand & automate testing system

aufdenkampe commented 4 years ago

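As we discussed for our RESPEC-LimnoTech Collaborative Work Plan (https://docs.google.com/document/d/1jgH_-ly9_VcW_fYRGZOQqv12LY1oALd-tQMLFgOCMYQ/edit#bookmark=id.vhiwqmnyt24j) during our workshop (March 24-25, 2020), expanding and automating the testing of HSP2 vs. HSPF is an immediate priority.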

Our objective for testing is to ensure that HSP2 provides the same results as HSPF for:

- all HSP2 releases,
- several relevant operating systems and software environments, and
- a selected group of watershed models that have been calibrated and examined for real-world water management applications and that represent a range of watershed properties.

We decided that:

- HSPF “reference” model runs should be added to repo and considered static/stable
- HSP2 outputs will continually evolve, expanding as new process modules are implemented
- Comparisons will be point-by-point for major output time series, to "byte-precision" of about 3 significant figures to allow for rounding errors
  - We can't do traditional unit testing of individual routines because HSPF doesn't save that data.
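A minimal sketch of what such a point-by-point comparison could look like. The function name `compare_series`, the sample values, and the choice of a relative tolerance of 10^-3 as a stand-in for "about 3 significant figures" are all assumptions for illustration, not the project's actual test code.

```python
import math


def compare_series(reference, candidate, sig_figs=3, abs_tol=1e-12):
    """Return the indices where candidate departs from reference.

    Assumption: "about N significant figures" is approximated by a
    relative tolerance of 10**-N. abs_tol only guards comparisons
    between values that are both (near) zero.
    """
    if len(reference) != len(candidate):
        raise ValueError("series must have the same length")
    rel_tol = 10.0 ** (-sig_figs)
    mismatches = []
    for i, (ref, new) in enumerate(zip(reference, candidate)):
        if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append(i)
    return mismatches


# Hypothetical values standing in for an HSPF reference series
# and the matching HSP2 output series.
hspf_flow = [12.34, 0.0, 1500.2, 0.004567]
hsp2_flow = [12.341, 0.0, 1500.9, 0.004999]
print(compare_series(hspf_flow, hsp2_flow))  # only the last point differs: [3]
```

Returning the mismatching indices, rather than a single pass/fail flag, makes it easier to see where in a time series the two models diverge.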

RESPEC has two test models to contribute:

- Test10
- Calleg

LimnoTech will add additional models:

- We selected 2 watersheds that we’ve recently modeled in HSPF, and selected a single sub-watershed (to simplify running)
  - Grant River, MI. Relatively simple
  - Zumbro River, MN. More complicated. Full water quality suite.
- Hydrological Response Unit (HRU) testing
  - 5-10 micro watersheds (1 HRU + a few stream reaches)

Let's use this issue to track progress on all the smaller tasks required to complete this. We have already added some reference models and testing code with https://github.com/respec/HSPsquared/commit/49c71f3868da031d60a9bbd284e295411df484e8, https://github.com/respec/HSPsquared/commit/60378a7adb1231a8a4f3426133bc9bcedcf2d309, and https://github.com/LimnoTech/HSPsquared/commit/130bef26ead195ffa364d5b4f01928c5f1cd5c78.

cc: @rheaphy, @PaulDudaRESPEC, @steveskrip, @ptomasula

rheaphy commented 4 years ago

Hi, one thing that would be useful in (at least some) test cases is actual watershed measurement data. It is one thing to validate HSP2 against HSPF, but in the longer view HSP2 should be validated against actual data too. HSPF Test10 didn't come with this. I am not sure whether the Calleg test case has actual data archived somewhere, but I am not aware of any.

Bob

JasonLoveRespec commented 4 years ago

Bob

I don’t necessarily disagree, but HSPF has been used for over 40 years to develop applications that are calibrated to data, and that experience has shown it to be a useful tool. Thus, I believe that constraining the scope to validating that HSP2 can reproduce HSPF results has merit.

aufdenkampe commented 3 years ago

Here's a suggestion from @PaulDudaRESPEC for potential tests that we can adapt from the HSPF testing list:

I went back to our archive of standard HSPF tests to see what we have available there. I’ve summarized them below, and I highlighted the ones that I think might be helpful. They are all available here: https://github.com/respec/FORTRAN/tree/master/test/hspf/standard/Current

I’m pretty sure you’re already using Test10, which has a lot of WQ in it. The other ones I suggest would be Test07 and Test08, which exercise most of the PERLND WQ modules. The only hitch is that they also use Special Actions, and since I don’t think we have an equivalent to Special Actions yet in HSP2 we may need to comment out that part of the UCIs – for the purposes of matching the HSP2 results to the HSPF results.
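One way the Special Actions part of a UCI could be commented out is sketched below. It rests on two assumptions that should be checked against the HSPF documentation: that the block is delimited by `SPEC-ACTIONS` and `END SPEC-ACTIONS` lines, and that any UCI line containing `***` is treated as a comment. The function name `comment_out_block` is hypothetical, and prefixing shifts the fixed columns of the affected lines, which should only matter if they are later uncommented.

```python
def comment_out_block(uci_lines, block="SPEC-ACTIONS"):
    """Return a copy of uci_lines with one block turned into comments.

    Assumptions: the block runs from a line reading `block` to a line
    reading "END " + block, and a line containing "***" is a comment.
    """
    out = []
    inside = False
    for line in uci_lines:
        stripped = line.strip()
        if stripped == block:
            inside = True
        if inside and "***" not in line:
            out.append("*** " + line)  # mark the line as a comment
        else:
            out.append(line)  # outside the block, or already a comment
        if stripped == "END " + block:
            inside = False
    return out


# Hypothetical, heavily shortened UCI content for illustration only.
sample = [
    "RUN",
    "SPEC-ACTIONS",
    "  (special action lines would appear here)",
    "END SPEC-ACTIONS",
    "END RUN",
]
for line in comment_out_block(sample):
    print(line)
```

Working on a list of lines keeps the sketch independent of file handling; the reference UCI used for the HSPF run would need the same block disabled so that both models see identical inputs.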

Here’s the list of standard test runs:

TEST01.uci, TEST02.uci, TEST03.uci – Importing data to WDM (MUTSIN, COPY)

TEST04.uci – Display of data in WDM

TEST05.uci – PERLND with SNOW, PWATER

TEST06.uci – DURANL (duration analysis)

TEST07.uci – PERLND with ATMP, SNOW, PWATER, SEDMNT, PSTEMP, PWTGAS, PQUAL, MSTLAY, PEST, and Special Actions

TEST08.uci – PERLND with SNOW, PWATER, SEDMNT, PSTEMP, MSTLAY, NITR, PHOS, TRACER, and Special Actions

TEST09.uci – PERLND with SNOW, PWATER, RCHRES HYDR

TEST10.uci – PERLND with SNOW, PWATER, PSTEMP, PWTGAS, IMPLND with SNOW, IWATER, SOLIDS, IWTGAS, IQUAL, RCHRES with HYDR, ADCALC, CONS, HTRCH, SEDTRN, GQUAL, OXRX, NUTRX, PLANK, PHCARB

TEST11.uci – PERLND SNOW and PWATER with metric units

TEST12.uci – PERLND with SNOW, PWATER, SEDMNT, PSTEMP, MSTLAY, NITR, PHOS, IMPLND with SNOW, IWATER, SOLIDS, IWTGAS, IQUAL, RCHRES with HYDR, ADCALC, HTRCH, SEDTRN, OXRX, NUTRX, PLANK *uses DSS, conditional, user defined and distributed special actions

TEST13.uci – RCHRES HYDR with water categories and conditional special actions

TEST14.uci – Inputting test data to DSS

TEST15.uci – PERLND PWATER with IHM changes and binary output

TEST16.uci – PERLND with SNOW, PWATER, SEDMNT, PSTEMP, PWTGAS, IMPLND with SNOW, IWATER, SOLIDS, IWTGAS, RCHRES with HYDR, ADCALC, HTRCH, SEDTRN *uses multiple canopy layer enhancement

TEST17.uci – PERLND with SNOW, PWATER, PSTEMP, PWTGAS, IMPLND with SNOW, IWATER, SOLIDS, IWTGAS, IQUAL, RCHRES with HYDR, ADCALC, CONS, HTRCH, SEDTRN, GQUAL, OXRX, NUTRX, PLANK, PHCARB *same as TEST10 except uses DO in PRECIP RCHRES enhancement

cc: @steveskrip, @benjamincrary

aufdenkampe commented 3 years ago

For our records...

In Bob's May 14 "HSP2 status update" email, he shares the following:

Yesterday morning I updated the HSPsquared GitHub main branch repository. For the Calleg test case, PWATER is now as accurate as in HSPF (worst case was 0.0003 percent difference or almost exact to 6 significant digits) and RCHRES HYDR improved so the worst case is 0.0207 percent difference. IWATER was already an exact match to HSPF to single precision accuracy.

I found the two bugs when testing HSP2 for indelt times under 1 hour. I am still improving my testing to ensure HSP2 works at any reasonable indelt. (I am currently testing 5 hour and 20 minute along with the original 1 hour tests.)
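For reference, a worst-case percent difference like the ones quoted above could be computed along these lines. This is a sketch under assumptions, not Bob's notebook code: the function name `worst_percent_difference`, the decision to skip points where the reference value is zero, and the sample numbers are all hypothetical.

```python
def worst_percent_difference(reference, candidate):
    """Largest |candidate - reference| / |reference| * 100 over a series.

    Assumption: points where the reference is exactly zero are skipped,
    because a percent difference is undefined there.
    """
    if len(reference) != len(candidate):
        raise ValueError("series must have the same length")
    worst = 0.0
    for ref, new in zip(reference, candidate):
        if ref == 0.0:
            continue
        worst = max(worst, abs(new - ref) / abs(ref) * 100.0)
    return worst


# Hypothetical values: one point is off by 0.04 out of 200.
hspf_values = [100.0, 200.0, 0.0, 50.0]
hsp2_values = [100.0, 200.04, 0.0, 50.0]
print(f"{worst_percent_difference(hspf_values, hsp2_values):.4f} percent")
```

A single worst-case number is a compact summary, but it hides where the largest deviation occurs, so it is probably best reported alongside the point-by-point comparison.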

In Bob's May 14 "HSP2 HTML image of calleg test run" email, he wrote:

I am attaching a zip-compressed HTML file, which is an HTML export of the JupyterLab notebook for the Calleg test, in case anyone wants the details for my previous email.

I added TESTcalleg.html to the repo on May 19 via 9882d83 as part of PR #34. We do not yet have a copy of the TESTcalleg.ipynb that created those results.

aufdenkampe commented 3 years ago

I just discovered that Bob emailed @steveskrip the file on May 28 in "Re: calleg HSPF results" thread.

I just committed it with https://github.com/LimnoTech/HSPsquared/commit/f112f9e7139d7fb88fbc2a094b0ff535173286e0.

aufdenkampe commented 3 years ago

@TongZhai, thanks for all your recent commits (c556ff777846a3fabb5d2a4cc0129670fa5d7631, 15f111b8a00c5a61706e3bd6d8babe2678902803, 8a2dce703e39ed0d976de51cb9a41620654fdd41, 503e95a6928baa23d264f0f4f184b9fc2e58eac6, 5d32b3ecfdc5d98c4266ffeb90bfa09ae7668ef1) toward building a unit testing / regression testing system!

Are these ready for us to use?

I created this Pull Request (https://github.com/LimnoTech/HSPsquared/pull/36) to merge the cumulative changes with your code.

TongZhai commented 3 years ago

So far, the tests are implemented for the conversion code, and they are currently functional for conversion testing. I put instructions in a commit in this thread. The system is intended to grow into a full suite of testing protocols.

aufdenkampe commented 2 years ago

PR #61 includes:

improved testing capabilities that expand to additional use cases, i.e.:
- https://github.com/LimnoTech/HSPsquared/pull/46
new RQUAL module code from:
- #58
substantially enhanced performance, as described in this comment:
- https://github.com/LimnoTech/HSPsquared/pull/46#issuecomment-929393070

aufdenkampe commented 2 years ago

Although this is a long-term goal, we have sufficiently established a performant and flexible testing system that this issue can be closed with release 0.9.3.

We'll open new issues for specific additional enhancements to the testing system.
