respec / BASINS

BASINS source code development repository. For official releases see EPA:
https://www.epa.gov/exposure-assessment-models/basins

[HSPEXP+] In all calculations, HSPEXP+ should account for missing observations #28

Closed: timcera closed this issue 6 years ago

timcera commented 6 years ago

As mentioned in https://github.com/respec/BASINS/issues/24, HSPEXP+ is crashing at or near the end of printing the plots and statistics for one of the calibration gauges. Looking at the gauge calibration results calculated by HSPEXP+, it is obvious that some of the statistics are calculated without regard to missing calibration data.

See zipped up results, which indicate a very large volume error in 'ExpertSysStats-03060101-rch7.txt' and in '03060101.RCH7advice.txt', yet other statistics are calculated correctly (for example, 'AnnualFlowStats-RCH7.txt', 'MonthlyAverageFlowStats-RCH7.txt', and 'DailyMonthlyFlowStats-RCH7.txt').

Most of the stats in 'ExpertSysStats-03060101-rch7.txt' are correct, with total volume being the stand-out incorrect value. There may be others that are also incorrect; I didn't separately recalculate.

The files 'AnnualFlowStats-RCH7.txt', 'MonthlyAverageFlowStats-RCH7.txt', and 'DailyMonthlyFlowStats-RCH7.txt' indicate at the top that the end date was adjusted. There is no similar statement at the top of 'ExpertSysStats-03060101-rch7.txt'; applying the same adjustment there could solve most of the problem.

Philosophically, I don't fill in observation data. I realize that I am in a minority, but tough. The tools that I use for calibration, mainly TSPROC, do not have a problem with that. I think the ideal situation would be to work appropriately with observation data that might have missing values. I realize the complexity that might be needed to do that, so I would gladly settle for automatic adjustment of the analysis start/end dates to match the observed data.

Reports_201805241049.zip

mishranurag commented 6 years ago

Tim, could you please try adding analysis dates in the EXS file and let me know if that solves the problem?

Thanks ~A

timcera commented 6 years ago

Here is my current '03060101-RCH7.exs' file, unedited and exactly as WinHSPF 3.1 created it.

03060101-RCH7.zip

I see I could adjust the dates in line 3; however, that is missing the point. The computer should do things for me, and if it can't, it should raise an error rather than present and use incorrect calculations. Either WinHSPF should create a better '*.exs' file or HSPEXP+ should uniformly adjust start and end dates.

My feature requests, from most to least desirable:

  1. HSPEXP+ would correctly handle all missing values in the observation record for all calculations.
  2. HSPEXP+ would adjust analysis start and end dates to match the observed record for all calculations. If there are missing observed values in the analysis period, HSPEXP+ would warn and/or error.
  3. WinHSPF would adjust calibration start and end dates to match the observed record when building the *.exs file.
  4. Whether or not 3 is implemented, HSPEXP+ should at a minimum warn or error if there are missing values in the observed data set. I consider this the minimum because incorrect statistics are currently reported and used to develop advice.
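
To illustrate options 1 and 4 together, here is a minimal Python sketch, not HSPEXP+ code. It assumes missing values are stored as `None` or NaN, and the function name `total_volume_error_pct` is hypothetical. It computes the total volume error only over time steps where both series have a value, and warns when any steps are skipped.

```python
import math
import warnings


def total_volume_error_pct(simulated, observed):
    """Percent error in total volume over time steps where both values exist.

    Assumption of this sketch: a missing value is None or NaN.
    """
    if len(simulated) != len(observed):
        raise ValueError("simulated and observed must have the same length")

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    # Keep only the time steps where both series have a usable value.
    pairs = [
        (s, o)
        for s, o in zip(simulated, observed)
        if not is_missing(s) and not is_missing(o)
    ]

    skipped = len(observed) - len(pairs)
    if skipped:
        # Option 4: never stay silent about gaps in the record.
        warnings.warn(f"{skipped} of {len(observed)} time steps skipped (missing data)")

    if not pairs:
        raise ValueError("no time steps with both simulated and observed values")

    sim_total = sum(s for s, _ in pairs)
    obs_total = sum(o for _, o in pairs)
    if obs_total == 0:
        raise ValueError("observed total volume is zero; percent error is undefined")

    return 100 * (sim_total - obs_total) / obs_total


simulated = [10, 12, 11, 9]
observed = [10, None, 10, float("nan")]
print(total_volume_error_pct(simulated, observed))  # 5.0, with a warning about 2 skipped steps
```

Only the first and third time steps are compared here (21 simulated against 20 observed), so the simulated volume on days without observations does not inflate the error.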

mishranurag commented 6 years ago

I agree with you Tim. I made changes to the code and implemented your most desirable option. I will try and make an executable and post it.

timcera commented 6 years ago

Wow! That is great! Thanks!

mishranurag commented 6 years ago

There was still an issue in limiting the expert calculation when observed data ended before the simulated data. I fixed that today and it will be available in the next version.
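
The kind of limiting described here can be sketched as clipping the analysis window to the overlap of the requested period and the observed record. This is an illustrative Python sketch under that assumption, not the actual HSPEXP+ fix; the function name `clip_analysis_window` and the dates are hypothetical.

```python
from datetime import date


def clip_analysis_window(analysis_start, analysis_end, obs_start, obs_end):
    """Return the part of the analysis period covered by the observed record."""
    start = max(analysis_start, obs_start)
    end = min(analysis_end, obs_end)
    if start > end:
        # Erroring is preferable to computing statistics on an empty window.
        raise ValueError("observed record does not overlap the analysis period")
    return start, end


# Simulation runs through 2010, but observations stop in September 2008.
start, end = clip_analysis_window(
    date(2000, 1, 1), date(2010, 12, 31),
    date(2001, 10, 1), date(2008, 9, 30),
)
print(start, end)  # 2001-10-01 2008-09-30
```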

I will keep this issue open, in case the results do not come out as expected.

mishranurag commented 6 years ago

The number of years calculated for expert statistics was rounded to an integer. This skewed calculations for very short calibration periods. This issue has been fixed for future releases.
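
A small Python sketch of why integer rounding matters for short periods. The thread does not say how HSPEXP+ computed or now computes the number of years, so the 365.25-day year, the inclusive day count, and the function names are all assumptions made for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed average year length for this sketch


def years_in_period(start, end):
    """Length of the period in fractional years, counting both end dates."""
    return ((end - start).days + 1) / DAYS_PER_YEAR


def mean_annual_volume(total_volume, start, end):
    """Average volume per year using the fractional period length."""
    return total_volume / years_in_period(start, end)


start, end = date(2005, 1, 1), date(2006, 6, 30)  # 546 days, about 1.5 years
total_volume = 300.0

years = years_in_period(start, end)
print(round(years, 3))             # fractional length of the period in years
print(total_volume / round(years)) # rounded to 1 year: 300.0 per year
print(mean_annual_volume(total_volume, start, end))  # noticeably lower with fractional years
```

With an 18-month period, rounding the length to 1 year overstates the mean annual volume by roughly half; over a multi-decade calibration the same rounding would barely register.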