Closed aappling-usgs closed 6 years ago
Noooo!!! Scale Factor. Looking into it.
Agreed, I think scale factor is a good candidate for explaining this. Thanks for looking into it, @dblodgett-usgs .
Dave found it :+1:
https://github.com/USGS-R/protoloads/blob/master/2_munge/src/aggregate_inputs.R#L77 That causes the data to come in without the scale factor applied! Which corrected the old data but breaks the new data.
The "solid hack" for this week:
1) confirm that the scale factor in the .nc files is always 100
2) change aggregate_inputs code and the aggregated NWM output, BUT
3) not update models; instead apply a temporary divide-by-100 in gather_forecasts
so that the outputs collected in 3_forecast/out/preds_loadest.rds and 3_forecast/out/preds_wrtds.rds appear to be correct
raw_nwm_files <- c(
'1_data/out/nwm_ana.nc',
'1_data/out/nwm_retro.nc',
'1_data/out/nwm_med.nc',
'1_data/out/nwm_long1.nc',
'1_data/out/nwm_long2.nc',
'1_data/out/nwm_long3.nc',
'1_data/out/nwm_long4.nc')
vapply(setNames(nm=raw_nwm_files), function(raw_nwm_file) {
input_raw <- nc_open(raw_nwm_file)
scaleFact <- input_raw$var$streamflow$scaleFact
nc_close(input_raw)
return(scaleFact)
}, FUN.VALUE=numeric(1)) %>% {data_frame(file=names(.), scaleFact=.)}
# A tibble: 7 x 2
# file scaleFact
# <chr> <dbl>
# 1 1_data/out/nwm_ana.nc 0.01000
# 2 1_data/out/nwm_retro.nc 0.01000
# 3 1_data/out/nwm_med.nc 0.01000
# 4 1_data/out/nwm_long1.nc 0.01000
# 5 1_data/out/nwm_long2.nc 0.01000
# 6 1_data/out/nwm_long3.nc 0.01000
# 7 1_data/out/nwm_long4.nc 0.01000
OK, I think all we need to do from here (after my upcoming PR) is avoid running scmake('3_forecast','3_forecast.yml')
until we're ready to wait 30 hours.
Aggregated inputs are now correct and correctly derived from the nc files (using scaleFact as intended). Aggregated model outputs are correct but incorrectly derived (using the hack of dividing by 100 in gather_forecasts()
).
I just finished re-running all the models (about 30 hours); see #67 .
Now I'm rebuilding the old figures. I think there has been a 100x increase in the NWM forecast magnitudes from the last round of model outputs to this one. It could be happening anywhere in the pipeline, from NWM pull to aggregation to modeling, because we had to modify each of those steps a bit this week.
Forecasts plot from today (for WRTDS):
Forecasts plot from last time (also for WRTDS):
Inputs plot from today:
Inputs plot from last time:
Errors plot from today:
Errors plot from last time: