USGS-R / protoloads

Prototyping and exploring options for broad-scale load forecasting

fig: error split by high/low flow or season #56

Closed by aappling-usgs 6 years ago

aappling-usgs commented 6 years ago

img_20180510_162139395

jzwart commented 6 years ago

could group by flow and model range: image

or just group by flow: image

aappling-usgs commented 6 years ago

that's a really interesting result! so high flow fluxes are super biased until you get down to very short lead times, huh? and also biased for low flows but less so and with higher frequency of accurate estimates. also interesting that it's a low bias - seems like it's common for load models to overestimate rather than underestimate loads at high flows, isn't it, @ldecicco-USGS ?

I like the grouping just by flow b/c it tells a very similar story but more digestibly - i think that's just right for a poster.

do these plots vary a lot if you break them out by site? i'm wondering how much of this would generalize to a large number of sites versus being something one could understand by understanding historical bias/error at a specific site.

jzwart commented 6 years ago

this is just one site. If you split out by site it's a different story depending on site (I don't know why site names aren't plotting, but they're in the same order as in the error_v_leadtime plot). Also I should mention that here high flow means > median flow and low flow means < median flow: image

slightly different if high flow > 75th percentile and low flow < 25th percentile: image. Differences between these graphs could be driven by relative flux error: the magnitude of the flux error doesn't change a whole lot across flow regimes, but observed flux does, so the same absolute error is a much larger fraction of the flux at low flow. Will have to look into this more and also make a decision on what counts as high vs. low flow.
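To make the two flow splits and the relative-error point concrete, here is a minimal sketch in Python (stdlib only). It is not the code behind these plots: the function names `classify_flow` and `mean_relative_error`, the "inclusive" percentile method, and all the numbers are assumptions made up for illustration.

```python
from statistics import mean, quantiles


def classify_flow(flows, low_pct=25, high_pct=75):
    """Label each flow 'low', 'medium', or 'high' using percentile cutoffs.

    Setting low_pct == high_pct == 50 gives the median split
    (a value exactly equal to the median would be labeled 'medium').
    """
    # quantiles(..., n=100) returns 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = quantiles(flows, n=100, method="inclusive")
    low_cut, high_cut = cuts[low_pct - 1], cuts[high_pct - 1]
    labels = []
    for q in flows:
        if q < low_cut:
            labels.append("low")
        elif q > high_cut:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


def mean_relative_error(observed, predicted, labels):
    """Mean of |predicted - observed| / observed within each flow class."""
    groups = {}
    for obs, pred, lab in zip(observed, predicted, labels):
        groups.setdefault(lab, []).append(abs(pred - obs) / obs)
    return {lab: mean(errs) for lab, errs in groups.items()}


# Made-up data: flux scales with flow, predictions have a constant low bias.
flows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
observed_flux = [10.0 * q for q in flows]
predicted_flux = [f - 5.0 for f in observed_flux]  # same absolute error everywhere

labels_quartile = classify_flow(flows)                        # <25th, 25-75th, >75th
labels_median = classify_flow(flows, low_pct=50, high_pct=50)  # below vs. above median

rel_err = mean_relative_error(observed_flux, predicted_flux, labels_quartile)
print(rel_err)
```

With a constant absolute error, the low-flow class ends up with the largest mean relative error purely because the observed flux in the denominator is smaller, which is one way the median and quartile versions of the plot could differ without the underlying error changing.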

aappling-usgs commented 6 years ago

oh, cool. not so consistent a story, but still cool =). i think it's useful that the 3rd panel shows that even 0 lag can lead to not-the-best predictions, and that the 2nd panel shows really really high relative error.

we're getting pretty good separation either way, but i think i prefer the second version in this most recent comment (lower quartile and upper quartile only). could maybe add in 25-75 as a middle bar... or maybe it's great just as it is.

jzwart commented 6 years ago

with medium flow added: image

aappling-usgs commented 6 years ago

Hmm - more complex, but I think I do like this most recent version best. You agree?

jzwart commented 6 years ago

Yeah, I think I do too. What about the number of lead times? Is 4 enough? Or we could go with a few more: image

aappling-usgs commented 6 years ago

I think 4 captures it really nicely, especially now that you've shared the more detailed version to confirm that there aren't additional major patterns hidden in there.