USGS-R / regional-hydrologic-forcings-ml

Repo for machine learning models for regional prediction of hydrologic forcing functions. Includes probabilistic seasonal high flow regions for CONUS, and prediction of high flow metrics for selected regions.
Creative Commons Zero v1.0 Universal

Plot the time period spanned by each of our gages #133

Closed: jds485 closed this issue 2 years ago

jds485 commented 2 years ago

Plot the time period spanned by each of our gages to see if we're currently biased toward a particular time period. I'm envisioning a graphic like the coverage plot in the targets-3 training (3_visualize step).
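For reference, the bookkeeping behind a coverage plot like this can be sketched in a few lines. This is a minimal illustration in Python rather than the repo's own pipeline code; the `(gage_id, date)` record layout, the function names, and the toy gage IDs are all hypothetical.

```python
from collections import Counter
from datetime import date


def gage_spans(records):
    """Return {gage_id: (first_date, last_date)} from (gage_id, date) pairs."""
    spans = {}
    for gage_id, day in records:
        first, last = spans.get(gage_id, (day, day))
        spans[gage_id] = (min(first, day), max(last, day))
    return spans


def gages_per_year(records):
    """Count distinct gages with at least one observation in each year."""
    seen = {(gage_id, day.year) for gage_id, day in records}
    return dict(sorted(Counter(year for _, year in seen).items()))


# Hypothetical toy records; gage IDs and dates are made up for illustration.
records = [
    ("A", date(1975, 5, 1)), ("A", date(1984, 9, 30)),
    ("B", date(1984, 1, 1)), ("B", date(1990, 12, 31)),
]
spans = gage_spans(records)
counts = gages_per_year(records)
```

Plotting each span as a horizontal segment per gage, or `counts` as a line through time, would show whether the record is concentrated in a particular period.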

jds485 commented 2 years ago

@slevin75 do you want to make this plot?

slevin75 commented 2 years ago

OK, this is what I have so far. There is a huge drop-off in gages in the mid-1980s, which is very interesting. Since we have so many gages, should I group them in some way, such as by state or cluster ID?

image image

slevin75 commented 2 years ago

OK - how about this?

image image

slevin75 commented 2 years ago

Hey, I re-opened this issue because I can't remember if we resolved what we want to do with the estimated data. I remade the plots using only the estimated data that are lower than the median. There are 1444 gages using this rule. I've put the plots below so you can compare them to the plots above, which exclude all estimated data.

I am not sure if we decided to keep all of the estimated data or just the low estimated data. If there was a decision on that, let me know and I can put in a PR for this.

image
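The filtering rule described above can be sketched as follows. This is only an illustration under stated assumptions: the thread does not say which median is meant, so the sketch assumes the median of the gage's full flow record, and the function name, the boolean flag layout, and the `keep_all_estimated` switch are hypothetical.

```python
from statistics import median


def filter_estimated(flows, is_estimated, keep_all_estimated=False):
    """Keep every observed flow; keep an estimated flow only if it is below
    the median of the full record (assumed definition of "the median"),
    unless keep_all_estimated is True."""
    threshold = median(flows)
    kept = []
    for flow, estimated in zip(flows, is_estimated):
        if not estimated or keep_all_estimated or flow < threshold:
            kept.append(flow)
    return kept


# Hypothetical toy record: three of the five values are flagged as estimated.
flows = [1.0, 2.0, 3.0, 4.0, 5.0]
flags = [True, False, False, True, True]

low_only = filter_estimated(flows, flags)
all_estimated = filter_estimated(flows, flags, keep_all_estimated=True)
```

Comparing the two results for each gage shows how much record is lost by discarding estimated values above the median.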

jds485 commented 2 years ago

Thanks! Is this plot showing colors only for complete years? Or is this still by day?

slevin75 commented 2 years ago

This plot shows complete years only.
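One way to make the "complete year" flag concrete is sketched below. It assumes calendar years and treats a year as complete only when every day has a value; the thread does not state whether calendar or water years were used, and the function name and toy dates are hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def complete_years(days):
    """Return the sorted calendar years in which every day is present."""
    per_year = defaultdict(set)
    for day in days:
        per_year[day.year].add(day)
    return sorted(
        year
        for year, present in per_year.items()
        if len(present) == (366 if calendar.isleap(year) else 365)
    )


# Hypothetical record: all of 1984 (a leap year), and 1985 missing January 1.
days = [date(1984, 1, 1) + timedelta(days=i) for i in range(366)]
days += [date(1985, 1, 2) + timedelta(days=i) for i in range(364)]

full_years = complete_years(days)
```

A by-day plot would instead color each date in `days` directly, which is why it shows partial years that this binary flag hides.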

jds485 commented 2 years ago

Okay, do you also have plots when all estimated data are included? There are plots in this comment that are by day instead of a binary complete year or not. Those plots suggest that we gain quite a lot more data from including estimated data above the median, so I'd suggest that we include all estimated data.

slevin75 commented 2 years ago

I don't have these new plots using all the estimated data, but yes, we will gain a ton of data later in the period. OK, I will make those changes and put in a PR. It looks like Tallgrass is very busy today and I am taking off early, so it might not get done until Monday.

jds485 commented 2 years ago

I might be misremembering, but didn't you already make a PR in which all estimated data were included, and save the results to a separate Tallgrass directory?

slevin75 commented 2 years ago

Actually, I don't think I had a PR for it yet. I made a new branch and have been running it in a separate Tallgrass directory, but I don't think I ever put in a PR for it because we were still trying to figure out whether to use all the estimated data or just the values under the median. If I did put in a PR, I can't seem to find it, lol.

jds485 commented 2 years ago

Okay, sounds good! If you put that branch into a PR, I can probably review it later today. I think you already have the computation complete, so it might just be that a data transfer is needed from your fork to main.