annual comparison notes

The "annual" comparison for alteration needs some changes. Here is the discussion with Ryan:

Ryan 2:01 PM Hey Nick, just now trying to play around with getting the annual FFM alteration status to go. I have a question…because it’s currently unclear what data should be entered, and how. In the help, under the annual argument, the help states it needs a TRUE/FALSE “indicating whether to run a year over year analysis. If TRUE, then the parameter percentiles changes and should be a data frame with only two columns - the first is still metric, but the second is just value representing the current year’s value for the metric”. However, the dataframe returned by evaluate_gage_alteration has columns that are: Year, each metric… 2:02

# A tibble: 6 x 39
  Year  DS_Tim_Julian DS_Dur_WS DS_Tim `__summer_no_fl…
  <chr>         <dbl>     <dbl>  <dbl>            <dbl>
1 1951            254        96    345               75
2 1952            164       151    255              151
3 1953            199       155    290              155

2:03 I can transpose or reconfigure this to have metric and value as stated, but we still need a year column, correct? Ideally, if we feed a dataframe with year, metric, and value, we could return the alteration dataframe exactly as is, but it would then include a year column for each respective year for that gage/comid. 2:06 this isn’t urgent, but I realized I’m not sure how to get this to run in its current form:

library(dplyr)  # provides %>% and distinct(), used below

tst <- ffcAPIClient::evaluate_gage_alteration(gage_id = 10255810,
                                              token = ffctoken,
                                              comid = 22595619)
annual_tst <- ffcAPIClient::assess_alteration(
  percentiles = tst$ffc_results, # this is the part that needs clarifying...
  predictions = tst$predicted_percentiles,
  ffc_values = tst$ffc_results, 
  comid = tst$alteration %>% distinct(comid) %>% as.integer(),
  annual = TRUE)
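
The reshaping Ryan mentions (one row per year and metric, with a value column) might look roughly like this in base R. The data frame is a hand-typed stand-in for the first rows and columns of the tibble above, not output from ffcAPIClient, and the name `ffc_long` is made up for illustration.

```r
# Hypothetical stand-in for the per-year FFC results
# (values copied from the tibble printed above).
ffc_results <- data.frame(
  Year          = c("1951", "1952", "1953"),
  DS_Tim_Julian = c(254, 164, 199),
  DS_Dur_WS     = c(96, 151, 155),
  DS_Tim        = c(345, 255, 290),
  stringsAsFactors = FALSE
)

# Every column except Year is treated as a metric.
metric_cols <- setdiff(names(ffc_results), "Year")

# Stack the metric columns into one row per Year/metric pair.
# unlist() concatenates the columns in order, which matches rep(..., each = ).
ffc_long <- data.frame(
  Year   = rep(ffc_results$Year, times = length(metric_cols)),
  metric = rep(metric_cols, each = nrow(ffc_results)),
  value  = unlist(ffc_results[metric_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
```

Keeping `Year` in the long form means any per-metric result computed from it can still be traced back to its year.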

Nick Santos 2:28 PM Hey Ryan, sorry, had to eat something or my brain was going to break 2:29 So, I think this is mostly a function of me, for some reason, being unable to understand what's needed completely. You ever have those scenarios where you talk about something with someone, and it makes perfect sense while they're describing it, you take notes, and then the moment the conversation is over, some critical part of the structure is missing? That's happening to me here (not sure why) - The thing you're describing to me seems simple, and I keep getting it slightly wrong 2:29 So, it sounds like you want to take the FFC results DF and run it through rather than the percentiles DF 2:29 is that correct? 2:31 Almost like that example should become:

# A tibble: 6 x 39
  Year  DS_Tim_Julian DS_Dur_WS DS_Tim `__summer_no_fl…
  <chr>         <lgl>     <lgl>  <lgl>            <lgl>
1 1951          FALSE      TRUE   FALSE            TRUE
2 1952          FALSE     FALSE    TRUE           FALSE
3 1953          FALSE     FALSE    TRUE           FALSE

for whether it's altered, based on how each value fits into each metric's predicted percentiles?

Ryan 4:03 PM yes….but instead of TRUE/FALSE perhaps just using the -1/0/1 codes? but that’s exactly it. Feed the FFC results DF (so metrics that are calculated for each year of data) to the predicted percentiles and see if they fall inside the 20/80 percentile range. 4:03 and no worries…this isn’t easy to figure out all the details for on the fly! I really appreciate all you’ve been doing.
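
The check described here, one yearly value against a predicted range, could be sketched as follows in base R. This is not how `assess_alteration` is implemented. The bounds, the column names `lower` and `upper`, and the reading of the codes (-1 below the range, 0 inside, 1 above) are assumptions made for illustration.

```r
# Hypothetical annual values in long form (year, metric, value).
ffc_long <- data.frame(
  Year   = c("1951", "1952", "1953"),
  metric = "DS_Dur_WS",
  value  = c(96, 151, 155),
  stringsAsFactors = FALSE
)

# Hypothetical predicted bounds for the same metric. The numbers and the
# column names `lower` and `upper` are invented for this sketch.
predicted <- data.frame(
  metric = "DS_Dur_WS",
  lower  = 100,
  upper  = 152,
  stringsAsFactors = FALSE
)

# Attach the bounds to every yearly value of the matching metric.
annual <- merge(ffc_long, predicted, by = "metric")

# Assumed coding: -1 = below the range, 0 = inside it, 1 = above it.
# A metric that could not be calculated (NA) stays NA.
annual$status_code <- ifelse(annual$value < annual$lower, -1L,
                      ifelse(annual$value > annual$upper, 1L, 0L))
```

The result keeps one row per year and metric, so it can be filtered to particular years before or after the comparison.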

Nick Santos 4:09 PM OK, and just to clarify, theoretically, there wouldn't be many years of data - that's why you'd be doing the annual, right? It's because there's too little data for reliable percentiles? 4:09 And yes on using the codes instead - forgot that's what it's doing now 4:10 I should be able to make that tweak without too much trouble

Ryan 4:14 PM there could be many years…actually hopefully there will be…although I suppose I could filter to just the years I want first. 4:16 but by comparing flow year to same biological sampling year, or flow year to a lagged sample year, we may be able to take a look at things like drought impacts, etc. In particular, which metrics are responsive to this annual view and which are not. There will be lots of noise for sure, and for some it may not be possible to calculate, but this is a fairly narrow use case (I hope). 4:17 basically taking one single metric calculated in one single year and seeing if it falls in the predicted percentiles, but dataframe style
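
Matching a flow year to a biological sampling year, with an optional lag, might be sketched like this in base R. All of the data, the `samples` table, and the names `sample_year` and `lag_years` are hypothetical. `Year` is stored as an integer here, whereas the FFC results above print it as character, so a real version would need `as.integer()` first.

```r
# Hypothetical per-year status codes for one metric.
status <- data.frame(
  Year        = c(1951L, 1952L, 1953L),
  metric      = "DS_Dur_WS",
  status_code = c(-1L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical biological samples and the year each was collected.
samples <- data.frame(
  sample_id   = c("A", "B"),
  sample_year = c(1952L, 1953L),
  stringsAsFactors = FALSE
)

# Lag in years: 0 compares a sample to the same flow year,
# 1 compares it to the flow year before the sample.
lag_years <- 1L
samples$Year <- samples$sample_year - lag_years

# Keep only the flow years that line up with a (lagged) sample year.
matched <- merge(samples, status, by = "Year")
```

Setting `lag_years` to 0 gives the same-year comparison; samples with no matching flow year simply drop out of `matched`.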

Nick Santos 4:17 PM ok, gotcha - so realistically, you're probably more likely to be replacing predicted percentiles with a reference year, and it isn't always about the amount of data 4:17 (I know you've explained this to me, so sorry!)

Ryan 4:18 PM not sure what you mean by replacing with year….? 4:18 also probably doing bad job explaining. :slightly_smiling_face:

Nick Santos 4:19 PM Sorry, with a reference year's values, rather than actual predicted percentile values? 4:19 Nah, definitely not you - I'm tying my brain in knots, and I swear it's made sense each time you've explained it, then it just....flies away

Ryan 4:23 PM ah yes I see…yea I think the function is pretty much there, but we only need to feed it:

annual_tst <- ffcAPIClient::assess_alteration(
  predictions =  "predicted_percentiles",
  ffc_values = "ffc_results", 
  comid = "comid",
  annual = TRUE)
