Closed alyssaobester closed 1 year ago
Ryan and I discussed this yesterday, copying discussion here:
Ryan Peek: Would this be a pretty quick fix? I can probably take a deeper look this week, but thought you might have a quick thought. I asked for Alyssa to file an issue so we can track this.
Nick Santos: Yeah - basically, right now, it just runs a deduplicate operation on the table, then checks if it has fewer records than before it deduplicated the table and throws the error (but doesn't check which record was kept) Is NFD Natural Flows Database?
I think I don't know enough here to suggest a default behavior either, since it only occurs on some items, but I think all that would need to be done is to add a parameter to the function that pulls the metrics (and maybe to a few upstream functions) that sets how it chooses between multiple options (probably just a keyword that it checks whichever column has this info). It's possible that we could deprecate the offline version and just use the TNC API, and then have that flag to switch between them otherwise, there might be some edge cases to document but if we deprecate the old code, we'll probably want to remove a few flags as well :slightly_smiling_face: the code is in the utils_flow_metrics.R module
Ryan 4:58 PM great thanks…sort of what I figured. Seems folks want to have all the pieces…but I see your point about edgecases. I think at this point whatever is easiest and most concise to maintain/export would be my choice.
Nick Santos 5:00 PM yeah, I think the simplest thing is that if this is just a bugfix, change the deduplication behavior and be done with it. If there are cases where someone might want to choose which record to use, then the chain of functions there would need a parameter added
As an update, I think Ryan may have fixed this in #74 , but it never got merged and I'm checking with him on that now.
Merged!
When you use the package to pull flow metrics for a non-reference gage (i.e., the purple reaches on the NFD), the function that pulls the metrics removes any inferred (modeled) flow metric values and retains observed values. This means that instead of showing the modeled natural ranges, it’s actually providing the altered flow metric ranges. The package does give you a warning message when this occurs (see below), but I was hoping we could either change the wording of this warning to be more clear or change the code so that the inferred numbers are pulled rather than the observed.
_In get_predicted_flow_metrics_online(comid, wyt = wyt, fill_na_p10 = fill_na_p10) : Flow metric data from API contained duplicated records for some flow metrics that we automatically removed. This is a data quality issue in the predicted data - it can occasionally produce incorrect results - check the values of the predicted flow metrics at [https://flows.codefornature.org]__
This isn’t an issue if you pull metrics from the NFD. When you download them directly from the NFD, you are provided with both the inferred and the observed numbers.