ceff-tech / ffc_api_client

An R client for the online Functional Flows Calculator API
https://ceff-tech.github.io/ffc_api_client
9 stars 3 forks source link

returning warning and message or NA instead of "less than 10 years of data...try again" #35

Open ryanpeek opened 4 years ago

ryanpeek commented 4 years ago

evaluate_gage_alteration() (and associated functions) currently return an error with the message "Less than 10 years of data...try again" if the gage has insufficient data. This is helpful, but also can be an issue for batch processing as it is a stop error and thus requires the user to wrap code in some sort of error function (i.e., I'm using possibly(get_percentiles, NA_real_) but there's others too like safely(), try()). If possible it would be preferable to return the warning message in user console, but return an actual NA value to the user environment. Not best way to do this but can look.

nickrsan commented 4 years ago

I think my preferred response to this would be a parameter that:

  1. defaults to a stop error (or equivalent error level) still,
  2. but can be switched to a warning + fill NA or even a warning + run data anyway

That way, people new to the package/process won't send data through and either think it's fine or just broken, but others can easily switch the behavior. Would something like this work for the workflow you're using? If not, I'm fine to adjust it, just wanted to note what I'm thinking here.

nickrsan commented 4 years ago

Also, I'd be fine with a PR that switches the current behavior to warn + fill as the default for now, then we could leave this issue open until I'm able to get the other behaviors in place (and I'd still use the PR code for warn and fill). Just wanted to not stand in your way if this is a big thing limiting your ability to use evaluate_gage_alteration.

ryanpeek commented 4 years ago

I don't know...I also like the default to stop error...since it's important to know about and should be the default. That's partly why I'm not sure it's as big a deal to fix this function in particular if I can get batch processing to work with the other functions that go into this one. I sort of feel like ultimately I'd rather have the flexibility of batching a bunch of gages or comids and pulling just the percentiles, or just the metrics as nice dataframes, than have something all lumped into one and then needs to be split back out again (from a batch perspective). Let's talk some more about this.

nickrsan commented 4 years ago

OK, I think the FFCProcessor object could help a lot with this because it will save state, so it will know what has been run/what hasn't and be able to run only what's needed to provide results, even if you go to just retrieve a specific item. If you instantiated an FFCProcessor object for each gage, you'd then be able to call something like get_percentiles, and it will run everything it needs to and just return the percentiles to you. Agree on discussing a bit more on what's useful. Thanks!

ryanpeek commented 4 years ago

@nickrsan Sorry to squish this in here...it's a bit broader and maybe worth a longer conversation:

In some senses it would be nice to be able to use these three functions and get back a dataframe for each (some of these already do this):

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.

Secondary functions are nice but may not be primary to analysis:

nickrsan commented 4 years ago

This feels fine for where to discuss! Sounds like you're suggesting a set of new shortcuts that handle specific portions of the workflow, but give you spots to intervene before moving to the next? That'd be fine by me - maybe we make a new shortcuts.R function that contains code that supports workflows rather than doing data manipulation itself?

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.

Just to confirm, you're only referring to this workflow in a batch processing context, right? Because I think we already have better methods for people not batching. I agree that evaluate_alteration should probably change functionality from handling everything to specifically saying whether something is altered or not, and we'd provide another function that does the shortcut that handles everything for people.

ryanpeek commented 4 years ago

Yup, I think that makes sense. Maybe we specifically can write a batch_ffc.R script that would take demonstrate how to step through and return these pieces as dataframes (easier to bind lots together if pulling same cols/colnames). Still need to make a few tweaks to make that possible, but makes sense to me.

On Wed, Jan 15, 2020 at 4:12 PM Nick Santos notifications@github.com wrote:

This feels fine for where to discuss! Sounds like you're suggesting a set of new shortcuts that handle specific portions of the workflow, but give you spots to intervene before moving to the next? That'd be fine by me - maybe we make a new shortcuts.R function that contains code that supports workflows rather than doing data manipulation itself?

Currently we have to return the list, then convert to a dataframe, then use that to calc percentiles. From a batch processing standpoint, that puts more onus on the user. Which is fine if we show them the code to do these things.

Just to confirm, you're only referring to this workflow in a batch processing context, right? Because I think we already have better methods for people not batching. I agree that evaluate_alteration should probably change functionality from handling everything to specifically saying whether something is altered or not, and we'd provide another function that does the shortcut that handles everything for people.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/ceff-tech/ffc_api_client/issues/35?email_source=notifications&email_token=ABMX3MFBYOQHI6LY6PPM4CLQ56Q5NA5CNFSM4KHLNAO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJCIVXA#issuecomment-574917340, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABMX3MFQGLOYGXA7N2HUIZDQ56Q5NANCNFSM4KHLNAOQ .

--

"When we try to pick out anything by itself, we find it hitched to everything else in the universe."John Muir (My First Summer in the Sierra, 1911) ----------------------------------------------------- Ryan Peek, PhD (he/him) Aquatic Ecologist, Post-Doctoral Researcher Center for Watershed Sciences, UC Davis ryanpeek.github.io http://ryanpeek.github.io @riverpeek 530.754.5351

-----------------------------------------------------