Closed pwildenhain closed 6 years ago
Hi Paul
This is a very good idea. I guess it could all be done using base R functions like cut(). It might take some thought and effort to make the period argument fool-proof to avoid problems when the x-variable is not a datetime object.
Another consideration is how far qicharts2 should go to help do actual data manipulation. I have a feeling that I have already stretched it a bit too far with automatic aggregation of subgroup data and "elegant" handling of missing values. This is clearly helpful for the everyday use of qicharts but may potentially cause problems when the user is not aware of what is going on under the hood. For example, if you have subgroups > 1, qic() automatically calculates the mean of each subgroup. But sometimes you want the sum, and if you are not on your toes and specify this using the agg.fun argument, you might not get what you want and you might not even notice. The purist approach would be to mandate the user to prepare and clean data before even considering putting them on an SPC chart.
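To make the mean-versus-sum concern concrete, here is a minimal base R sketch. It does not call qic(); the data frame and column names are made up for illustration, and tapply() merely stands in for whatever aggregation the charting function performs internally.

```r
# Hypothetical data: three subgroups (weeks) with four observations each
d <- data.frame(
  week  = rep(1:3, each = 4),
  count = c(2, 4, 6, 8, 1, 1, 1, 1, 5, 5, 10, 0)
)

# Aggregating each subgroup by its mean (the default behaviour described above)
subgroup_mean <- tapply(d$count, d$week, mean)

# Aggregating each subgroup by its sum (what you want for, say, total events per week)
subgroup_sum <- tapply(d$count, d$week, sum)

print(subgroup_mean)
print(subgroup_sum)
```

Weeks 1 and 3 have the same mean and the same sum here, but the values that end up on the chart differ by a factor of four, which is easy to miss if you did not choose the aggregation yourself.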
But again, I think this is a very good idea and I'll look into it.
Kind regards, Jacob
2018-05-08 3:53 GMT+02:00 pwildenhain notifications@github.com:
We recently released an internal R package for my Quality Improvement team that heavily relies on qic() for its SPC chart functionality. In doing this we added something that I wanted to pitch to you.
We work almost exclusively with longitudinal analyses. To save our team from messing with dplyr::mutate() and lubridate, we decided to utilize functions from the tibbletime package to abstract away the process of date manipulation. Here's a vignette https://business-science.github.io/tibbletime/articles/TT-04-use-with-dplyr.html that shows how well tibbletime works with dplyr.
The end product looks something like this:
qic(data = data, x = date_column, y = metric_column, n = n, period = "monthly")
Where the period argument does the date manipulation on x for you.
If you like this idea I'd be happy to dive more into the specifics and figure out if/how this best fits into your existing API.
-- Kind regards, Jacob Anhøj
I'm sure that's a challenging balance to achieve; how much to do in qic() as opposed to user pre-processing.
I think something that tibbletime offers over the base R approach (i.e. cut()) is the flexibility in defining your time period. For example, with their collapse_by() function you could input month, monthly, quarterly, 12 weeks, etc. to create the time period for aggregation. I can also understand that you might be hesitant to add a package dependency.
Based off the package API, I would propose period = NULL as the default, and then executing a helper function right after data frame prep, but before aggregation:
# Prepare data frame
d <- data.frame(x, y, n, notes, facets, cl, target)
d <- droplevels(d)
# Date manipulation (skipped when period is left at its NULL default)
if (!is.null(period)) {
  d <- date_helper(d, period)
}
# Aggregate data and perform analyses
d <- qic.agg(d, got.n, part, agg.fun, freeze, exclude,
             chart.fun, multiply, dots.only, chart, y.neg)
where date_helper() mutates x according to period (this is similar to how we handled this in our enterprise R package).
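For illustration only, a helper like this could be sketched in base R as follows. The name date_helper() comes from the proposal above, but the body is an assumption: it is neither the enterprise package's implementation nor anything in qicharts2. It uses cut() rather than tibbletime and refuses to run when x is not a date or date-time.

```r
# Hypothetical helper: collapse the x column of d into periods using base R.
# Assumes d has a column named x, as in the data frame prepared above.
date_helper <- function(d, period) {
  if (!inherits(d$x, c("Date", "POSIXt"))) {
    stop("`period` requires x to be a Date or date-time column")
  }
  # cut() labels each observation with the start of its period (as a factor)
  binned <- cut(d$x, breaks = period)
  # Convert the factor labels back to Date so x remains a date axis
  d$x <- as.Date(as.character(binned))
  d
}

# 90 consecutive days: all of January, February and March 2018
d <- data.frame(x = as.Date("2018-01-01") + 0:89, y = 1)
d_monthly <- date_helper(d, "month")
print(unique(d_monthly$x))
```

After the call, every row carries the first day of its month as x, so a later aggregation step would see three subgroups instead of ninety.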
Thanks for hearing me out, looking forward to your decision.
Thanks again. Will consider.
Please check the latest dev version.
Example:
d <- data.frame(x = seq(Sys.Date(), length.out = 365, by ='day'),
y = rnorm(365))
qic(x, y, data = d)
qic(x, y, data = d, x.period = 'week')
qic(x, y, data = d, x.period = '2 weeks')
qic(x, y, data = d, x.period = 'month')
qic(x, y, data = d, x.period = 'quarter')
Wow that was lightning fast! I had no idea that cut() was that flexible, that's awesome.
I installed and tested it on some of our data and it worked really well, thanks for adding this :1st_place_medal:
Thank you for the idea. Yes, cut() is really the Swiss knife of datetime manipulation. Only thing to remember is to convert the output back to datetime. For the same reason, I never really needed lubridate and its descendants.
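The conversion step mentioned here can be sketched in a few lines of base R. The dates are made up for illustration, and this shows only the general cut() pattern, not the code that went into the dev version.

```r
# Four hypothetical observation dates
x <- as.Date(c("2018-05-01", "2018-05-08", "2018-05-09", "2018-05-20"))

# cut() groups the dates by week but returns a factor, not a Date
weeks <- cut(x, breaks = "week")
print(class(weeks))

# Convert the factor labels back to Date before plotting or aggregating
week_start <- as.Date(as.character(weeks))
print(class(week_start))
print(week_start)
```

Going through as.character() matters: calling as.numeric() on the factor would return the internal level codes rather than the dates.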
Closing this. Keep the good ideas coming.