Open alexhallam opened 6 years ago
Hi Alex, thanks for the suggestion, let me take a look and see what we could do via the package.
Actually, I have another package called modeldb that is able to fit models inside databases, and it also respects grouped data:
library(tidyverse)
library(gapminder)
library(modeldb)
gapminder %>%
  group_by(country) %>%
  select(lifeExp, year) %>%
  linear_regression_db(lifeExp)
#> Adding missing grouping variables: `country`
#> # A tibble: 142 x 3
#> country `(Intercept)` year
#> <fct> <dbl> <dbl>
#> 1 Afghanistan -508. 0.275
#> 2 Albania -594. 0.335
#> 3 Algeria -1068. 0.569
#> 4 Angola -377. 0.209
#> 5 Argentina -390. 0.232
#> 6 Australia -376. 0.228
#> 7 Austria -406. 0.242
#> 8 Bahrain -860. 0.468
#> 9 Bangladesh -936. 0.498
#> 10 Belgium -340. 0.209
#> # ... with 132 more rows
This package works with tidypredict, but at this time only with un-grouped data. If modeldb fits the models that you ultimately need, maybe we should focus on adding grouped model capability in tidypredict, but based on the output from modeldb.
Thanks!
Does this translate to SQL text?
Not yet, that's the portion I need to code, but wanted first to find out if it would be useful for anyone.
I use grouped regressions often (aka Many Models). Your package makes it mostly painless to create grouped-model SQL code, but it would be nice to have a convenience function for this.
Hopefully the code below illustrates what I am thinking.
The first few entries are shown below. I am sure there are many ways to implement this in a cleaner fashion than what I did.
Would it be possible to add some convenience function to do some lifting for building case statements needed for grouped regression?
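To make the request concrete, here is a minimal sketch of what such a convenience function might look like, written in base R only. It is not part of tidypredict or modeldb: the name grouped_lm_sql and its arguments are hypothetical, the identifiers are quoted ANSI-style with double quotes (an assumption that may not suit every database), and the coefficients are the rounded values printed in the modeldb output above, so predictions from this SQL would only be approximate.

```r
# Per-country coefficients, copied as rounded in the modeldb output above.
coefs <- data.frame(
  country   = c("Afghanistan", "Albania", "Algeria"),
  intercept = c(-508, -594, -1068),
  year      = c(0.275, 0.335, 0.569),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not a tidypredict or modeldb function: turns one row of
# coefficients per group into a single SQL CASE expression.
grouped_lm_sql <- function(coefs, group_col, intercept_col, term_cols) {
  branches <- vapply(seq_len(nrow(coefs)), function(i) {
    # Double any single quotes so the label is a safe SQL string literal.
    label <- gsub("'", "''", as.character(coefs[[group_col]][i]), fixed = TRUE)
    # One "(coefficient * column)" product per predictor.
    products <- vapply(term_cols, function(term) {
      sprintf("(%s * \"%s\")", format(coefs[[term]][i], digits = 15), term)
    }, character(1))
    # Intercept plus the products gives the fitted value for this group.
    fit <- paste(c(format(coefs[[intercept_col]][i], digits = 15), products),
                 collapse = " + ")
    sprintf("WHEN \"%s\" = '%s' THEN %s", group_col, label, fit)
  }, character(1))
  paste0("CASE\n", paste0("  ", branches, collapse = "\n"), "\nEND")
}

# Print the generated expression: one WHEN branch per country.
cat(grouped_lm_sql(coefs, "country", "intercept", "year"), "\n")
```

The resulting string could be pasted into a SELECT list as a computed column. Taking a plain coefficient table as input means the helper could in principle sit on top of whatever modeldb returns for grouped data, though how that would fit into tidypredict is still an open question.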