etiennebacher opened this issue 1 year ago

Hello, I have some survey data with a few tens of thousands of observations and several grouping variables. I'd like to compute the survey count for each combination of groups, but this is much slower than the "non-survey" count using dplyr. I understand that extra steps are involved and that survey is called under the hood, but it seems to me that this difference in timing comes from the way groups of data are passed to survey_total().

In the example below, dplyr::count() is near-instantaneous, but survey_count() takes more than a minute. Is this something that could be improved? Or maybe I missed something?

Thanks,

---

I think it's a good idea to look into ways to speed up grouped operations, if possible. But fundamentally, the srvyr code is doing many, many more calculations than the dplyr code. In srvyr, you're computing ~7,000 point estimates and then also computing estimated standard errors for those point estimates. Calculating the standard errors is the hard part that requires a lot of computation; getting the point estimates is easy.

Is there something you noticed inside the srvyr code that you think is making it especially slow?
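For reference, a minimal self-contained sketch of the kind of timing comparison discussed in this thread. The data sizes, column names, and weight column below are assumptions for illustration, not the poster's actual code; the grouping levels are chosen so that there are roughly 7,000 group combinations, matching the figure mentioned in the reply.

```r
library(dplyr)
library(srvyr)

set.seed(42)
n <- 50000  # a few tens of thousands of observations (assumed size)

# Simulated data with three grouping variables (~20 x 20 x 18 = 7,200 cells)
# and a sampling weight. All names here are hypothetical.
df <- tibble(
  g1 = sample(paste0("a", 1:20), n, replace = TRUE),
  g2 = sample(paste0("b", 1:20), n, replace = TRUE),
  g3 = sample(paste0("c", 1:18), n, replace = TRUE),
  wt = runif(n, 0.5, 2)
)

# Plain dplyr count: only point estimates (cell counts), near-instantaneous.
system.time(df %>% count(g1, g2, g3))

# Survey-weighted count: srvyr also computes an estimated standard error
# for every cell, which is where most of the time goes.
svy <- df %>% as_survey_design(weights = wt)
system.time(svy %>% survey_count(g1, g2, g3))
```

The asymmetry in the two timings is expected to some degree: dplyr::count() does one cheap aggregation per cell, while survey_count() additionally estimates a variance for each of the ~7,000 cells.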