gergness / srvyr

R package to add 'dplyr'-like Syntax for Summary Statistics of Survey Data

Feature request: add subpopulation to printing of survey object #163

Open szimmer opened 1 year ago

szimmer commented 1 year ago

Let's say I create two survey objects, dstrata and dstrata_filt, where dstrata_filt is created by using filter() to define a subpopulation. The two objects print identically, so nothing in the output shows that dstrata_filt is restricted to a subpopulation. It would be nice if the printed output included some indicator that filtering has been applied.

library(srvyr)
#> 
#> Attaching package: 'srvyr'
#> The following object is masked from 'package:stats':
#> 
#>     filter

data(api, package="survey")

dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

dstrata
#> Stratified Independent Sampling design (with replacement)
#> Called via srvyr
#> Sampling variables:
#>  - ids: `1`
#>  - strata: stype
#>  - weights: pw
#> Data variables: cds (chr), stype (fct), name (chr), sname (chr), snum (dbl),
#>   dname (chr), dnum (int), cname (chr), cnum (int), flag (int), pcttest (int),
#>   api00 (int), api99 (int), target (int), growth (int), sch.wide (fct),
#>   comp.imp (fct), both (fct), awards (fct), meals (int), ell (int), yr.rnd
#>   (fct), mobility (int), acs.k3 (int), acs.46 (int), acs.core (int), pct.resp
#>   (int), not.hsg (int), hsg (int), some.col (int), col.grad (int), grad.sch
#>   (int), avg.ed (dbl), full (int), emer (int), enroll (int), api.stu (int), pw
#>   (dbl), fpc (dbl)

dstrata_filt <- dstrata %>%
  filter(stype == "E")
dstrata_filt
#> Stratified Independent Sampling design (with replacement)
#> Called via srvyr
#> Sampling variables:
#>  - ids: `1`
#>  - strata: stype
#>  - weights: pw
#> Data variables: cds (chr), stype (fct), name (chr), sname (chr), snum (dbl),
#>   dname (chr), dnum (int), cname (chr), cnum (int), flag (int), pcttest (int),
#>   api00 (int), api99 (int), target (int), growth (int), sch.wide (fct),
#>   comp.imp (fct), both (fct), awards (fct), meals (int), ell (int), yr.rnd
#>   (fct), mobility (int), acs.k3 (int), acs.46 (int), acs.core (int), pct.resp
#>   (int), not.hsg (int), hsg (int), some.col (int), col.grad (int), grad.sch
#>   (int), avg.ed (dbl), full (int), emer (int), enroll (int), api.stu (int), pw
#>   (dbl), fpc (dbl)

Created on 2023-07-15 with reprex v2.0.2

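For some ideas, a related extract from a SUDAAN example is below. Note the lines that report the subpopulation: "Observations in subpopulation" and "For Subpopulation: SRSEX = 1".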

proc regress data=temp1 filetype=sas design = jackknife;
weight rakedw0;
jackwgts rakedw1--rakedw80 / adjjack=1;
model ae13 = ae14 racehpra;
subpopn srsex = 1;
subgroup racehpra;
levels 4;
run;

Number of observations read : 55428 Weighted count: 23847415
Observations in subpopulation : 23002 Weighted count: 11631728
Observations used in the analysis : 3744 Weighted count: 2522055
Denominator degrees of freedom : 80

Maximum number of estimable parameters for the model is 5
Weighted mean response is 3.133033

Multiple R-Square for the dependent variable AE13: 0.231226
Variance Estimation Method: Replicate Weight Jackknife
Working Correlations: Independent
Link Function: Identity
Response variable AE13: Number of drinks on the days drinking alcohol
For Subpopulation: SRSEX = 1