Closed by briatte 1 year ago
ESS now featured in Session 12 via a spatial viz example.
`lmer` to get predicted probabilities: https://github.com/halhen/viz-pub/tree/master/ess-political-expression

library(tidyverse)

# find all zipped ESS downloads below the working directory
z <- fs::dir_ls(glob = "*.zip", recurse = TRUE)
v <- tibble()
for (i in z) {
  cat(fs::path_file(i))
  # unzip to a temporary folder and locate the Stata file
  d <- unzip(i, exdir = tempdir())
  f <- str_subset(d, "dta$")
  cat(" ->", fs::path_file(f), "...\n")
  d <- haven::read_dta(f)
  # keep only the identification, design and weighting variables
  n <- names(d)
  n <- n[ n %in% c("essround", "cntry", "psu", "idno", "stratify", "stratum",
                   "dweight", "pspwght", "pweight", "prob", "anweight") ]
  v <- bind_rows(v, tibble(file = f, n))
}
# one row per file, one column per variable found
v %>%
  mutate(file = fs::path_file(file)) %>%
  pivot_wider(values_from = n, names_from = n) %>%
  mutate(essround = as.integer(str_extract(file, "\\d+"))) %>%
  arrange(essround)
# A tibble: 14 × 11
file essround idno cntry dweight pspwght pweight anweight prob stratum psu
<chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 ESS1e06_6.dta 1 idno cntry dweight pspwght pweight NA NA NA NA
2 ESS4AT.dta 4 idno cntry dweight pspwght pweight NA NA NA NA
3 ESS4LT.dta 4 idno cntry dweight pspwght pweight NA NA NA NA
4 ESS4e04_5.dta 4 idno cntry dweight pspwght pweight NA NA NA NA
5 ESS5ATe1_1.d… 5 idno cntry dweight pspwght pweight NA NA NA NA
6 ESS5e03_4.dta 5 idno cntry dweight pspwght pweight NA NA NA NA
7 ESS6e02_5.dta 6 idno cntry dweight pspwght pweight anweight NA NA NA
8 ESS7SDDFe1_2… 7 idno cntry NA NA NA NA prob stratum psu
9 ESS7e02_2.dta 7 idno cntry dweight pspwght pweight NA NA NA NA
10 ESS8SDDFe01_… 8 idno cntry NA NA NA NA prob stratum psu
11 ESS8e02_2.dta 8 idno cntry dweight pspwght pweight anweight NA NA NA
12 ESS9ROe01.dta 9 idno cntry dweight pspwght pweight anweight prob stratum psu
13 ESS9e03_1.dta 9 idno cntry dweight pspwght pweight anweight prob stratum psu
14 ESS10.dta 10 idno cntry dweight pspwght pweight anweight prob stratum psu
Did some more tests, found weird things: https://github.com/gergness/srvyr/issues/157
Best guess, based on weighting guide:
as_survey_design(ids = psu,
                 strata = c(cntry, stratum),
                 nest = TRUE,
                 weights = anweight)
More tests with other designs. Conclusions:
- `psu` and `stratum` for more accurate sampling error estimation
- `anweight` for same reason
- `psu + idno` is redundant with above
- `nest = TRUE` seems optional, but use it just in case

library(srvyr)
library(tidyverse)
ess9_extract <- readr::read_rds("https://f.briatte.org/temp/ess9_extract.rds")
# Andi Fugard's design
ess9_af1 <- ess9_extract %>%
  as_survey_design(ids = idno, strata = cntry, nest = TRUE,
                   weights = pspwght)
# Fugard, using PSU
ess9_af2 <- ess9_extract %>%
  as_survey_design(ids = psu, strata = cntry, nest = TRUE,
                   weights = pspwght)
# weighting guide + cntry
ess9_wg1 <- ess9_extract %>%
  as_survey_design(ids = psu,
                   strata = c(cntry, stratum), # adding cntry
                   nest = TRUE,
                   weights = anweight)
# weighting guide, no cntry
ess9_wg2 <- ess9_extract %>%
  as_survey_design(ids = psu,
                   strata = stratum, # as recommended
                   nest = TRUE,
                   weights = anweight)
# Vegetti's design -- implicit `ids = idno`
ess9_mv1 <- ess9_extract %>%
  as_survey_design(weights = c(dweight, pspwght))
# Vegetti, using PSU
ess9_mv2 <- ess9_extract %>%
  as_survey_design(ids = psu, weights = c(dweight, pspwght))
# Oberski's design -- implicit `nest = TRUE`
ess9_do <- ess9_extract %>%
  as_survey_design(ids = psu, strata = stratum, weights = prob)
# Stefan Zins' design
# https://github.com/ropensci/essurvey/issues/39#issuecomment-507855290
ess9_sz <- ess9_extract %>%
  as_survey_design(ids = psu, strata = stratum, weights = dweight)
# results -----------------------------------------------------------------
list("AF_idno" = ess9_af1, "AF_psu" = ess9_af2,
     "WG_cntry" = ess9_wg1, "WG_stratum" = ess9_wg2,
     "MV_idno" = ess9_mv1, "MV_psu" = ess9_mv2, "DO_psu" = ess9_do,
     "SZ_psu" = ess9_sz) %>%
  map_dfr(
    ~ .x %>%
      filter(cntry == "GB") %>%
      group_by(wltdffr_group) %>%
      summarise(prop = srvyr::survey_mean(vartype = "se")),
    .id = "design"
  ) %>%
  filter(wltdffr_group == "Fair") %>%
  arrange(-prop_se)
# A tibble: 8 × 4
design wltdffr_group prop prop_se
<chr> <fct> <dbl> <dbl>
1 MV_psu Fair 0.200 0.0204
2 MV_idno Fair 0.200 0.0166
3 WG_cntry Fair 0.196 0.0128
4 AF_psu Fair 0.196 0.0128
5 WG_stratum Fair 0.196 0.0125
6 SZ_psu Fair 0.190 0.0116
7 DO_psu Fair 0.191 0.0104
8 AF_idno Fair 0.196 0.0102
Availability of weighting vars:
- […] `anweight`, but `psu` and `stratum` have to be retrieved from individual SDDFs
- […] `anweight`, so even more work required…

So, use ESS 9 or 10 in examples, or use 7 or 8 for one more example of a merge.
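
For rounds where `psu` and `stratum` only ship in the SDDF files, the merge could look roughly like the sketch below. It assumes, as the variable table above suggests, that `cntry` and `idno` exist in both files, and that together they identify a respondent. The tibbles `ess_main` and `ess_sddf` and all their values are made up for illustration; they stand in for the real `.dta` files.

```r
library(dplyr)
library(tibble)

# toy stand-in for an integrated ESS file (hypothetical values)
ess_main <- tibble(
  cntry   = c("AT", "AT", "BE", "BE"),
  idno    = c(1, 2, 1, 2),
  dweight = c(0.9, 1.1, 1.0, 1.2),
  pspwght = c(0.8, 1.2, 1.1, 0.9),
  pweight = c(0.5, 0.5, 0.7, 0.7)
)

# toy stand-in for the matching SDDF file (hypothetical values)
ess_sddf <- tibble(
  cntry   = c("AT", "AT", "BE", "BE"),
  idno    = c(1, 2, 1, 2),
  psu     = c(10, 11, 20, 20),
  stratum = c(1, 1, 2, 2),
  prob    = c(0.001, 0.001, 0.002, 0.002)
)

# assumption: idno is only unique within a country, so join on both keys
ess_merged <- left_join(ess_main, ess_sddf, by = c("cntry", "idno"))

# respondents left without design information after the merge, if any
ess_unmatched <- filter(ess_merged, is.na(psu) | is.na(stratum))
```

Checking `ess_unmatched` before building a survey design seems worthwhile, since rows without `psu` or `stratum` would otherwise be carried into the design with missing values.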
This one is complex enough to be its own issue…
Weighting guide
https://www.europeansocialsurvey.org/methodology/ess_methodology/data_processing_archiving/weighting.html
https://www.europeansocialsurvey.org/docs/methodology/ESS_weighting_data_1_1.pdf
From the weighting guide, v1.1 (2020), page 7:
The guide asks for the creation of `anweight` ('analytical weights') from the following variables:

Once `anweight` exists, the weighting guide instructs the following design:

Details on analytical weights (ESS9+)
Quoting again from the weighting guide:
Full range of weighting variables, quoted from ESS9 codebook:
- `idno` - Respondent's identification number
- `cntry` - Country
- `dweight` - Design weight
- `pspwght` - Post-stratification weight including design weight
- `pweight` - Population size weight (must be combined with `dweight` or `pspwght`)
- `anweight` - Analysis weight
- `prob` - Sampling probability
- `stratum` - Sampling stratum
- `psu` - Primary sampling unit

Notes:
- `pspwght` includes `dweight`
- `anweight` is just the product of `pspwght` and `pweight`
- `prob`
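
On rounds that lack `anweight`, the note on its construction suggests it can be rebuilt by hand. A minimal sketch, assuming the product relationship stated in the notes holds; the object name `ess_toy` and its values are made up for illustration:

```r
library(dplyr)
library(tibble)

# toy data with the two component weights (hypothetical values)
ess_toy <- tibble(
  cntry   = c("AT", "BE", "GB"),
  pspwght = c(0.8, 1.1, 1.3),
  pweight = c(0.5, 0.7, 2.0)
)

# analysis weight = post-stratification weight x population size weight
ess_toy <- mutate(ess_toy, anweight = pspwght * pweight)
```

The resulting column can then be passed as `weights = anweight` to `as_survey_design()`, as in the designs tested earlier in this thread.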
Discussions
https://github.com/InductiveStep/R-notes/issues/1
https://github.com/ropensci/essurvey/issues/39
https://github.com/ropensci/essurvey/issues/9#issuecomment-502459202
Second link right above recommends the following for ESS4:
Example: Andi Fugard, ESS9
Intermediate Quantitative Social Research, Birkbeck, University of London (2017-2020) https://inductivestep.github.io/R-notes/complex-surveys.html
Working on a multi-country example:
From the text:
Example: Federico Vegetti, ESS7
Introduction to Survey Statistics, University of Heidelberg, 2018 https://federicovegetti.github.io/teaching/heidelberg_2018/lab/sst_lab_day2.html
When working on countries separately:
When working on all countries together:
Example: Daniel Oberski, ESS7
http://asdfree.com/european-social-survey-ess.html
Working on a single country (Belgium) after merging the data to the SDDF file: