dekdekdek1 / BPS1214

BPS Code
0 stars 0 forks source link

Weighted #2

Open dekdekdek1 opened 6 years ago

dekdekdek1 commented 6 years ago

@Huade I believe I've finished this as well, I was having trouble because of Haven but my code was correct. I also changed the recode function from the unweighted portion so that I could use the recoded data here as well. I think this works, let me know if it's good!

Huade commented 6 years ago

This is great to hear, David.

Huade commented 6 years ago

Reopen this issue since SE does not match. It seems repweights were specified incorrectly.

It's a tricky case to include wta001-wta200 while exclude wta000. Try:

  1. Generate all rep weights we want: sprintf("wta%03d",seq(1,200,1))
  2. Get a data.frame with only these rep weights
  3. Use this data.frame as repweights instead of regex.
dekdekdek1 commented 6 years ago

Okay I can do that, I used the code that I did because that was the code that the guide you had sent me used

dekdekdek1 commented 6 years ago

I'm having some trouble with this new code. Can you explain to me how you calculated that my SE was wrong

Huade commented 6 years ago

Hi David, I calculated with both Stata and R and yield a different result. If you print srd, you will see that the current code only has 3 replication weights. The right number of replication weights is 200.

Huade commented 6 years ago

Here are my results. You can close this issue if matched:

> as.data.frame(svymean(~ gender_rc, srd))
                     mean          SE
gender_rcFemale 0.5580725 0.006001332
gender_rcMale   0.4419275 0.006001332
dekdekdek1 commented 6 years ago

I got .0055 for the SE, but otherwise it matched. Do you think this is a rounding difference or a code difference?

Huade commented 6 years ago

Here is my code:

library(survey)
library("readstata13")
library(dplyr)
library(car)

bps1214_raw <- read.dta13("bps1214.dta")
bps1214 <- bps1214_raw %>%
  mutate(gender_rc = recode(gender, "'Male' = 'Male'; 'Female' = 'Female'; else = NA"))

# Unweighted Percentages
prop.table(table(bps1214$gender_rc))

# Weighted Percentages
repweights <- bps1214[,sprintf("wta%03d",seq(1,200,1))]
srd <- svrepdesign(data = bps1214, weights = ~wta000,
                   type = "BRR", 
                   repweights = repweights, mse = T)
as.data.frame(svymean(~ gender_rc, srd))