fsolt closed this issue 1 year ago
I have some questions that came up while double-checking survey items.
Some notes while revising:
- [x] 1. wvs4_swe -> I don't find evidence that these questions were only asked in Sweden (https://www.worldvaluessurvey.org/WVSDocumentationWV4.jsp). The "WVS4 Results by country" document shows that v208 and v76 were asked in other countries as well. We may want to change the dataset name to wvs4..?
- [x] 2. wvs_combo neigh2's variable is confusing... I am not sure how to check this variable.
- [ ] 3. I figured out how to access the Pew archive data, but this also raises additional questions/suggestions:
- [x] 1) For "uspew1999_nii", I only found the codebook (https://www.pewresearch.org/wp-content/uploads/sites/4/legacy-pdf/54.pdf). Couldn't figure out the location/link of the actual data.
- [x] 2) Cannot figure out which specific dataset is related to "uspew2005_news" from (https://www.pewresearch.org/politics/datasets/?_yr=2005).
- [x] 3) For "uspew2006_nii", Pew has separate monthly data. Some months include gay rights related questions, while others do not (e.g., March and June have a gay marriage question). I edited and added the survey name in the 'surveys_gm' file. I will make the relevant changes in the 'surveys_data' file.
- [x] 4) For "uspew2008_rel", the download button is disabled (https://www.pewresearch.org/politics/dataset/august-2008-religion-survey/).
- [x] 5) For "uspew2010_rel", can't figure out the specific data from (https://www.pewresearch.org/politics/datasets/?_yr=2010).
- [x] 6) For "uspew2011_typo", can't find the related question (https://www.pewresearch.org/politics/dataset/2011-political-typology-callback-survey/).
@hey-ikon
Some notes while revising:
- [x] 1. wvs4_swe -> I don't find evidence that these questions were only asked in Sweden (https://www.worldvaluessurvey.org/WVSDocumentationWV4.jsp). The "WVS4 Results by country" document shows that v208 and v76 were asked in other countries as well. We may want to change the dataset name to wvs4..?
wvs4_swe exists because this particular country-year survey was left out of the wvs_combo file, apparently by mistake since it is still included in the updated wave 4 file.
- [x] 2. wvs_combo neigh2's variable is confusing... I am not sure how to check this variable.
The stem is "On this list are various groups of people. Could you please mention any that you would not like to have as neighbors? Homosexuals [also asked: Immigrants/foreign workers]" so a "mentioned" response means that R wouldn't want members of the named group as neighbors (i.e., is expressing homophobia or xenophobia) and "not mentioned" means R is at least okay with that group in this context. Does that resolve the confusion?
- [ ] 3. I figured out how to access the Pew archive data, but this also raises additional questions/suggestions:
- [x] 1) For "uspew1999_nii", I only found the codebook (https://www.pewresearch.org/wp-content/uploads/sites/4/legacy-pdf/54.pdf). Couldn't figure out the location/link of actual data.
The data are here: https://www.pewresearch.org/politics/dataset/september-1999-news-interest-index/ but tbh I think we should use Roper when possible because a. they seem to value backwards compatibility more and don't mess with their website as much as Pew, and b. they have an SPSS version while Pew only has ASCII. The link for Roper is https://ropercenter.cornell.edu/ipoll/study/31095707
- [x] 2) Cannot figure out which specific dataset is related to "uspew2005_news" from (https://www.pewresearch.org/politics/datasets/?_yr=2005).
It's the July News Interest Index, so it really should be called uspew2005_07nii for consistency. Here it is at Roper: https://ropercenter.cornell.edu/ipoll/study/31095851
- [x] 3) For "uspew2006_nii", Pew has separate monthly data. Some months include gay rights related questions, while others do not (e.g., March and June have a gay marriage question). I edited and added the survey name in the 'surveys_gm' file. I will make the relevant changes in the 'surveys_data' file.
Perfect. I guess you see the existing 'uspew2006_nii' was the March one; Roper has it here: https://ropercenter.cornell.edu/ipoll/study/31095871 . The June one is here: https://ropercenter.cornell.edu/ipoll/study/31095873
- [x] 4) For "uspew2008_rel", download button is disabled (https://www.pewresearch.org/politics/dataset/august-2008-religion-survey/).
Roper: https://ropercenter.cornell.edu/ipoll/study/31095930 -- please rename as uspew2008_08rel
- [x] 5) For "uspew2010_rel", can't figure out specific data from (https://www.pewresearch.org/politics/datasets/?_yr=2010).
Roper: https://ropercenter.cornell.edu/ipoll/study/31096004 -- please rename as uspew2010_07rel
- [x] 6) For "uspew2011_typo", can't find related question (https://www.pewresearch.org/politics/dataset/2011-political-typology-callback-survey/).
It's the original survey, not the callback. Roper: https://ropercenter.cornell.edu/ipoll/study/31096056
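The renames requested above could be applied in one pass, roughly as sketched here. This is only an illustration in Python (the project itself seems to work in R), and the column name `survey` and the example rows are assumptions, not the actual layout of surveys_gm.csv.

```python
# Old-to-new survey names, taken from the replies above.
RENAMES = {
    "uspew2005_news": "uspew2005_07nii",
    "uspew2008_rel": "uspew2008_08rel",
    "uspew2010_rel": "uspew2010_07rel",
}


def rename_surveys(rows, column="survey"):
    """Return copies of the rows with old survey names replaced.

    `column` is an assumed column name, not confirmed by the thread.
    """
    return [{**row, column: RENAMES.get(row[column], row[column])} for row in rows]


# Hypothetical rows for illustration only.
example_rows = [
    {"survey": "uspew2008_rel", "item": "example_item"},
    {"survey": "wvs_combo", "item": "neigh2"},
]
renamed = rename_surveys(example_rows)
```

Names that are not in the mapping pass through unchanged, and the input rows are not modified, so the result can be compared against the original before anything is written back.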
Fred @fsolt, in commit 5f656a8f1e202bd9dd39dfa8b822112878becd88, here's the list of surveys whose data I cannot access. I'd really appreciate it if you could share the data and codebook files for the missing ones! Thanks!
@sammo3182
I've moved all these into the dropbox. I've left duplicate listings unchecked in case they actually mean something that I haven't figured out. If you can't get into the dropbox or there are other issues, please let me know.
@Tyhcass
- [x] dkes2005, dkes2007, cannot find raw data from original website
- [x] gallup_uk1981, gallup_uk1985, gallup_uk1986, original data is removed from the website
I've added these to the dropbox dcpo_datasets/dataset_requests
@Tyhcass
- [x] bsa2017, option includes (Depends/varies) 6
- [x] cnn199301, cnn199406, cnn199810,cnn200309, option includes "not sure"
- [x] eb691, eb712, eb774, eb834, option includes "indifferent"
We never include answers that respondents volunteered but weren't offered.
@Tyhcass
- [ ] gallup_us198606, gallup_us198703: question numbers should be q10a and q05c. For gallup_us198206, gallup_us198606, gallup_us198703, need to double-check Weight since the mean of Weight is not 1.
Moved to https://github.com/fsolt/DCPOtools/issues/59
That issue is now closed, but the weight variables should still be double-checked
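A quick way to double-check a weight variable is to compute its mean and flag it when it departs from 1. The sketch below is in Python for illustration; the function name and the tolerance are assumptions, not anything from DCPOtools.

```python
import math


def weight_summary(weights, tolerance=0.01):
    """Return (mean, ok), where ok is True if the mean is within tolerance of 1.

    The tolerance is an arbitrary illustrative choice.
    """
    values = [float(w) for w in weights if w is not None]
    if not values:
        raise ValueError("no weights supplied")
    mean = math.fsum(values) / len(values)
    return mean, abs(mean - 1.0) <= tolerance


# Made-up weights for illustration only.
mean_ok, ok = weight_summary([0.5, 1.0, 1.5])
mean_bad, bad_ok = weight_summary([100, 200, 300])
```

A mean far from 1 does not by itself say what is wrong; it might be a scaling issue or the wrong column, so the codebook still needs to be consulted.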
Next step: check questions by item
- Check whether question_text is similar (think: could it be the same question translated differently?) and so should not be split.
- accept2 and accept2a, for example: the accept2 question is "Some people feel that homosexuality is a lifestyle that should be accepted by society. Other feel that homosexuality is a lifestyle that should be discouraged by society. Which comes closer to your viewpoint, the first position or the second?" while accept2a is "Do you feel that homosexuality should be considered as an acceptable alternative life-style or not?"
- cbsnyt199302 has "Do you feel that homosexuality should be considered as an acceptable alternative lifestyle or not?" which was originally coded as accept2 but really should be accept2a because it is only missing the hyphen.
- OTOH, these two items really might be best grouped together...

*OTOH: on the other hand; wrt: with regard to; fwiw: for what it's worth
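For spotting question texts that differ only in hyphenation or punctuation, something like the following might help as a first pass. It is a sketch in Python using the standard library's `difflib`; the normalization rules are assumptions chosen for this example, and judging whether two differently worded questions belong together still has to be done by hand.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop hyphens, and collapse punctuation and whitespace."""
    text = text.lower().replace("-", "")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def similarity(a, b):
    """Similarity ratio between 0 and 1 on the normalized texts."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Question texts quoted in the thread above.
accept2 = (
    "Some people feel that homosexuality is a lifestyle that should be accepted "
    "by society. Other feel that homosexuality is a lifestyle that should be "
    "discouraged by society. Which comes closer to your viewpoint, the first "
    "position or the second?"
)
accept2a = (
    "Do you feel that homosexuality should be considered as an acceptable "
    "alternative life-style or not?"
)
cbsnyt199302 = (
    "Do you feel that homosexuality should be considered as an acceptable "
    "alternative lifestyle or not?"
)

same_after_normalizing = normalize(accept2a) == normalize(cbsnyt199302)
score_between_items = similarity(accept2, accept2a)
```

Here the cbsnyt199302 text normalizes to exactly the accept2a text, while accept2 and accept2a remain clearly different strings.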
- @byngdeuk: marry
- @hey-ikon: accept and adopt
- @Tyhcass: approve to free (alphabetically)
- @sammo3182: hioff to strength5 (alphabetically, not including marry)
Finishing the first step :)
I'm done with rows 180-269!
Regarding "marry":
All classifications look reasonable to me! I will double-check after all the questionnaires are updated. For instance, "bsa2007, marry5, c(5:1)" is not updated yet!
The exception is "abcwapo201402, marry41, c(4:1)", "Do you or does anyone in your house own a gun, or not?" I think this question is not related to gay rights :)!
Ha! I don't know where that text comes from, but abcwapo201402 (that is, USABCWASH2014-1159 in Roper) has "Overall, do you support or oppose allowing gays and lesbians to marry legally" for q15. I've entered the correct text in https://github.com/fsolt/dcpo_gayrights/commit/782430ae1475021c9dc3f55cccc7a1a2e26cea87.
Re aes1987, aes2007, anes_combo, anpas1979 (@byngdeuk: "The initial response_categories was (3,1), being different from the codebook"), I confirmed on the survey questionnaires (that is, the documents the survey-takers used to give the questions) that the third responses were volunteered. We never use volunteered categories (for better or worse), so these are just two-category questions: https://github.com/fsolt/dcpo_gayrights/commit/acc7708817ba6d796d7968f9ebec9ccb8c5816dc. I suspect that this is the same issue that @hey-ikon ran into with that streak of cnn surveys, but I haven't gone through those yet.
Check Questions by Item
General question
Notes on accept
Notes on adopt
Other updates
Finished checking hioff to strength5 (alphabetically, not including marry), two issues:
run10 instead of marry?
I didn't see any text or grouping problems from item approve to free on my end.
https://github.com/fsolt/dcpo_gayrights/blob/master/data-raw/surveys_gm.csv
done :)
I think we are now at the double-check stage. From my perspective, I couldn't find any errors.
for @fsolt: 1. arbitrate zero in agree, 2. triple check surveys that @Tyhcass can't find
Okay, per our discussion this morning, our next step is to divide up data-raw/surveys_gm.csv and double-check each entry. This means you will have to:
- check that variable is correct (named as it appears in the dataset, not necessarily the codebook--sometimes they are different!--and in all lowercase),
- check that values are correctly entered (from least tolerant response to most tolerant response),
- enter question_text as it appears in the codebook,
- enter response_categories as they appear in the codebook (highest and lowest is okay for Likert scales; see gender_roles for examples),
- set agreed to 1 if you saw no problems with variable and values, 0 if there was a problem,
- check the same survey in surveys_immigration.csv before moving on to the next survey (next week, I'll assign any leftover surveys--surveys that only include immigration items--after we do all the surveys that appear in both files)

@sammo3182: rows 1-89
@Tyhcass: rows 90-179
@byngdeuk: rows 180-269
@hey-ikon: rows 270-359
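Some of the checks in this list are mechanical and could be pre-screened before the manual pass. The sketch below is in Python for illustration (the project itself seems to work in R). It assumes each row is a dict keyed by the column names mentioned above, with values written as an R-style vector such as c(5:1). Whether the values really run from least to most tolerant cannot be checked this way and still needs the codebook.

```python
import re


def parse_values(spec):
    """Expand an R-style vector such as "c(5:1)" or "c(3,1)" into a list of ints."""
    match = re.fullmatch(r"\s*c\((.*)\)\s*", spec)
    if not match:
        raise ValueError(f"unrecognized values spec: {spec!r}")
    out = []
    for part in match.group(1).split(","):
        part = part.strip()
        if ":" in part:
            start, end = (int(p) for p in part.split(":"))
            step = 1 if end >= start else -1
            out.extend(range(start, end + step, step))
        else:
            out.append(int(part))
    return out


def check_row(row):
    """Return a list of mechanical problems found in one row (empty if none)."""
    problems = []
    if row["variable"] != row["variable"].lower():
        problems.append("variable is not all lowercase")
    try:
        values = parse_values(row["values"])
        if len(set(values)) != len(values):
            problems.append("values contains duplicates")
    except ValueError as err:
        problems.append(str(err))
    if not row.get("question_text", "").strip():
        problems.append("question_text is empty")
    if str(row.get("agreed", "")) not in {"0", "1"}:
        problems.append("agreed must be 0 or 1")
    return problems


# Example row built from details mentioned in this thread; the layout is assumed.
good_row = {
    "survey": "abcwapo201402",
    "variable": "q15",
    "values": "c(4:1)",
    "question_text": "Overall, do you support or oppose allowing gays and lesbians to marry legally",
    "agreed": "1",
}
bad_row = {**good_row, "variable": "Q15", "agreed": ""}

good_problems = check_row(good_row)
bad_problems = check_row(bad_row)
```

An empty list only means that nothing mechanical is wrong with the row; the agreed column still reflects a human check against the dataset and codebook.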
If you're comfortable with RStudio and GitHub, please go ahead and commit your updates to the file--just remember to pull before you push! (But don't worry, I'll fix it if you forget.) If you haven't gotten comfortable yet, or you have any problem, save your changes to a new file (e.g., surveys_gm_hey-ikon.csv) and push that, then I'll combine them as I've done with your DCPOtools contributions.

Things to do in next meeting after this is finished:
agree
item
codings