Closed ccolonescu closed 5 years ago
I do not understand the second example:
Round8 <- import_rounds(8)
recode_missings(Round8)
data.frame(tail(attr(Round8$edulvlb, "labels"), 5))
In this example, line 2 is not doing anything. The example is strictly equivalent to the first one.
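A likely reason line 2 has no effect, assuming recode_missings() behaves like an ordinary R function and returns a modified copy instead of changing its argument in place: the result is never assigned. The sketch below shows the same pattern with a hypothetical base-R helper, recode_to_na(), which is not part of essurvey and only stands in for the real function.

```r
# Hypothetical helper (not from essurvey): returns a copy of x with the
# given codes replaced by NA. The caller's object is left untouched.
recode_to_na <- function(x, codes) {
  x[x %in% codes] <- NA
  x
}

scores <- c(1, 777, 3)

# Result is computed and then discarded, so `scores` is unchanged.
recode_to_na(scores, 777)
before <- anyNA(scores)

# Assigning the result is what makes the recoding stick.
scores <- recode_to_na(scores, 777)
after <- anyNA(scores)
```

Under that assumption, the second example would need `Round8 <- recode_missings(Round8)` for the recoding to be visible in the third line.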
In the third example, the reason for the disappearing labels is there:
… at line 103.
It is indeed a good question why the labels need to be removed, on top of recoding the missing values to NA, which is what the function claims it does.
It seems to me the function could work without that line at all.
Maybe @cimentadaj can help more, as he coded the function.
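To illustrate the point that recoding and label removal are separable, here is a minimal base-R sketch. It is not the essurvey code: the vector `edu`, its label set, and the helper recode_keep_labels() are all made up for illustration. It only shows that replacing values with NA through subassignment leaves a "labels" attribute intact.

```r
# Made-up labelled-style vector: plain numeric with a "labels" attribute.
edu <- structure(
  c(1, 777, 3, 888),
  labels = c("Low" = 1, "High" = 3, "Refusal" = 777, "Don't know" = 888)
)

# Hypothetical helper: recode the given codes to NA and nothing else.
recode_keep_labels <- function(x, codes) {
  x[x %in% codes] <- NA  # `[<-` keeps the attributes of x
  x
}

recoded <- recode_keep_labels(edu, c(777, 888))

is.na(recoded)            # which positions were recoded
attr(recoded, "labels")   # the labels are still attached
```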
Thanks for the feedback! This is indeed a design choice. To be honest, I can't clearly remember why I removed the labels. Having said that, I'm not sure whether the recode_missings function is still useful.
From what I've seen, the ESS is now automatically recoding these values into the standard Stata missing values (such as .a, .b, etc.), and the latest haven version supports these and automatically reads them in as missing values. See for example:
library(essurvey)
set_email("cimentadaj@gmail.com")
tst <- import_country("Spain", 1)
#> Downloading ESS1
head(tst$edulvla, 200)
#> <Labelled double>: Highest level of education
#> [1] 1 3 2 2 1 2 1 5 3 3 5
#> [12] 2 1 1 1 1 1 1 1 1 1 2
#> [23] 2 1 1 1 1 1 1 3 4 3 5
#> [34] 1 3 3 3 1 1 2 3 2 2 3
#> [45] 3 1 3 1 5 1 1 1 1 3 1
#> [56] 1 1 1 1 1 1 5 3 1 5 2
#> [67] 3 3 1 3 1 1 2 3 1 1 3
#> [78] 3 1 4 5 1 3 3 3 1 1 2
#> [89] 2 1 1 3 2 3 5 3 5 3 1
#> [100] 2 5 1 5 1 1 1 3 5 3 5
#> [111] 2 1 4 1 3 1 2 1 3 1 2
#> [122] 1 1 3 2 2 1 1 5 4 4 1
#> [133] 1 3 1 1 3 2 1 3 3 1 1
#> [144] 1 1 3 1 1 1 1 5 1 2 2
#> [155] 1 3 1 1 2 1 1 1 2 1 1
#> [166] 1 2 1 5 1 3 3 3 2 2 4
#> [177] 2 2 4 3 3 2 3 1 2 NA(b) 1
#> [188] 2 3 1 2 2 5 1 2 3 1 3
#> [199] 1 3
#>
#> Labels:
#> value label
#> 0 Not possible to harmonise into 5-level ISCED
#> 1 Less than lower secondary education (ISCED 0-1)
#> 2 Lower secondary education completed (ISCED 2)
#> 3 Upper secondary education completed (ISCED 3)
#> 4 Post-secondary non-tertiary education completed (ISCED 4)
#> 5 Tertiary education completed (ISCED 5-6)
#> 55 Other
#> NA(b) Refusal
#> NA(c) Don't know
#> NA(d) No answer
These are now coded as NA(b), etc. instead of the old 777, etc. values. To make this clear, I've added a minimum version requirement for the haven package in the DESCRIPTION and described this in the documentation of recode_missings.
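For readers unfamiliar with values like NA(b), the following is a small sketch of haven's tagged missing values on a hand-made vector. It assumes haven is installed; the data and the label names are invented here and do not come from the ESS files.

```r
library(haven)

# Hand-made vector with two tagged missing values, "b" and "c".
x <- c(1, 3, tagged_na("b"), 2, tagged_na("c"))

is.na(x)               # tagged NAs behave as ordinary missing values
na_tag(x)              # the tag is still recoverable
mean(x, na.rm = TRUE)  # and they are dropped like any other NA

# Labels can be attached to the tags, as in the ESS output above.
lab <- labelled(
  x,
  labels = c("Refusal" = tagged_na("b"), "Don't know" = tagged_na("c"))
)
print_tagged_na(x)     # shows the values as NA(b), NA(c)
```

The design point is that the reason for missingness survives in the tag and its label, while every standard NA-aware function still treats the value as missing.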
I would expect that the following lines do not recode missingness due to, for instance, 'Refusal'. Instead, I see NA for 'Refusal'.
tail.attr.Round8.edulvlb...labels....5.
The following code behaves as expected:
tail.attr.Round8.edulvlb...labels....5.
The following unexpectedly loses the missingness categories "Refusal", "Don't know", and "No answer" altogether. I would expect the output to be the same as above.
ISCED 6, doctoral degree 800 Other 5555
Packages ----------------------------------------------------
package * version date lib source
essurvey * 1.0.2 2018-08-23 [1] CRAN (R 3.5.2)