Closed beamgau closed 9 years ago
The error is in cbind(x, as_factor(x)). It creates a matrix with x and the equivalent of as.integer(as_factor(x)). When you transform a labelled vector to a factor, the original codes are lost (in R, you can't control the underlying integer codes associated with a label in a factor).
If I check your x vector, the first value is 1021. According to your definitions, the associated label is "Mistel (Viscum album) (Interesse geäußert)".
If I run as_factor(x)[1], I get [1] Mistel (Viscum album) (Interesse geäußert), which is correct: the appropriate value label has been assigned to the appropriate factor level.
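To illustrate the point about lost codes, here is a minimal sketch in base R only. The values and label names are hypothetical, chosen for illustration, and the factor is built by hand rather than with as_factor():

```r
# Hypothetical codes and value labels, for illustration only
codes  <- c(1021, 1030, 1021)
labels <- c("first label" = 1021, "second label" = 1030)

# Build a factor whose levels carry the label names
f <- factor(codes, levels = unname(labels), labels = names(labels))

# The integer codes are positions in the level set, not the original values
as.integer(f)    # 1 2 1

# The labels themselves are preserved
as.character(f)

# cbind() coerces the factor to its integer codes, so the two columns differ
cbind(codes, f)
```

Using data.frame(codes, f) instead of cbind() keeps the factor as a factor, so the labels are shown next to the original values.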
Hm, doesn't work for me. I get the wrong labels. Seems to have something to do with 1/0?
library(haven)
x = structure(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1),
labels = structure(c(0,1), .Names = c("Nein", "Ja")), class = "labelled")
> x
<Labelled>
[1] 0 0 0 0 0 0 1 0 0 0 1
Labels:
Nein Ja
0 1
But as_factor codes 0 as Ja and 1 as NA:
data.frame(x, as_factor(x))
x as_factor.x.
1 0 Ja
2 0 Ja
3 0 Ja
4 0 Ja
5 0 Ja
6 0 Ja
7 1 <NA>
8 0 Ja
9 0 Ja
10 0 Ja
11 1 <NA>
str(x)
Class 'labelled' atomic [1:11] 0 0 0 0 0 0 1 0 0 0 ...
..- attr(*, "labels")= Named num [1:2] 0 1
.. ..- attr(*, "names")= chr [1:2] "Nein" "Ja"
sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] lubridate_1.3.3 tableone_0.7.1 colorRamps_2.3
[4] RColorBrewer_1.1-2 ggplot2_1.0.1 reshape_0.8.5
[7] pander_0.5.2 plyr_1.8.3 zoo_1.7-12
[10] haven_0.2.0 knitr_1.10.5
loaded via a namespace (and not attached):
[1] Rcpp_0.12.0 survey_3.30-3 magrittr_1.5 MASS_7.3-43
[5] munsell_0.4.2 colorspace_1.2-6 lattice_0.20-33 stringr_1.0.0
[9] tools_3.2.2 gtable_0.1.2 e1071_1.6-7 htmltools_0.2.6
[13] class_7.3-13 yaml_2.1.13 digest_0.6.8 formatR_1.2
[17] reshape2_1.4.1 codetools_0.2-14 evaluate_0.7.2 memoise_0.2.1
[21] rmarkdown_0.7 labeling_0.3 stringi_0.5-5 scales_0.2.5
[25] proto_0.3-10
I have the same issue in a Lubuntu VM. Can you reproduce this behaviour?
Thanks for looking into this!
It seems that you are using the stable version of haven. For your tests, you should use the latest GitHub version.
devtools::install_github("hadley/haven")
Secondly, the structure of labelled vectors has changed to take missing values into account.
It would be better to create your labelled vector with the dedicated labelled() function, i.e.
x <- labelled(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1), c(Nein = 0, Ja = 1))
Do you still have this issue with the latest dev version?
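A quick way to check the result is sketched below. It assumes a recent haven release is installed, and the column names in the data frame are made up for this example:

```r
library(haven)

# Create the labelled vector with the constructor rather than structure()
x <- labelled(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1), c(Nein = 0, Ja = 1))

# Convert to a factor; levels are taken from the value labels
f <- as_factor(x)
levels(f)

# Compare raw values and labels side by side.
# zap_labels() strips the labels and returns the plain numeric values.
data.frame(value = zap_labels(x), label = f)
```

If the conversion works as expected, every 0 is paired with Nein and every 1 with Ja.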
Ah, thanks, it works with the dev version!
structure(c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1),
labels = structure(c(0,1), .Names = c("Nein", "Ja")), class = "labelled")
was just what I got from dput().
Thanks again! Felix
Hi,
I have a labelled column after an SPSS import, but as_factor seems to be somewhat broken, see the example below. x and as_factor(x) seem to be totally different, and I have no idea why. It would be great if you could look into this.
Thanks, Felix
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] haven_0.2.0
loaded via a namespace (and not attached):
[1] tools_3.2.1 Rcpp_0.12.0