melff / memisc

Tools for Managing Survey Data, Creating Tables of Estimates and Data Summaries
https://melff.github.io/memisc

Mysterious behaviour with factors and labels #18

Closed buhtz closed 7 years ago

buhtz commented 7 years ago

Please see this base-R example with a data.frame.

d <- data.frame(a = sample(1:100))
d$a_strat <- cut(d$a, breaks=seq(1,100, by=10)) # stratify by 10
e <- d[,c('a_strat')]

> str(d$a_strat)
 Factor w/ 9 levels "(1,11]","(11,21]",..: 2 6 1 8 6 9 5 3 NA 9 ...
> str(e)
 Factor w/ 9 levels "(1,11]","(11,21]",..: 2 6 1 8 6 9 5 3 NA 9 ...

As you can see, the labels for the levels are not lost. But when I do the same with a memisc data.set, they are lost.

d <- data.set(a = sample(1:100))
d$a_strat <- cut(d$a, breaks=seq(1,100, by=10))
e <- d[,c('a_strat')]

> str(d$a_strat)
 Factor w/ 9 levels "(1,11]","(11,21]",..: 4 9 3 1 NA 9 5 4 9 9 ...
> str(e)
Data set with 100 obs. of 1 variable:
 $ a_strat: Nmnl. item w/ 9 labels for 1,2,3,...  int  4 9 3 1 NA 9 5 4 9 9 ...

What is behind that behaviour?

melff commented 7 years ago

The labels are not lost; they simply do not show up individually in the output of str(), which only reports in summary that 9 labels are present. You can see the labels with codebook(e).

buhtz commented 7 years ago

That doesn't fix the problem. Please re-open, and don't close an issue before the opener has confirmed that it is fixed. It is impolite and a waste of resources.

The point is I need to use the labels for the ticks in some plots.

I can do this on the original 'data.set'.

> levels(d$a_strat)
[1] "(1,11]"  "(11,21]" "(21,31]" "(31,41]" "(41,51]" "(51,61]" "(61,71]"
[8] "(71,81]" "(81,91]"

But I cannot do this on the subset:

> levels(e)
NULL

You are right, I can see the labels in the output of codebook(). But how can I get them as a list?

melff commented 7 years ago

GitHub's issue reporting infrastructure is for bug reports, not for support questions. Therefore it is not impolite to close an issue that is a support question. Please respect that offering an open source R package is voluntary work and does not oblige those who contribute a package to offer support. Before you file issues, please make sure that you have read and understood the documentation of the package; otherwise you are wasting the time of those who offer open source software like 'memisc'.

If you had read the documentation thoroughly enough, you might have realized that "data.set" objects are meant to contain "item" objects. Those in turn are designed to serve a particular purpose, namely to facilitate the management of data generated from social science surveys. Items with value labels inside a "data.set" object will usually be transformed into factors if the "data.set" object that contains them is transformed into a data frame.
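The conversion described above can be sketched as follows. This is only a minimal illustration, not an official recipe from the package author. It assumes that as.data.frame() applied to the one-column "data.set" turns the labelled item into a factor, as stated above; the set.seed() call and the names e_df and tick_labels are additions for illustration and do not appear in the thread.

```r
library(memisc)

set.seed(1)  # assumption: added only to make the sample reproducible

d <- data.set(a = sample(1:100))
d$a_strat <- cut(d$a, breaks = seq(1, 100, by = 10))

# Subsetting with [ returns a "data.set" again, not a bare factor,
# which is why levels(e) is NULL.
e <- d[, c("a_strat")]

# Converting to a data frame should turn the labelled item into a factor.
e_df <- as.data.frame(e)

# The labels are then available as ordinary factor levels,
# e.g. for use as tick labels in a plot.
tick_labels <- levels(e_df$a_strat)
print(tick_labels)
```

If the conversion behaves as described, tick_labels is a plain character vector that can be passed to plotting functions in the same way as levels(d$a_strat).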