ceopinio / CEOdata

CEOdata R package

changes in version 1.3 #10

Closed fred-udina closed 1 year ago

fred-udina commented 1 year ago

Hi, I've been using CEOdata with my students and I'm very happy with it. But now some students have loaded the new version 1.3.0 and I see some changes in the way data is read and stored. I made a comparison that can be seen in the attached files, but in summary, using version 1.2 I got:

str(barceo$P43B_Q_TORRA)

##  dbl+lbl [1:2000]  0, 98, 98,  7,  0, 98,  6,  3,  5,  6,  5, 99,  0, 98,  ...
##  @ label        : chr "Valoració: Quim Torra"
##  @ format.spss  : chr "F2.0"
##  @ display_width: int 14
##  @ labels       : Named num [1:13] 0 1 2 3 4 5 6 7 8 9 ...
##   ..- attr(*, "names")= chr [1:13] "0" "1" "2" "3" ...
valTorra <- as.numeric(barceo$P43B_Q_TORRA) # the values must be converted to numeric.
table(valTorra)
## valTorra
##   0   1   2   3   4   5   6   7   8   9  10  98  99 
## 411  58 111 149 149 268 160 173 135  69  75 102  45

while with version 1.3:

str(barceo$P43B_Q_TORRA)

##  Factor w/ 13 levels "0","1","2","3",..: 1 12 12 8 1 12 7 4 6 7 ...
##  - attr(*, "label")= chr "Valoració: Quim Torra"

valTorra <- as.numeric(barceo$P43B_Q_TORRA) # the values must be converted to numeric.
table(valTorra)

## valTorra
##   1   2   3   4   5   6   7   8   9  10  11  12  13 
## 411  58 111 149 149 268 160 173 135  69  75 102  45

It appears that 1.3 is coding the values wrongly: the scale should be 0-10, not 1-11.

Please help me: am I doing anything wrong, or is CEOdata 1.3 wrong? In both cases I use haven 2.5.1.

Seminari_2_CEO-1-2.pdf

Seminari_2_CEO-1-3.pdf

joelardiaca commented 1 year ago

Hello @fred-udina ,

It is true that the new version of the CEOdata package (1.3.0) changes the way data is read. Mainly, values are now imported as factors (previously this did not happen when just a single "reo" was imported). This can spare users the pre-processing of categorical variables, especially R users who are less familiar with labelled values but, as you pointed out, it can introduce errors when converting them to numeric.

The solution is to use the following argument, which reads the data as before (in labelled format):

barceo <- CEOdata::CEOdata(reo = "985", raw=TRUE)
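
For anyone who keeps the default factor import, the shift in the scale can be reproduced in base R without CEOdata. The following is only a minimal sketch: the `answers` vector is made-up data, not taken from any REO, and it assumes the factor levels are numeric strings such as "0", "1", ..., "98", "99", as the `str()` output above suggests.

```r
# Made-up answers on the 0-10 scale plus the special codes 98 and 99
# (illustrative values only, not real CEO data).
answers <- c(0, 98, 98, 7, 0, 98, 6, 3, 5, 10, 99)

# A factor with the same 13 levels as in the str() output of version 1.3.
as_factor_col <- factor(answers, levels = c(0:10, 98, 99))

# as.numeric() on a factor returns the internal level positions (1 to 13),
# not the values the levels stand for; this is what shifts the scale.
level_codes <- as.numeric(as_factor_col)

# Converting through the level labels recovers the original values.
recovered <- as.numeric(as.character(as_factor_col))

print(level_codes)
print(recovered)
```

Going through `as.character()` only works when every level is a numeric string; a level with a text label would become `NA` with a warning.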

xfim commented 1 year ago

Dear @fred-udina and @joelardiaca ,

Indeed, the original problem was that single REOs were behaving inconsistently with the CEOdata() option.

After solving this in commit 59a39e1 (corresponding to the version that @fred-udina finds behaving differently from before), we discussed the issue and decided to stick with pure-R variables by default. The raw = TRUE argument, which has been available from the very beginning, is now much more visible in the documentation, the cheatsheet and the initial welcome message of the package. This was done in commit aff968a, and the CRAN version 1.3.1 should incorporate all these changes. Please read the NEWS file for more information.

And thank you very much for reporting the issue.