Closed lawrencehr closed 3 years ago
I would guess that your problem occurs before dataMaid even gets its hands on the data. I am pretty sure that error message (Error: Can't convert to .) is not ours, but if you could provide a minimal example that produces the error, I'd be happy to have a look at it.
Thanks so much for your quick reply!
So, I've tried it all again and the error I seem to be getting is actually Error : Can't convert <character> to <double>. From what I can tell, the dataMaid report/codebook reaches the haven dbl+lbl variable and something stops it from converting properly. It doesn't seem to be specific to variables with user-defined NAs either. I tried this on two different surveys and the error occurs across data sets.
Here's my code:
library(haven)
library(tidyverse)
aes19_unrestricted <- haven::read_spss("XXXX/01468_p1.sav", user_na = TRUE) #import survey data with tagged values
taggedvar <- aes19_unrestricted %>%
select("STATE") #select first dbl+lbl variable
str(taggedvar)
#> tibble [4,000 x 1] (S3: tbl_df/tbl/data.frame)
#> $ STATE: dbl+lbl [1:4000] 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, ...
#> ..@ label : chr "State"
#> ..@ format.spss : chr "F2.0"
#> ..@ display_width: int 10
#> ..@ labels : Named num [1:8] 1 2 3 4 5 6 7 8
#> .. ..- attr(*, "names")= chr [1:8] "NSW" "VIC" "QLD" "SA" ...
dataMaid::makeDataReport(taggedvar)
#> Error : Can't convert <character> to <double>.
Created on 2020-08-04 by the reprex package (v0.3.0)
I agree, it does look like it might be makeDataReport() that causes the error anyway.
Could you provide a dataset that I can use to try to reproduce the error? For example "01468_p1.sav" or a synthetic dataset that creates the error?
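A synthetic dataset of the kind asked for here can be built without the original .sav file by constructing a labelled column directly with haven::labelled(). The sketch below is only a stand-in: it uses the four value labels visible in the str() output (the remaining ones are not shown in the thread, so they are left out), and whether it triggers the same error in makeDataReport() has not been checked.

```r
library(haven)

# Hypothetical stand-in for the STATE column: same values and variable label
# as in the str() output, but only the four value labels shown there.
state <- labelled(
  c(2, 2, 2, 2),
  labels = c(NSW = 1, VIC = 2, QLD = 3, SA = 4),
  label = "State"
)

# Wrap it in a one-column tibble, mirroring the structure of the real data.
dummydata <- tibble::tibble(STATE = state)
str(dummydata)

# Untested assumption: this reproduces the error from the thread.
# dataMaid::makeDataReport(dummydata)
```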
Of course: here's a reprex of me creating the data. The dummydata is uploaded here as an .R object.
library(tidyverse)
aes19_unrestricted <- haven::read_spss("XXXXX/01468_p1.sav") #import survey data with tagged values
subsettedf <- aes19_unrestricted %>%
select("STATE") #select first dbl+lbl variable
dummydata <- subsettedf[1:4,]
str(dummydata)
#> tibble [4 x 1] (S3: tbl_df/tbl/data.frame)
#> $ STATE: dbl+lbl [1:4] 2, 2, 2, 2
#> ..@ label : chr "State"
#> ..@ format.spss : chr "F2.0"
#> ..@ display_width: int 10
#> ..@ labels : Named num [1:8] 1 2 3 4 5 6 7 8
#> .. ..- attr(*, "names")= chr [1:8] "NSW" "VIC" "QLD" "SA" ...
dataMaid::makeDataReport(dummydata)
#> Error : Can't convert <character> to <double>.
#> Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
#> Data report generation is finished. Please wait while your output file is being rendered.
#>
#> Is dataMaid_dummydata.docx open on your computer? Please close it as fast as possible to avoid problems!
Created on 2020-08-05 by the reprex package (v0.3.0)
I think something went wrong with creating the data. .R files are for scripts, not data. The following code should do the job for you and place the dataset in a file called dummydata.rda in your working directory:
library(tidyverse)
aes19_unrestricted <- haven::read_spss("XXXXX/01468_p1.sav") #import survey data with tagged values
subsettedf <- aes19_unrestricted %>%
select("STATE") #select first dbl+lbl variable
dummydata <- subsettedf[1:4,]
save(list = "dummydata", file = "dummydata.rda")
Apologies - I'm pretty new to posting issues on Github! This is my first one :)
This should be the correct file extension.
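For anyone picking up the attached file, an .rda written by save() is read back with load(), which restores the object under its original name. The round trip below is a minimal sketch using a plain stand-in data frame and a temporary path (both are assumptions for illustration, not the uploaded file).

```r
# Plain stand-in for the real dummydata; the uploaded file holds a labelled column.
dummydata <- data.frame(STATE = c(2, 2, 2, 2))

# Write to a temporary location instead of the working directory.
path <- file.path(tempdir(), "dummydata.rda")
save(list = "dummydata", file = path)

# Remove the object, then restore it from disk.
rm(dummydata)
restored <- load(path)  # load() returns the names of the restored objects

print(restored)
str(dummydata)
```

saveRDS() and readRDS() are an alternative when a single object should be read back under a name chosen by the reader.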
No problem! Thanks for posting the issue. I think I found the issue and I will have a look at fixing it sometime soon. I will let you know when there's a solution.
Notes to self:
The error arises in dataMaid:::dataMaid_haven_replace_with(), called from dataMaid:::dataMaid_as_factor(), which is called from e.g. uniqueValues().
Possible fix: replace dataMaid:::dataMaid_haven_replace_with() with a newer version from haven (https://github.com/tidyverse/haven/blob/af214f7b516781053b430e4c3f1b5083bb86fd97/R/as_factor.R), but we need to test it more thoroughly and add relevant unit tests.
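Until a fix lands, one possible workaround is to convert labelled columns to ordinary factors with haven::as_factor() before handing the data to dataMaid, so that dataMaid's internal conversion is never reached. This is a sketch under that assumption; it has not been verified against the failing data, and the two-label column below is hypothetical.

```r
library(haven)

# Hypothetical labelled column, similar in shape to STATE in the thread.
dummydata <- tibble::tibble(
  STATE = labelled(c(2, 2, 2, 2), labels = c(NSW = 1, VIC = 2), label = "State")
)

# Convert every labelled column to a factor using haven's own as_factor().
for (nm in names(dummydata)) {
  if (is.labelled(dummydata[[nm]])) {
    dummydata[[nm]] <- as_factor(dummydata[[nm]])
  }
}

str(dummydata)

# Untested assumption: the report now runs without the conversion error.
# dataMaid::makeDataReport(dummydata)
```

The trade-off is that the report then describes factor levels rather than the underlying numeric codes.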
Hi, I'm wondering if it's possible to add support for data frames with user-defined NA tags. When you import SPSS data using haven::read_spss(..., user_na = TRUE), the user-defined NAs are preserved. However, when I run this data through dataMaid, I get the following error:
Error: Can't convert to .
A lot of survey data has multiple types of missing values (not answered, invalid response, skipped), so there is a strong use case for this.
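To make the use case concrete, the sketch below builds a small column with user-defined missing values via haven::labelled_spss(), which is the class read_spss(..., user_na = TRUE) produces for such variables. The values, labels, and missing codes are invented for illustration.

```r
library(haven)

# Hypothetical survey item: 98 and 99 are user-defined missing codes.
answer <- labelled_spss(
  c(1, 2, 98, 99),
  labels = c(Yes = 1, No = 2, Skipped = 98, "Not answered" = 99),
  na_values = c(98, 99)
)

# The codes are kept in the data, but is.na() treats them as missing.
print(is.na(answer))

# zap_missing() turns the user-defined missings into regular NA.
plain <- zap_missing(answer)
print(plain)
```

Collapsing with zap_missing() loses the distinction between the kinds of missingness, which is exactly the information this request wants dataMaid to keep.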