Closed Schakel17 closed 1 year ago
Hi @Schakel17,
I'm not familiar with the SAV and SPSS formats. Can you send a small sample of such a file?
Hi @ddotta
I took a file from the SPSS sample folder with user-defined missings (e.g., variable 61, named "reason"). To take user-defined missings into account, I do the following with the R package haven: df <- read_sav(filename, user_na = TRUE). According to the haven package, only SPSS allows user-defined missings. I have no experience with SAS and Stata. If user_na = FALSE (the default in haven), haven converts all user-defined missings to NA. Btw, PSPP can be used to view the attached sample file, but it is inferior to SPSS.
Sample file: customer_dbase.zip
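The behaviour described above can be reproduced without the sample file. The following is a minimal sketch, assuming haven is installed. The toy column, its labels, and the code 9 are hypothetical stand-ins for illustration and are not taken from customer_dbase.sav.

```r
library(haven)

# Hypothetical toy data: a "reason" column where code 9 is declared
# as a user-defined missing value (not taken from the sample file).
df <- tibble::tibble(
  reason = labelled_spss(
    c(1, 2, 9),
    labels = c(price = 1, service = 2, unknown = 9),
    na_values = 9
  )
)

# Write to a temporary .sav file and read it back both ways.
path <- tempfile(fileext = ".sav")
write_sav(df, path)

default_read <- read_sav(path)                  # user_na = FALSE (default)
user_na_read <- read_sav(path, user_na = TRUE)  # keep user-defined missings

# Underlying values: the default read should turn code 9 into NA,
# while the user_na = TRUE read should keep the code 9.
vctrs::vec_data(default_read$reason)
vctrs::vec_data(user_na_read$reason)

# The declared missing codes travel with the column as an attribute.
attr(user_na_read$reason, "na_values")
```

With user_na = TRUE the original codes stay available, so a later conversion step can still tell a user-defined missing apart from a system missing.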
Hi @Schakel17 and thanks for using parquetize,
With your sample SPSS file:
library(haven)
tableF <- haven::read_sav(filename, col_select = "reason", user_na = FALSE)
tableT <- haven::read_sav(filename, col_select = "reason", user_na = TRUE)
> class(tableF$reason)
[1] "haven_labelled" "vctrs_vctr" "double"
> class(tableT$reason)
[1] "haven_labelled_spss" "haven_labelled" "vctrs_vctr" "double"
I get the same NA values in R in both cases, but I guess the extra haven_labelled_spss class is what makes this work?
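One possible explanation for seeing "the same NA" in both cases: for a haven_labelled_spss vector, is.na() reports user-defined missings as missing even though the original code is still stored. The sketch below illustrates this with a hand-built vector. The values and the label are hypothetical, and it assumes the haven and vctrs packages are installed.

```r
library(haven)

# Hypothetical vector: code 9 is declared as a user-defined missing value.
x <- labelled_spss(c(1, 2, 9), labels = c(unknown = 9), na_values = 9)

# Same class vector as tableT$reason in the output above.
class(x)

# is.na() treats the user-defined missing as missing ...
is.na(x)

# ... but the underlying data still holds the original code.
vctrs::vec_data(x)

# zap_missing() replaces user-defined missings with a regular NA.
x_zapped <- zap_missing(x)
vctrs::vec_data(x_zapped)
```

So the two reads can look alike when checked with is.na(), while only the user_na = TRUE version keeps the codes themselves.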
We could add a specific SPSS format parameter to table_to_parquet(), it's understandable.
What do you think @nbc?
Hi @ddotta,
I think you do not want specific functions for SPSS, SAS, and Stata. In that case I would prefer a specific SPSS format argument that is only effective when the input file has the extension .sav.
Hi @ddotta,
I checked the adjusted function table_to_parquet() with user_na=TRUE for an SPSS-file. I receive the following error: "Error in write_parquet(data, sink = path_to_parquet, compression = compression, : unused argument (user_na = TRUE)". What is the problem?
Hi @Schakel17,
I can't reproduce it. This code works on my computer.
table_to_parquet(path_to_file = "U:/customer_dbase.sav",
path_to_parquet = "U:/customer_dbase.parquet",
user_na = TRUE)
Are you sure you have the latest version of parquetize? And can you provide a reproducible example?
Hi @ddotta,
I think I did not have the latest version. After deleting the old installation I no longer receive any error. The output is exactly the same as the input.
Great news!
The default value of the "user_na" argument of haven::read_sav and haven::read_spss is FALSE. I would like to have the option to override this argument, or to have the default set to TRUE, because user-defined missings in the .sav file are currently converted to NA.