tidyverse / haven

Read SPSS, Stata and SAS files from R
https://haven.tidyverse.org

Data saved with write_sav cannot be read by read_sav with certain combinations of data/labels #537

Open DavidLukeThiessen opened 4 years ago

DavidLukeThiessen commented 4 years ago

The data below is a bit contrived: I used dput() on a column imported from SPSS that was giving me the error, then tried to reduce it as much as possible. It could probably be simplified further, but I'm not familiar enough with R to do that.

This code illustrates the error.

library(haven)
save_path <- tempfile(fileext = ".sav")

test_data <- structure(list(x1 = structure(c("24_week_arm_1"),
                                           labels = c(`8 week` = "client_status_arm_1"),
                                           class = c("haven_labelled", "vctrs_vctr", "character"))),
                       class = c("tbl_df", "tbl", "data.frame"))
write_sav(test_data, save_path)
input_data <- read_sav(save_path)
# error, failed to parse

Changing the data sometimes eliminates the error, but I can't see any pattern to it.

test_data <- structure(list(x1 = structure(c("structname"),
                                           labels = c(`8 week` = "client_status_arm_1"),
                                           class = c("haven_labelled", "vctrs_vctr", "character"))),
                       class = c("tbl_df", "tbl", "data.frame"))
write_sav(test_data, save_path)
input_data <- read_sav(save_path)
# error, failed to parse

test_data <- structure(list(x1 = structure(c("name"),
                                           labels = c(`8 week` = "client_status_arm_1"),
                                           class = c("haven_labelled", "vctrs_vctr", "character"))),
                       class = c("tbl_df", "tbl", "data.frame"))
write_sav(test_data, save_path)
input_data <- read_sav(save_path)
# works

test_data <- structure(list(x1 = structure(c("client_status"),
                                           labels = c(`8 week` = "client_status_arm_1"),
                                           class = c("haven_labelled", "vctrs_vctr", "character"))),
                       class = c("tbl_df", "tbl", "data.frame"))
write_sav(test_data, save_path)
input_data <- read_sav(save_path)
# error, failed to parse

test_data <- structure(list(x1 = structure(c("client_status_arm_1"),
                                           labels = c(`8 week` = "client_status_arm_1"),
                                           class = c("haven_labelled", "vctrs_vctr", "character"))),
                       class = c("tbl_df", "tbl", "data.frame"))
write_sav(test_data, save_path)
input_data <- read_sav(save_path)
# works

Changing the label value "client_status_arm_1" to something else has a similar effect: sometimes the error goes away and sometimes it doesn't.
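The cases above can be checked in one pass. This is a minimal sketch, not part of the original report: `round_trips()` is a hypothetical helper name, and it builds the column with `labelled()` instead of the `structure()` call, on the assumption that the two are equivalent for this purpose. The string length column is there only as something to eyeball, not as a claim about the cause. Results may differ between haven versions.

```r
library(haven)

# Hypothetical helper (not part of haven): returns TRUE if a one-row
# labelled character column survives write_sav() followed by read_sav(),
# and FALSE if either step raises an error.
round_trips <- function(value, label_value = "client_status_arm_1") {
  path <- tempfile(fileext = ".sav")
  on.exit(unlink(path), add = TRUE)
  df <- data.frame(x1 = labelled(value, labels = c(`8 week` = label_value)))
  tryCatch({
    write_sav(df, path)
    read_sav(path)
    TRUE
  }, error = function(e) FALSE)
}

# The five data values tried in the report above.
values <- c("24_week_arm_1", "structname", "name",
            "client_status", "client_status_arm_1")

results <- data.frame(
  value = values,
  n_chars = nchar(values),
  ok = vapply(values, round_trips, logical(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
print(results)
```

To vary the label instead of the data, pass a different `label_value`, for example `round_trips("24_week_arm_1", label_value = "arm_1")`.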

hadley commented 3 years ago

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you! If you've never heard of a reprex before, start by reading about the reprex package, including the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

hadley commented 3 years ago

Here's a reprex:

library(haven)
path <- tempfile()

df <- data.frame(
  x1 = labelled("24_week_arm_1", labels = c(`8 week` = "client_status_arm_1"))
)
write_sav(df, path)
read_sav(path)
#> Error: Failed to parse /private/tmp/RtmpZyt0fb/file15f0f35aaaa64: Invalid file, or file has unsupported features.

Created on 2021-04-08 by the reprex package (v2.0.0)

@evanmiller I think this one might be for you too.