tidyverse / haven

Read SPSS, Stata and SAS files from R
https://haven.tidyverse.org

Haven: The file's timestamp string is invalid #683

Closed AlejandroDGR closed 1 year ago

AlejandroDGR commented 2 years ago

I'm trying to read an SPSS file:

dataframe <- as.data.frame(read_sav("C:/~FilePath/Dataset.sav"))

and get:

Error in df_parse_sav_file(spec, encoding, user_na, cols_skip, n_max, : Failed to parse ~FilePath/Dataset: The file's timestamp string is invalid.

This has already been reported in a closed issue (The file's timestamp string is invalid #488) and in a Stack Overflow question ([Difficulty with haven package Reading SPSS data in R](https://stackoverflow.com/questions/60006760/difficulty-with-haven-package-reading-spss-data-in-r)), but I have not found a solution yet.

gorcha commented 2 years ago

Hi @AlejandroDGR, thanks for the bug report.

Are you able to share a copy of the file causing the issue so we can investigate?

AlejandroDGR commented 2 years ago

Of course @gorcha!

It happens to me with several files. All of them come from the Spanish National Sociological Centre, like this example.

Luckily, they're a minority (the great majority of Spanish National Sociological Centre's files work fine).
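In case it helps narrow things down, here is a rough sketch of how the affected files in a folder could be listed. `check_files` is a hypothetical helper, not part of haven; it assumes that `haven::read_sav()` signals an ordinary R error for these files (as in the message above) and only records which paths fail and why.

```r
# Hypothetical helper (not part of haven): try to read each file and record
# whether it parsed. `reader` defaults to haven::read_sav, but any function
# that takes a file path can be passed instead.
check_files <- function(paths, reader = haven::read_sav) {
  results <- lapply(paths, function(path) {
    tryCatch(
      {
        reader(path)
        data.frame(file = path, ok = TRUE, message = NA_character_,
                   stringsAsFactors = FALSE)
      },
      error = function(e) {
        # Keep the error text so timestamp failures can be told apart
        # from other parse errors.
        data.frame(file = path, ok = FALSE, message = conditionMessage(e),
                   stringsAsFactors = FALSE)
      }
    )
  })
  do.call(rbind, results)
}
```

It could be called as `check_files(list.files("data", pattern = "\\.sav$", full.names = TRUE))`, where `"data"` is just a placeholder directory; rows with `ok == FALSE` are the files that failed to parse.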

gorcha commented 2 years ago

Perfect, thanks!

@evanmiller this is related to #488, but for SPSS files rather than Stata. The fix provided for #488 ignores invalid timestamps rather than throwing an error, but only for Stata files, so an invalid timestamp will still throw an error in an SPSS file. I think the SPSS timestamp parsing code just needs to continue on failure, like the Stata code does?
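To illustrate the difference between the two behaviours, here is a minimal sketch in R. It is not ReadStat's actual code (which is written in C); `parse_timestamp`, its `strict` argument, and the timestamp format are all assumptions made up for the example.

```r
# Illustrative only: not ReadStat's implementation. The function name, the
# `strict` flag, and the timestamp format are assumptions for this sketch.
parse_timestamp <- function(x, strict = TRUE, format = "%Y-%m-%d %H:%M:%S") {
  parsed <- as.POSIXct(strptime(x, format = format, tz = "UTC"))
  if (is.na(parsed)) {
    if (strict) {
      # Strict behaviour: an unparseable timestamp aborts the whole read.
      stop("The file's timestamp string is invalid.")
    }
    # Lenient behaviour: treat the timestamp as missing and carry on.
    return(as.POSIXct(NA))
  }
  parsed
}
```

With `strict = FALSE`, a malformed string yields a missing timestamp and the rest of the file can still be read; with `strict = TRUE`, the same string stops parsing with an error.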

JulianEGerez commented 1 year ago

I'm having this issue as well with a file from the same source as @AlejandroDGR's. I'm wondering whether it has been resolved, since the problem appears to have been identified and fixed for Stata files. @gorcha

gorcha commented 1 year ago

Hi @JulianEGerez,

We're waiting for PR WizardMac/ReadStat#277 to be merged in the underlying ReadStat library to resolve this. I've followed up with the maintainer over there, so hopefully it won't be too long.

philmikejones commented 1 day ago

Following this fix I don't get errors for .sav files anymore (at least none that I've noticed). I have, however, run into a .por file that doesn't work. The workaround was to ask a friend with SPSS installed to open the .por file and save it as a .sav, and then to open that file instead (presumably ignoring the invalid timestamp string).

@gorcha is this something that could be updated to ignore .por timestamps, too? Thanks.