Closed BERENZ closed 6 years ago
I have a similar problem since the latest haven-update, here's my error msg:
Error in df_parse_sav_file(spec, user_na) :
Failed to parse C:/Users/Daniel/Documents/Projekte/2014 DAVID2/Datenerhebung/Dateneingabe/Patienten/Pat_Dateneingabe20170913_pa.sav: Unable to allocate memory.
5. stop(structure(list(message = "Failed to parse C:/Users/Daniel/Documents/Projekte/2014 DAVID2/Datenerhebung/Dateneingabe/Patienten/Pat_Dateneingabe20170913_pa.sav: Unable to allocate memory.",
   call = df_parse_sav_file(spec, user_na), cppstack = structure(list(
   file = "", line = -1L, stack = "C++ stack not available on this system"), .Names = c("file",
   "line", "stack"), class = "Rcpp_stack_trace")), .Names = c("message", ...
4. df_parse_sav_file(spec, user_na)
3. read_sav(file, user_na = user_na)
2. haven::read_spss(file = path, user_na = tag.na) at read_write.R#55
1. read_spss("../Pat_Dateneingabe20170913_pa.sav")
Same issue with 1.1.1 (R 3.4.0 on win7). Reverting to 1.1.0 solves the problem. (found the old haven_1.1.0.zip on https://mran.microsoft.com/snapshot/2017-11-01/bin/windows/contrib/3.5/)
Yes, downgrading to 1.1.0 works. Older pkg versions can be found on CRAN as well, there's an archive link: https://cran.r-project.org/src/contrib/Archive/haven/
Yes, thanks. I found them, but they are not compiled, so they require Rtools, which may not be convenient (and install_version has issues with our proxy)... :-) The *.zip binaries are easier to install on Windows...
I also have the same problem. Unfortunately I can't share the dataset that caused the issue, and I'm hunting for a similar open data set that replicates the problem, but downgrading to haven v1.1.0 solved the issue for me:
devtools::install_version("haven", version = "1.1.0")
Update: It seems to be a problem with the size of the file. I was halving the data set to see which 'half' caused the problem, and found I could load the file regardless of how I sliced it once I removed enough columns or cases, which seems to fit the error message. Very strange that the same file loads with v1.1.0.
I have the same problem (R 3.4.3 on Win10). read_sav leads to an error (see below) in haven 1.1.1; downgrading to v1.1.0 solves the issue. In my scenario, the error only shows when the database has more than approximately 2000 variables.
Error in df_parse_sav_file(spec, user_na) : Failed to parse J:/Database_XY-Study_II.sav: Unable to allocate memory.
I don't think it's an issue of very large files. My dataset had ~ 600 observations and ~ 300 variables. That said, it would be very strange if you don't run into this issue for datasets with < 2000 columns?
I don't know the exact cause of the issue, but reducing my dataset from 5845 to 1970 variables (160 observations in each case) made my script run again. I thought this information might be helpful. It doesn't seem to be an issue based only on the number of variables: the fact that the script runs with only 1970 variables could also be related to which variables were deleted, the length of variable names or contents, specific characters, etc.
I'll add my voice here. Haven v1.1.1 generates a memory allocation error, while v1.1.0 does not, for the same file.
Use this to revert to the previous working version:
remove.packages("haven")
devtools::install_version("haven", version = "1.1.0", repos = "http://cran.us.r-project.org")
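Before reverting, it may be worth confirming which version is actually installed. The following is only a minimal sketch: `is_newer_than_known_good` is a hypothetical helper (not part of haven), and it assumes 1.1.0 is the last version reported as working in this thread.

```r
# Hypothetical helper (not part of haven): TRUE when a version string is newer
# than 1.1.0, the last release reported as working in this thread.
is_newer_than_known_good <- function(version) {
  package_version(version) > package_version("1.1.0")
}

# find.package() checks for the package without loading it, so the check
# itself does not load haven before a reinstall.
if (length(find.package("haven", quiet = TRUE)) > 0) {
  installed <- as.character(utils::packageVersion("haven"))
  if (is_newer_than_known_good(installed)) {
    message("haven ", installed, " is newer than 1.1.0; reverting may help")
  }
}
```

Restart R after reinstalling so that the old version is not still loaded in the session.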
Any update on this issue? I'm having the same problem with a .sav file that is not very large at all. Downgraded to 1.1.0 and it works now.
I have the exact same issue, also solved it by reverting to 1.1.0 (downloading from CRAN didn't work, but the link @dicorynia provided did).
Can you please try the latest development version? I can read the file attached to the initial issue without problems.
Works for me. 👍
sorry, please re-open this issue, i'm not sure it's fixed in the development version? here's a minimal reproducible block that works on v1.1.0 and fails with both cran and with the current dev version:
tf <- tempfile()
download.file( "https://assets.pewresearch.org/wp-content/uploads/sites/5/datasets/Sept07.zip" , tf , mode = 'wb' )
z <- unzip( tf , exdir = tempdir() )
x <- haven::read_sav( grep( "\\.sav$" , z , value = TRUE ) )
same failure in dev.. doesn't fail with v1.1.0
# Error in df_parse_sav_file(spec, encoding, user_na) :
# Failed to parse C:/Users/AnthonyD/AppData/Local/Temp/Rtmpao8ZKs/Sept07c.sav: Unable to allocate memory.
sessionInfo for run that works:
devtools::install_github("tidyverse/haven", ref = "v1.1.0")
library(haven)
sessionInfo()
# R version 3.4.3 (2017-11-30)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1
# Matrix products: default
# locale:
# [1] LC_COLLATE=English_United States.1252
# [2] LC_CTYPE=English_United States.1252
# [3] LC_MONETARY=English_United States.1252
# [4] LC_NUMERIC=C
# [5] LC_TIME=English_United States.1252
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
# other attached packages:
# [1] haven_1.1.0
# loaded via a namespace (and not attached):
# [1] compiler_3.4.3 magrittr_1.5 pillar_1.1.0 tibble_1.4.2 Rcpp_0.12.15
# [6] forcats_0.2.0 rlang_0.1.4
sessionInfo for version that fails:
devtools::install_github("tidyverse/haven")
library(haven)
sessionInfo()
# R version 3.4.3 (2017-11-30)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1
# Matrix products: default
# locale:
# [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
# [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
# [5] LC_TIME=English_United States.1252
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
# other attached packages:
# [1] haven_1.1.1.9000
# loaded via a namespace (and not attached):
# [1] Rcpp_0.12.15 digest_0.6.15 withr_2.1.1 R6_2.2.2 git2r_0.21.0 magrittr_1.5 pillar_1.1.0
# [8] httr_1.3.1 rlang_0.1.4 curl_3.1 devtools_1.13.4 forcats_0.2.0 tools_3.4.3 compiler_3.4.3
# [15] memoise_1.1.0 knitr_1.19 tibble_1.4.2
@evanmiller can you please take another look?
@ajdamico in the future, please don't include sessionInfo() unless it's specifically requested; it's not usually useful.
will do, thanks. here are more examples of the problem.. same failure on cran and dev but success with 1.1.0
tf <- tempfile()
download.file( "http://assets.pewresearch.org/wp-content/uploads/sites/5/datasets/Iraq2003-2.zip" , tf , mode = 'wb' )
z <- unzip( tf , exdir = tempdir() )
x <- haven::read_sav( grep( "\\.sav$" , z , value = TRUE ) )
tf <- tempfile()
download.file( "http://assets.pewresearch.org/wp-content/uploads/sites/5/datasets/Oct01NII.zip" , tf , mode = 'wb' )
z <- unzip( tf , exdir = tempdir() )
x <- haven::read_sav( grep( "\\.sav$" , z , value = TRUE ) )
tf <- tempfile()
download.file( "http://assets.pewresearch.org/wp-content/uploads/sites/5/datasets/april01nii.zip" , tf , mode = 'wb' )
z <- unzip( tf , exdir = tempdir() )
x <- haven::read_sav( grep( "\\.sav$" , z , value = TRUE ) )
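When checking several files like this, it can help to collect the failures in one table instead of stopping at the first error. This is a rough sketch only: `check_sav_files` and `fake_reader` are hypothetical names made up for illustration, and the stand-in reader simply mimics the error message reported in this thread.

```r
# Hypothetical helper: try to read each file with `reader` and record failures
# instead of stopping at the first error.
check_sav_files <- function(paths, reader) {
  errors <- vapply(paths, function(p) {
    tryCatch({
      reader(p)
      NA_character_  # NA means the file was read without error
    }, error = function(e) conditionMessage(e))
  }, character(1), USE.NAMES = FALSE)
  data.frame(file = paths, ok = is.na(errors), error = errors,
             stringsAsFactors = FALSE)
}

# Stand-in reader for illustration; with real files pass haven::read_sav instead.
fake_reader <- function(path) {
  if (grepl("bad", path)) stop("Unable to allocate memory.")
  data.frame()
}
result <- check_sav_files(c("good.sav", "bad.sav"), fake_reader)
```

With real data, the call would look like check_sav_files(sav_paths, haven::read_sav), where sav_paths holds the unzipped .sav file paths.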
Those files now all open successfully for me - thanks for the bug report!
I am still having an issue with 1.1.1.9. I think my file might be quite a bit bigger than the examples: 231.7 MB. Unfortunately, I can't distribute it. Reverting to 1.1.0 is fine.
EDIT: I did look for a bigger public sav file, but the 30.9 MB one from GEM (http://www.gemconsortium.org/data/sets?id=aps) worked fine in both versions.
Error:
> packageVersion('haven')
[1] ‘1.1.1.9000’
> lntItemDF <- haven::read_sav('/data/jflournoy/lnt_pxvx/LT_wideAGT1234.sav')
Error in df_parse_sav_file(spec, encoding, user_na) :
Failed to parse /data/jflournoy/lnt_pxvx/LT_wideAGT1234.sav: Unable to allocate memory.
Okay:
> packageVersion('haven')
[1] ‘1.1.0’
> lntItemDF <- haven::read_sav('/data/jflournoy/lnt_pxvx/LT_wideAGT1234.sav')
The issue persists for me, too. I tried downgrading to 1.1.0, but in that case I receive another error:
devtools::install_version("haven", version = "1.1.0")
spss <- haven::read_spss("data-raw/[filename].sav")
Error in .Call("haven_df_parse_sav_file", PACKAGE = "haven", spec, user_na) :
  "haven_df_parse_sav_file" not available for .Call() for package "haven"
And with 1.1.1
Error in df_parse_sav_file(spec, user_na) : Failed to parse C:/ [...]_package/surveyreader/data-raw/[filename].sav: Unable to allocate memory.
The file is large because it is a SurveyMonkey dump, which is not structured very efficiently (1483 columns).
hi @jflournoy and @antaldaniel they might need a public file :/ filesize alone isn't the culprit, these huge .sav files all import without issue
timss_spss <- xml2::read_html( "https://timssandpirls.bc.edu/timss2015/international-database/" )
spss_links <- grep( "SPSSData" , rvest::html_attr( rvest::html_nodes(timss_spss,'a') , 'href' ) ,value = TRUE )
big_zips <-
  c(
    paste0( "https://timssandpirls.bc.edu/timss2015/international-database/" , spss_links ) ,
    "http://vs-web-fs-1.oecd.org/pisa/PUF_SPSS_COMBINED_CMB_STU_COG.zip",
    "http://vs-web-fs-1.oecd.org/pisa/PUF_SPSS_COMBINED_CMB_STU_QQQ.zip"
  )
tf <- tempfile()
for( this_zip in big_zips ){
  download.file(this_zip,tf,mode='wb')
  z <- unzip( tf , exdir = tempdir() )
  for( this_sav in grep( '\\.sav' , z , value = TRUE , ignore.case = TRUE ) ) x <- haven::read_spss( this_sav )
}
I have exactly the same problem as @antaldaniel. So it's now worse than ever, as I can't read the data with either haven version 1.1.1 or 1.1.0. Has anyone found a solution? Thank you for your help.
@darkdoudou they probably need you to provide an example file that triggers the error..
Unfortunately, I'm not able to share the original data (which is indeed large), and I don't have SPSS, so I'm not able to provide a minimal working example!
OK, so, after removing and reinstalling haven 1.1.0 three times (!?) and exiting R (probably the most important step), it now works perfectly.
Having exactly the same issue with read_spss on an RStudio server.
Hi, I encountered a similar problem (I have a 1.74 GB SPSS file which I cannot share, and I cannot delete any variables or cases either). I tried to install the dev version of haven from the website @dicorynia shared: https://mran.microsoft.com/snapshot/2017-11-01/bin/windows/contrib/3.5/
It doesn't work though, as I got this error message: Error: package or namespace load failed for ‘haven’: package ‘haven’ was installed by an R version with different internals; it needs to be reinstalled for use with this R version
@bednarowska
Error: package or namespace load failed for ‘haven’: package ‘haven’ was installed by an R version with different internals; it needs to be reinstalled for use with this R version
This is unrelated. You are either running a version of R < 3.5 and trying to install a package built for R >= 3.5, or vice versa. Because 3.5 is a major change with different internals, it requires that you reinstall all packages. See more: http://blog.revolutionanalytics.com/2018/04/r-350.html
Thanks for coming back. Well, it is related, as I tried the same solution that other people mentioned in this thread and it wasn't working for me. I realize there are tremendous changes between R versions, but my question is: did others who tried to reinstall a dev version of haven have to reinstall all packages as well? I am working on 3.5.1.
Hi @batpigandme, is there an email address (or any other method, such as ftp) where I can submit an SPSS file in confidence? I'm also having a similar issue with v2.0, and I was wondering what feature in that data file is causing the error.
Since this issue is closed, you should probably open a new one. From the readxl repo, at least a couple of these should work for SPSS files. The last one is least preferable, since it means that the issue is only reproducible for the individual with the file.
How to provide your own xls/xlsx file? In order of preference:
.xlsx is a supported file type. You'll need to zip or gzip .xls so it appears as .zip or .gz.
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/
I have the following problem with read_spss/read_sav. The code worked perfectly with the previous version of the haven package, but after updating I got an error which indicates a problem with allocating memory. I attach the code, the file and other information.
I tested this on two R versions (R MRAN & vanilla 3.4.0 and R 3.3.2, both on macOS Sierra 10.12.1). This problem may be related only to sav files. I got no errors reading sas7bdat files.
EDIT: I tested on Windows 7 x64 (build 7601) Service Pack 1 with R 3.3.2 and got the same error.
SessionInfo
Also tested on
EDIT: Windows machine