colinpmillar closed this issue 4 years ago
I've tried
n <- 1e6
big <- data.frame(x=1:n, y=rnorm(n), z=rpois(n,1))
write.taf(big)
line.endings("big.csv")
but did not get an error. Is there another example that gives an error?
Hmmm - not any more - I can't seem to replicate it, but something was afoot, because I rewrote the line as:
but now it works just fine!
Still there... very repeatable, but it seems to be intermittent. I am pretty sure that unix2dos() is trying to open a file that is not quite written yet, but I'm not sure how to get around that.
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
Running .First(), R in interactive mode, check `.First` for details ...
.libPaths() set to:
D:/R/win-library/3.6
D:/Program Files/R/R-3.6.1/library
options: stringsAsFactors=FALSE
checking for new packages ...
all up to date.
Autoloading: remotes::install_github, lattice::xyplot
> setwd("D:\\projects\\git\\ices-taf\\FOs\\2019_CS_FisheriesOverview")
> library(icesTAF)
> clean()
> sourceTAF("data")
[04:03:17] data.R running...
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Error in file(file, open = "wb") : cannot open the connection
In addition: Warning message:
In file(file, open = "wb") :
cannot open file 'data/catch_dat.csv': Invalid argument
[04:03:33] data.R failed
test file: https://www.dropbox.com/s/znhfusigdfxuf5v/test2.RData?dl=1
I loaded this and then tried from a fresh R session in the ~ directory:
> icesTAF::write.taf(catch_dat, quote = TRUE) # OK
> library(icesTAF)
> mkdir("data")
> icesTAF::write.taf(catch_dat, dir = "data", quote = TRUE) # OK
> setwd("D:\\projects\\git\\ices-taf\\FOs\\2019_CS_FisheriesOverview")
> mkdir("data")
> icesTAF::write.taf(catch_dat, dir = "data", quote = TRUE)
Error in file(file, open = "wb") : cannot open the connection
In addition: Warning message:
In file(file, open = "wb") :
cannot open file 'data/catch_dat.csv': Invalid argument
> rmdir("data")
> mkdir("data")
> icesTAF::write.taf(catch_dat, dir = "data", quote = TRUE)
Error in file(file, open = "wb") : cannot open the connection
In addition: Warning message:
In file(file, open = "wb") :
cannot open file 'data/catch_dat.csv': Invalid argument
> traceback()
3: file(file, open = "wb")
2: unix2dos(file)
1: icesTAF::write.taf(catch_dat, dir = "data", quote = TRUE)
very odd!
Still haven't been able to reproduce this in Windows or Linux, but it sounds like the core R function write.csv() can - in some cases - exit before the file has been fully created.
The purpose of calling unix2dos() at the end of write.taf() was to conform to the CSV standard, but the potential problems outweigh the benefits.
Removed unix2dos() call in commit 3fbd4a8.
On a Windows machine, such as the TAF server, the resulting file will have DOS line endings (CRLF), so that's pretty good. On a Linux machine, the resulting files will have Unix line endings (LF), which makes them slightly smaller, and some diff tools will report this as a difference in the output.
Users can always call unix2dos() explicitly if they find it helpful in their analysis.
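For anyone who wants CRLF output on every platform without a second pass over the file, one option is to write through a binary connection with an explicit eol. This is only a sketch in base R; write_csv_crlf() is a hypothetical helper, not part of icesTAF, and it does not reproduce the other formatting choices that write.taf() makes.

```r
# Hypothetical helper (not part of icesTAF): write a CSV with CRLF line
# endings on any platform, in a single pass over the file.
write_csv_crlf <- function(x, path) {
  con <- file(path, open = "wb")  # binary mode: no newline translation
  on.exit(close(con))
  write.csv(x, con, row.names = FALSE, eol = "\r\n")
  invisible(path)
}

tmp <- tempfile(fileext = ".csv")
write_csv_crlf(data.frame(x = 1:3, y = c("a", "b", "c")), tmp)

# Inspect the raw bytes: every LF should be preceded by a CR
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
n_lf <- sum(bytes == as.raw(0x0a))
n_crlf <- sum(bytes[-length(bytes)] == as.raw(0x0d) &
              bytes[-1] == as.raw(0x0a))
```

Because the file is written once and never reopened, this approach avoids the open-after-write step that failed in the traceback above.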
For large data.frames, the system has not finished writing the file when R tries to open the file connection in unix2dos:
https://github.com/ices-tools-prod/icesTAF/blob/15b9e4347ac024a597ba1ea35e96c5054bf7ed4c/R/write.taf.R#L105
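If the failure really is a short-lived race between the write finishing and the file being reopened, one possible workaround is to retry the open a few times before giving up. The sketch below assumes that diagnosis. open_with_retry() and its attempts and wait arguments are hypothetical, not part of icesTAF, and since the error could not be reproduced it is untested against the actual problem.

```r
# Hypothetical helper (not part of icesTAF): retry opening a file until
# it succeeds or the attempts run out.
open_with_retry <- function(path, open = "rb", attempts = 5, wait = 0.2) {
  for (i in seq_len(attempts)) {
    con <- tryCatch(
      suppressWarnings(file(path, open = open)),
      error = function(e) NULL  # "cannot open the connection"
    )
    if (!is.null(con)) return(con)
    Sys.sleep(wait)  # give the operating system time to finish with the file
  }
  stop("could not open '", path, "' after ", attempts, " attempts")
}

# Usage: write a small file, then open it through the retrying helper
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2"), tmp)
con <- open_with_retry(tmp, open = "rb")
lines <- readLines(con)
close(con)
```

Note that opening with open = "wb" truncates the file as soon as the open succeeds, so in a unix2dos-style conversion the contents must already have been read into memory before that call is retried.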