Thanks for reporting this. This is bizarre. When I'm running this I get
    vmode(out.ff.bad)
    #         a         b
    # "integer"  "double"
Are you running the latest version of ffbase?
    library(devtools)
    install_github("edwindj/ffbase", subdir = "pkg")
I think this is the same issue as issue #19, which was solved.
Just double-checked this, and indeed it is issue #19 (the ffdfappend issue), which is already solved. Maybe we should put the new version on CRAN...
In the new version of ffdfappend we get
    x <- as.ffdf(data.frame(a = factor("A", levels = LETTERS), b = 5.53))
    vmode(x)
    #         a         b
    # "integer"  "double"
    x <- ffdfappend(x, data.frame(a = factor("B", levels = LETTERS), b = 5.54))
    vmode(x)
    #         a         b
    # "integer"  "double"
    x <- ffdfappend(x, data.frame(a = factor("C", levels = LETTERS), b = 5.49))
    vmode(x)
    #         a         b
    # "integer"  "double"
But in the old version of ffdfappend we got
    x <- as.ffdf(data.frame(a = factor("A", levels = LETTERS), b = 5.53))
    vmode(x)
    #         a         b
    # "integer"  "double"
    x <- ffdfappend(x, data.frame(a = factor("B", levels = LETTERS), b = 5.54))
    vmode(x)
    #         a         b
    # "integer"  "double"
    x <- ffdfappend(x, data.frame(a = factor("C", levels = LETTERS), b = 5.49))
    vmode(x)
    #         a         b
    # "integer" "integer"
basically due to an issue in the ff package
@jwijffels Thanks! And we should put a new version in CRAN
Should be on CRAN this evening...
When using ffdfdply with certain kinds of FUN functions, a variable that was originally of vmode double will be coerced to vmode integer and turned into a factor whose levels are character strings of the numerical output. For me, this happens when the input ff is of limited precision (e.g. 3 digits), but the output of the FUN is of higher precision (e.g. 3.1 / 3 = 1.0333...).
I believe this is due to the call to ffdfappend; I have produced this result using ffdfappend by itself, importing data from a text file to an ffdf and then appending to an existing ffdf in a for-loop.
Here is a simple example, with different variations:
I have found a tedious workaround when using ffdfappend: readjusting the significant digits of the existing ffdf and the appended ffdf using signif.ff. This solution won't work for ffdfdply, though, and the problem may be more general than just an incompatibility of numerical precisions.