tidyverse / vroom

Fast reading of delimited files
https://vroom.r-lib.org

Integer is loaded as double #506

Closed kadyb closed 11 months ago

kadyb commented 1 year ago
library("vroom")

df = data.frame(x = 1:100)
typeof(df$x)
#> [1] "integer"
tmp = tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE)

df = vroom(tmp, delim = ",")
typeof(df$x)
#> [1] "double"

df = utils::read.csv(tmp)
typeof(df$x)
#> [1] "integer"

DavisVaughan commented 11 months ago

Interestingly, I think vroom actually hardcodes integer guessing to false, so it never guesses a column as integer: https://github.com/tidyverse/vroom/blob/3691c6833006d319b2edca378258333fb9161135/src/collectors.h#L190

jennybc commented 11 months ago

From readr v1.2.0 on, readr has never guessed integer: https://readr.tidyverse.org/news/index.html?q=guess#readr-120

And, I assume, vroom simply never guessed integer.

IIRC, in both readr and vroom, if you want integer, you have to specify it explicitly.
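
For anyone landing here, a minimal sketch of what "specify it explicitly" can look like in vroom, reusing the setup from the original report. It assumes the column is named x, as in the reprex, and passes a column specification through col_types:

```r
library("vroom")

# Same setup as the original report: one integer column written to CSV.
tmp = tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:100), tmp, row.names = FALSE)

# Declare the type of x explicitly instead of relying on type guessing.
df = vroom(tmp, delim = ",", col_types = cols(x = col_integer()))
typeof(df$x)
#> [1] "integer"
```

The compact form col_types = "i" should behave the same way for this single-column file. Columns left out of the specification are still guessed, so they will still come back as double when they hold whole numbers.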

jennybc commented 11 months ago

I'm quite sure this is an intentional design decision that came from hard lessons learned in readr.