tidyverse / vroom

Fast reading of delimited files
https://vroom.r-lib.org

Integer is loaded as double #506

Closed kadyb closed 1 year ago

kadyb commented 1 year ago
library("vroom")

df = data.frame(x = 1:100)
typeof(df$x)
#> [1] "integer"
tmp = tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE)

df = vroom(tmp, delim = ",")
typeof(df$x)
#> [1] "double"

df = utils::read.csv(tmp)
typeof(df$x)
#> [1] "integer"

DavisVaughan commented 1 year ago

Interestingly, I think vroom actually hardcodes integer guessing to false: https://github.com/tidyverse/vroom/blob/3691c6833006d319b2edca378258333fb9161135/src/collectors.h#L190

jennybc commented 1 year ago

From readr v1.2.0 on, readr has not guessed integer: https://readr.tidyverse.org/news/index.html?q=guess#readr-120

And, I assume, vroom simply never guessed integer.

IIRC, in both readr and vroom, if you want integer, you have to specify it explicitly.
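
For reference, a minimal sketch of what specifying integer explicitly might look like with the reprex above. It assumes vroom's `col_types` argument together with `cols()` and `col_integer()`, plus the compact string form where `"i"` stands for integer; the variable names are only illustrative.

```r
library(vroom)

# Recreate the file from the original report
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:100), tmp, row.names = FALSE)

# Ask for integer explicitly instead of relying on type guessing
df_spec <- vroom(tmp, delim = ",", col_types = cols(x = col_integer()))
typeof(df_spec$x)

# Compact string specification: one letter per column, "i" = integer
df_compact <- vroom(tmp, delim = ",", col_types = "i")
typeof(df_compact$x)
```

With an explicit specification, no guessing is involved, so both calls should give an integer column. Values that cannot be parsed as integers would be expected to surface as parsing problems rather than being silently widened to double.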

jennybc commented 1 year ago

I'm quite sure this is an intentional design decision that came from hard lessons learned in readr.