Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0

fread fails to detect numeric #2123

Status: Closed (MichaelChirico closed this issue 7 years ago)

MichaelChirico commented 7 years ago

I'm trying to read this CSV:

URL = paste0('https://raw.githubusercontent.com/fivethirtyeight/',
             'data/master/drug-use-by-age/drug-use-by-age.csv')
x = fread(URL, na.strings = "-")
x$`cocaine-frequency`
#  [1] "5.0"  "1.0"  "5.5"  "4.0"  "7.0"  "5.0"  "5.0"  "5.5"  "8.0"  "5.0"  "5.0"  "6.0"  "5.0"  "8.0"  "15.0" "36.0" NA  

Why can fread detect an integer column when an na.strings value is present, but not a numeric one?

sapply(fread('a,b\n1,-\n2,3.1', na.strings = '-'), class)
#           a           b 
#   "integer" "character" 

sapply(fread('a,b\n1,-\n2,3', na.strings = '-'), class)
#         a         b 
# "integer" "integer" 

(Setting dec = '.' doesn't fix this either.)
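Until the type detection is sorted out, one possible workaround is to coerce the affected column after reading. This is only a minimal sketch using the same toy input as above. The guard on `is.character()` is my own addition so that the snippet behaves the same on builds where `b` is already read as numeric.

```r
library(data.table)

# Same toy input as above: column b holds "-" (read as NA) and a decimal value.
dt = fread('a,b\n1,-\n2,3.1', na.strings = '-')

# On an affected build b comes back as character; on an unaffected build it is
# already numeric. Coercing only when needed gives the same result either way.
if (is.character(dt$b)) dt[, b := as.numeric(b)]

class(dt$b)
```

The same coercion should apply to `cocaine-frequency` in the full data set, since its NA values are already real NAs rather than "-" strings by the time the column is converted.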

franknarf1 commented 7 years ago

FWIW, on

data.table 1.10.5 IN DEVELOPMENT built 2017-04-04 00:37:37 UTC; travis

I see x[['cocaine-frequency']] as numeric when running your code, so maybe it broke after that build.
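When comparing behaviour across builds like this, it can help to state the installed version alongside the result. A small sketch, assuming data.table is installed. The `1.10.5` threshold is only the development version quoted above, not a known boundary for the bug.

```r
# Report the installed data.table version (base R's utils::packageVersion).
v = packageVersion("data.table")
print(v)

# package_version objects can be compared against a version string directly.
v >= "1.10.5"
```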

MichaelChirico commented 7 years ago

Well, that narrows it down: I rebuilt just before posting. @mattdowle, is there something around this commit that I missed?

MichaelChirico commented 7 years ago

Closed via #2129. Thanks, @st-pasha!