J535D165 closed 7 years ago
Another option would be to transform data directly into numeric/integer. Any preferences?
Thanks for the reply!
I like the idea of storing the data in the way it was received. No changes to the datatypes.
Maybe we can import the columns with read.csv without setting colClasses="character" for everything, and force the character type only for the columns that need denormalization. Something like:
meta.names <- names(meta)
meta.types <- rep("character", times = length(meta.names))
names(meta.types) <- meta.names
read.csv('path_to_data.csv', strip.white = TRUE, colClasses = meta.types)
The warning it generates for non-existent columns can be suppressed.
I don't know what to do with the 'denormalization' columns themselves. They remain character typed.
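To make the idea concrete, here is a small self-contained sketch, not tested against the package itself. The `meta` list, the column names and the sample rows are made up for illustration; only `read.csv`, `colClasses`, `strip.white` and `suppressWarnings` are real base R.

```r
# Hypothetical metadata: one entry per column that needs denormalization.
# "Region" is deliberately absent from the CSV to trigger the warning.
meta <- list(Gender = "x", Period = "x", Region = "x")

# Made-up sample data with stray whitespace around the values.
path <- tempfile(fileext = ".csv")
writeLines(c("Gender,Period,Value",
             "2001  ,2010JJ00,  12",
             "2002  ,2011JJ00,  15"), path)

# Named vector: force "character" only for the columns listed in meta.
meta.types <- rep("character", times = length(meta))
names(meta.types) <- names(meta)

# Columns not named in colClasses are type-guessed as usual.
# suppressWarnings() hides the "not all columns named in 'colClasses' exist"
# warning, but note that it hides every other warning from the call as well.
dat <- suppressWarnings(
  read.csv(path, strip.white = TRUE, colClasses = meta.types)
)

str(dat)
```

Here Gender and Period stay character with the padding stripped, while Value is left to read.csv to convert.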
It seems to be a good idea to strip whitespace. See the output below
Maybe add strip.white=TRUE to line https://github.com/edwindj/cbsodataR/blob/master/R/get-data.R#L31? Not tested.