Open mklechan opened 6 years ago
Hmm. Seems like a way to blow up file sizes unnecessarily? Is my first reaction...
On Oct 21, 2017 12:31 AM, "Mark Chan" notifications@github.com wrote:
exports to csv where:
a
1.3
1
1
1
1
For h2o-3 parsing interoperability, we should have 1.0, not 1. Could we add decimal notation to all values in these columns?
View it on GitHub: https://github.com/Rdatatable/data.table/issues/2434
Yes, we sacrifice file size in return for better parsing and type detection of numeric columns. The h2o-3 parser scans the first few lines of a file to determine the type of each column. If the notation is consistent throughout the column, it will lead to fewer errors.
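To illustrate the concern, here is a toy sketch of prefix-based type guessing. It is not h2o-3's actual parser: `guess_type`, the sample size `n`, and both example columns are made up for illustration. It only shows how a guess based on the first few fields can go wrong when whole numbers are written without a decimal point.

```r
# Toy type guesser: looks only at the first `n` fields of a column.
# This is an illustration, NOT how h2o-3 is implemented.
guess_type <- function(fields, n = 3) {
  head_fields <- head(fields, n)
  # If every sampled field is a bare integer literal, guess "integer".
  if (all(grepl("^-?[0-9]+$", head_fields))) "integer" else "real"
}

# Hypothetical columns: the fractional value only appears after the sample.
mixed      <- c("1",   "1",   "1",   "1.3")
consistent <- c("1.0", "1.0", "1.0", "1.3")

guess_type(mixed)       # "integer": the later 1.3 contradicts the guess
guess_type(consistent)  # "real"
```

With consistent decimal notation, any sample of the column points to the same type.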
A column with values like:
DT = data.table(a = c(1.3,1.0,1.0,1.0,1.0))
exports to csv where:
a
1.3
1
1
1
1
For h2o-3 parsing interoperability, we should have 1.0, not 1. Could we add decimal notation to all values in these columns?
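In the meantime, one possible workaround on the user side is to convert plain double columns to text before writing, so that whole numbers carry a trailing `.0`. This is only a sketch: `with_decimal` is a hypothetical helper, not part of data.table, and it relies on `as.character()`'s default number formatting. Values that R renders in scientific notation, and `NA`/`Inf`/`NaN`, are left untouched.

```r
library(data.table)

DT <- data.table(a = c(1.3, 1.0, 1.0, 1.0, 1.0))

# Hypothetical helper: render doubles as text, appending ".0" to whole numbers.
with_decimal <- function(x) {
  s <- as.character(x)
  # Skip missing/non-finite values and anything already containing "." or an exponent.
  whole <- !is.na(x) & is.finite(x) & !grepl("[.eE]", s)
  s[whole] <- paste0(s[whole], ".0")
  s
}

out <- copy(DT)  # leave the original table unchanged
# Plain double columns only; classed doubles (Date, integer64, ...) are excluded.
num_cols <- names(out)[vapply(out, function(col) is.double(col) && !is.object(col), logical(1))]
out[, (num_cols) := lapply(.SD, with_decimal), .SDcols = num_cols]

csv_path <- tempfile(fileext = ".csv")
fwrite(out, csv_path, quote = FALSE)  # quote = FALSE keeps the text fields bare
readLines(csv_path)
```

For this `DT`, the file should contain `a`, `1.3`, and then four lines of `1.0`. The cost is the one raised above: a larger file, plus the extra conversion to character before writing.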