Open sz-cgt opened 5 years ago
To rephrase the issue: the `fread()` help isn't consistent with the function's actual default behaviour.
What the `fread()` help says: By default, ",," for columns of all types, including type 'character' is read as NA for consistency.
What `fread()` actually does: the default is `na.strings=getOption("datatable.na.strings","NA")`; when the option is not set, the `getOption` call returns its default value (the second argument), i.e. `"NA"`. Thus, a ",NA," sequence in a file is read as `<NA>`, while a ",," sequence is read as `""`, an empty string, contradicting the description of the default behaviour given in the `fread()` help.
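For anyone who wants to check this on their own installation, here is a minimal sketch rather than a definitive reproduction. The three-row input and the variable names are made up for illustration, and the default result may differ between data.table versions, so no output is shown.

```r
library(data.table)

# Hypothetical input: row 2 has an empty `name` field, row 3 holds a literal NA.
csv_text <- "id,name\n1,foo\n2,\n3,NA\n"

# Default call: na.strings falls back to getOption("datatable.na.strings", "NA").
dt_default <- fread(text = csv_text)

# Explicit empty-string marker, so that ",," is treated as missing.
dt_empty <- fread(text = csv_text, na.strings = "")

# Same setting supplied through the option that the default argument consults.
old_opts <- options(datatable.na.strings = "")
dt_option <- fread(text = csv_text)
options(old_opts)  # restore the previous option value

# Compare which rows of `name` each call reports as missing.
print(is.na(dt_default$name))
print(is.na(dt_empty$name))
print(is.na(dt_option$name))
```

Setting the option changes the default for every later `fread()` call in the session, so the sketch restores the previous value straight away.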
Possible solutions:
`na.strings=getOption("datatable.na.strings","")`, see #4288. It seems that this pull request fails a number of checks, so in the meantime one could...
Documentation for `fread()` claims that (see the `na.strings` parameter documentation):
However, that is not the case.
Created on 2019-03-02 by the reprex package (v0.2.1)
As you can see, the files have sequences of delimiters with no characters between them, but `fread()` is not returning them as `NA` values. Explicitly setting `na.strings = ""` produces the expected behaviour, but this too violates the documentation, which says this should produce the blank string behaviour instead (third line in the quote above).