[BUGZILLA #15971] Inconsistent treatment of character vectors with read.table or read.csv

Created attachment 1657: tiny csv file 1

I attach a tiny .csv file, na1.csv. I created na2.csv by editing out the first column of na1.csv. (I can only attach one file, but I have pasted the contents below.)

na1.csv
====================
a, b, c
1, "b", 1
2, "", 2
, "b", 3
4, , 4
5, "NA", 5
====================

na2.csv
====================
b, c
"b", 1
"", 2
"b", 3
, 4
"NA", 5
====================

Here is what I get when I read them into dataframes:

df1 <- read.csv("na1.csv")
df1

   a   b c
1  1   b 1
2  2     2
3 NA   b 3
4  4     4
5  5  NA 5

df2 <- read.csv("na2.csv")
df2

  b c
1 b 1
2   2
3 b 3
4   4
5   5

df1$b==df2$b

Error in Ops.factor(df1$b, df2$b) : level sets of factors are different
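The error above comes from comparing two factors whose level sets differ. As a minimal sketch (the vectors f1 and f2 are hypothetical stand-ins for df1$b and df2$b, not the actual columns), converting to character first allows an element-wise comparison and makes the mismatch visible:

```r
# Hypothetical stand-ins for df1$b and df2$b: same data, different whitespace.
f1 <- factor(c(" b", " ", " b"))
f2 <- factor(c("b", "", "b"))

# Comparing factors with different level sets raises an error; capture it.
res <- tryCatch(f1 == f2, error = function(e) e)

# Comparing the underlying strings works element by element.
same <- as.character(f1) == as.character(f2)
print(same)
```

This only sidesteps the comparison error; it does not explain why the two files yield different strings.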

levels(df1$b)

[1] " " " " " b" " NA"

levels(df2$b)

[1] "" " " "b"

If I read them with as.is=TRUE, I again get the extra spaces in df1$b. Also, again, df1$b[5] is " NA" rather than NA.
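For anyone trying to reproduce this without the attachment, here is a rough sketch that reads the na1.csv contents from a string via the text argument. The object names (na1, raw, stripped) are made up for illustration, and the file spacing is assumed to be exactly as pasted in this report. It also reads the data a second time with strip.white = TRUE, a documented read.table argument, purely for comparison; whether that is the intended way to handle such files is a separate question.

```r
# Contents of na1.csv as pasted in this report (spacing assumed as shown).
na1 <- 'a, b, c
1, "b", 1
2, "", 2
, "b", 3
4, , 4
5, "NA", 5'

# Default whitespace handling, keeping strings as character (as.is = TRUE).
raw <- read.csv(text = na1, as.is = TRUE)

# Same data, but asking read.csv to strip white space around fields.
stripped <- read.csv(text = na1, as.is = TRUE, strip.white = TRUE)

# dput() shows the exact strings, including any leading spaces.
dput(raw$b)
dput(stripped$b)
```

Comparing the two dput() results side by side shows which values differ only by leading white space.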

I can't see why this would be "correct" behavior. I apologize if I've missed something here.

Thanks for your great work on R!

Best regards,

Joe Ritter

METADATA

Bug author - Joe Ritter
Creation time - 2014-09-10 21:55:33 UTC
Bugzilla link
Status - NEW
Alias - None
Component - I/O
Version - R 3.1.1
Hardware - x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance - P5 major
Assignee - R-core
URL -
Modification time - 2020-02-28 01:53 UTC
