renozao / FAQR

Frequently Asked Questions on R: my personal Ask Just Once system for my friends' R problems...

Merge in R #1

Open renozao opened 11 years ago

renozao commented 11 years ago

I want to merge two data.frames by their row.names. When I do so I get an extra column named "Row.names":

> x
  [,1] [,2] [,3]
a    1    3    5
b    2    4    6
> y
    [,1] [,2] [,3]
aaa   11   13   15
b     12   14   16
> merge(x, y, by=0, all=TRUE)
  Row.names V1.x V2.x V3.x V1.y V2.y V3.y
1         a    1    3    5   NA   NA   NA
2       aaa   NA   NA   NA   11   13   15
3         b    2    4    6   12   14   16

Is there an easy way to make merge keep the row.names as row.names rather than as a new column?

I did this in a naïve way. I'm sure there is a way to do it inside the merge function, but I didn't find it on the web.

> new_mat=merge(x, y, by=0, all=TRUE)
> rows = new_mat[,1]
> row.names(new_mat)= rows
> new_mat = new_mat[,2:length(colnames(new_mat))]
> new_mat
    V1.x V2.x V3.x V1.y V2.y V3.y
a      1    3    5   NA   NA   NA
aaa   NA   NA   NA   11   13   15
b      2    4    6   12   14   16

Thanks, Rachelly.

renozao commented 11 years ago

I don't use merge much, but I think the extra column is created because a data.frame cannot have duplicated row names, and one still wants to be able to track the original row names. I guess that in your case you expect at most a one-to-one relationship between rows in x and y. The only improvement I can see is to slightly shorten your code by using a negative index to remove the first column, instead of passing the vector of column indices to keep:

new_mat <- merge(x, y, by=0, all=TRUE)
rownames(new_mat) <- as.character(new_mat[,1])
new_mat <- new_mat[,-1]
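
If you need this often, the three lines above could be wrapped in a small helper. This is only a sketch: `merge_by_rownames` is a name I made up (it is not part of base R), and the duplicate check is a defensive guard I added, on the assumption that you would rather get an error than silently broken row names. The `x` and `y` below are rebuilt as matrices to match the printed output in the question.

```r
# Hypothetical helper (not part of base R): merge by row names, then move
# the "Row.names" column that merge() creates back into the row names.
merge_by_rownames <- function(x, y, ...) {
  merged <- merge(x, y, by = 0, ...)
  keys <- as.character(merged$Row.names)
  # Defensive check: data.frame row names must be unique.
  if (anyDuplicated(keys)) {
    stop("merged row names are not unique; keep the 'Row.names' column instead")
  }
  rownames(merged) <- keys
  # Drop the first column; drop = FALSE keeps a data.frame even if a
  # single column remains.
  merged[, -1, drop = FALSE]
}

# Example data reconstructed from the question (assumed to be matrices).
x <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))
y <- matrix(11:16, nrow = 2, dimnames = list(c("aaa", "b"), NULL))

new_mat <- merge_by_rownames(x, y, all = TRUE)
print(new_mat)
```

Extra arguments such as `all = TRUE` are passed straight through to `merge`, so the helper behaves like the original call apart from the row names.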