edwindj / daff

Diff, patch and merge for data.frames, see http://paulfitz.github.io/daff/
https://edwindj.github.io/daff/

Daff doesn't correctly detect added columns that share the same name #16

Closed: gwarnes-mdsol closed this issue 7 years ago

gwarnes-mdsol commented 7 years ago

Example:

> iris2 <- data.frame(iris, iris, iris, check.names = FALSE)
> d <- diff_data(iris, iris2)
> d
Daff Comparison: ‘iris’ vs. ‘iris2’ 
  First 6 and last 6 patch lines:
      !       ---
1    @@   Species
2   +++      <NA>
3   +++      <NA>
4   +++      <NA>
5   +++      <NA>
6   +++      <NA>
... ...       ...
296 --- virginica
297 --- virginica
298 --- virginica
299 --- virginica
300 --- virginica
301 --- virginica

> summary(d)

Data diff:
 Comparison: ‘iris’ vs. ‘iris2’ 
        #        Modified Reordered Deleted Added
Rows    150      0        0         150     150  
Columns 5 --> 15 0        0         5       0    

Proposed solution: diff_data should check for duplicated column names. If any are found, it should call make.unique on them and issue a warning.
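
A minimal sketch of the proposed check, written as a standalone wrapper rather than as a change inside daff itself. The helper name dedupe_names is hypothetical and is not part of the daff API; it only uses base R (anyDuplicated, make.unique, warning) and would be applied to each data frame before it is passed to diff_data.

```r
# Hypothetical helper (not part of daff): rename duplicated columns
# with make.unique() and warn the caller that this happened.
dedupe_names <- function(df) {
  nms <- names(df)
  if (anyDuplicated(nms) > 0) {
    warning("Duplicated column names found; renaming with make.unique()",
            call. = FALSE)
    names(df) <- make.unique(nms)
  }
  df
}

# Same construction as in the example: 15 columns, each name repeated 3 times.
iris2 <- data.frame(iris, iris, iris, check.names = FALSE)

# Emits the warning and returns a copy with unique column names.
iris2u <- dedupe_names(iris2)
names(iris2u)[1:7]

# Assumed usage with daff (left commented out so the sketch needs only base R):
# d <- daff::diff_data(dedupe_names(iris), iris2u)
```

With unique names, the repeated columns become Sepal.Length.1, Sepal.Length.2 and so on, so they can no longer be confused with the originals. Whether diff_data then reports them as added columns is not verified here.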