alexsanjoseph / compareDF

R Tool to compare two data.frames

[minor enhancement] suggested rewording for missing column error for clarity #32

Closed D3SL closed 4 years ago

D3SL commented 4 years ago

When trying to compare two data frames where at least one column is missing, the error message given is Error: Can't find column '<Missing column name>' in '.data'.

In my case I had a column named "Type" in old_df (made by subsetting a parent set by type) that wasn't in some newly produced data, which printed the error Error: Can't find column 'Type' in '.data'. I wound up misreading this as referring not to a column named "Type" but to some attribute of the columns themselves (i.e. their "type").

Tweaking the error message slightly would greatly reduce the potential for misunderstandings, particularly among people who aren't fluent in English:

If possible: Error: Column named '<Missing column name>' is found in 'df1' but not in 'df2'

Or, if it's too much work to name the frames like that: Error: Column named '<Missing column name>' is missing from one of the chosen objects
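A minimal sketch of what such a check could look like, in base R. The helper `check_same_columns` and its arguments are hypothetical names made up for illustration; this is not compareDF's actual code, only one way the proposed wording could be produced.

```r
# Hypothetical helper, not part of compareDF: stops with a message that says
# which frame each unmatched column was found in.
check_same_columns <- function(df_new, df_old,
                               new_name = "df_new", old_name = "df_old") {
  only_new <- setdiff(names(df_new), names(df_old))  # columns only in df_new
  only_old <- setdiff(names(df_old), names(df_new))  # columns only in df_old

  msgs <- character(0)
  if (length(only_new) > 0) {
    msgs <- c(msgs, sprintf("Column named '%s' is found in '%s' but not in '%s'",
                            only_new, new_name, old_name))
  }
  if (length(only_old) > 0) {
    msgs <- c(msgs, sprintf("Column named '%s' is found in '%s' but not in '%s'",
                            only_old, old_name, new_name))
  }
  if (length(msgs) > 0) {
    stop(paste(msgs, collapse = "\n"), call. = FALSE)
  }
  invisible(TRUE)
}

# Example data mirroring the situation described above.
old_df <- data.frame(colA = c("a", "b"), colB = 1:2, Type = c("foo", "bar"))
new_df <- data.frame(colA = c("a", "b"), colB = 1:2)

# Capture the error message instead of aborting the script.
result <- tryCatch(
  check_same_columns(new_df, old_df, "new_df", "old_df"),
  error = function(e) conditionMessage(e)
)
print(result)
# [1] "Column named 'Type' is found in 'old_df' but not in 'new_df'"
```

Quoting the column name and naming both frames makes it harder to misread 'Type' as a property of the columns rather than a column name.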

alexsanjoseph commented 4 years ago

I think the ^ formatting got screwed up because of some GitHub issues.

Can you give a representative example?

Good suggestions - I can fix this

D3SL commented 4 years ago

Yeah, they're not escaping backticks properly, and trying to string a bunch of them together to get them to show is irritating. I just replaced them with single quotes.

Here's a really simple reprex. Just run compare_df on these two and it'll give you the error that the third column from testtibble1 (named "Type") is missing.

testtibble1<-tribble(
  ~colA, ~colB, ~Type,
  "a",   1, "foo",
  "b",   2, "bar",
  "c",   3, "baz"
)

testtibble2<-tribble(
  ~colA, ~colB,
  "a",   1,
  "b",   2,
  "c",   3
)

alexsanjoseph commented 4 years ago

Thanks - Will add this for future

alexsanjoseph commented 4 years ago

@D3SL

Can you verify your reprex again? On running, I'm getting a different error:

testtibble1<-tribble(
  ~colA, ~colB, ~Type,
  "a",   1, 'foo',
  "b",   2, 'bar',
  "c",   3, 'baz'
)

testtibble2<-tribble(
  ~colA, ~colB,
  "a",   1,
  "b",   2,
  "c",   3
)

> compare_df(testtibble1, testtibble2, group_col = c("Type"))
---------------------
Error in check_if_comparable(both_tables$df_new, both_tables$df_old, group_col,  : 
  The two data frames have different columns!

alexsanjoseph commented 4 years ago

Closing since I'm unable to reproduce the error. Please reopen if necessary

D3SL commented 4 years ago

You're right, it must be something local. I can't reproduce it on a second machine, but I can consistently get it to look like this screenshot on my work computer.
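One way to narrow down a machine-specific difference like this is to compare installed package versions on both machines. A minimal sketch follows; the list of packages to check is an assumption for illustration, since nothing in this thread establishes which package is responsible.

```r
# Packages to inspect. This selection is an assumption, not a confirmed cause.
pkgs <- c("compareDF", "dplyr", "rlang", "tibble")

# Return the installed version of each package, or NA if it is not installed.
versions <- vapply(
  pkgs,
  function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  },
  character(1)
)

print(versions)
```

Running this on both machines and comparing the output shows whether the two setups differ in any of these packages.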