ropensci / visdat

Preliminary Exploratory Visualisation of Data
https://docs.ropensci.org/visdat/

Inconsistent vis_expect Behavior With Empty Strings #123

Closed gfleetwood closed 2 years ago

gfleetwood commented 4 years ago

visdat::vis_expect displays different sets of results when testing for empty strings against a dataframe that contains empty strings versus one that does not.

In this example I get a chart and labels of True (50%), False (50%), and NA.

dat_test <- tibble::tribble(
  ~x, ~y,
  -1,  "",
  0,  "",
  1,  "",
  NA, NA
)

visdat::vis_expect(dat_test, ~.x == "")

And for this dataframe I get a chart and labels Present (100%) and No Expectations True.

dat_test <- tibble::tribble(
  ~x, ~y,
  -1,  "A",
  0,  "A",
  1,  "A",
  NA, NA
)

visdat::vis_expect(dat_test, ~.x == "")

The first result is clear and the second is confusing. For one, Present implies the condition is True. If the second result displayed True (0%), False (100%), and NA, that would be consistent and easily understandable.
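To make the expected numbers concrete, here is a minimal base R sketch that tabulates the same expectation by hand. It does not use visdat internals: `expect_summary` and `is_empty` are hypothetical helpers written for this illustration, and the sketch assumes the percentages are taken over non-missing cells only, which is what the 50%/50% legend in the first example suggests.

```r
# Hypothetical helper, not part of visdat: apply an element-wise predicate
# to every column and summarise the outcome over the non-missing cells.
expect_summary <- function(data, predicate) {
  # lapply() runs the predicate on each column; unlist() pools the results.
  results <- unlist(lapply(data, predicate), use.names = FALSE)
  n_checked <- sum(!is.na(results))
  c(
    true_pct  = 100 * sum(results, na.rm = TRUE) / n_checked,
    false_pct = 100 * sum(!results, na.rm = TRUE) / n_checked,
    n_na      = sum(is.na(results))
  )
}

# Same data as the two examples above, built as plain data frames.
df_blank <- data.frame(x = c(-1, 0, 1, NA), y = c("", "", "", NA),
                       stringsAsFactors = FALSE)
df_text  <- data.frame(x = c(-1, 0, 1, NA), y = c("A", "A", "A", NA),
                       stringsAsFactors = FALSE)

# Hypothetical predicate standing in for the formula ~.x == "".
is_empty <- function(col) col == ""

expect_summary(df_blank, is_empty)  # true_pct 50, false_pct 50, n_na 2
expect_summary(df_text, is_empty)   # true_pct 0, false_pct 100, n_na 2
```

Note that the numeric column `x` is coerced to character by `==`, so its non-missing cells are FALSE in both dataframes; only the `y` column differs between the two cases.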

njtierney commented 4 years ago

Hi there, thanks for posting this issue! Much appreciated.

I've added some images to clarify/confirm what we are both seeing:

library(visdat)

df_blank <- tibble::tribble(
  ~x, ~y,
  -1,  "",
  0,  "",
  1,  "",
  NA, NA
)

visdat::vis_expect(df_blank, ~.x == "")


df_text <- tibble::tribble(
  ~x, ~y,
  -1,  "A",
  0,  "A",
  1,  "A",
  NA, NA
)

visdat::vis_expect(df_text, ~.x == "")

Created on 2019-09-06 by the reprex package (v0.3.0)

njtierney commented 4 years ago

Just to clarify: the first plot is fine and the second plot isn't clear. Would a legend similar to the first one, like:

TRUE (0%) FALSE (100%) NA

be what you would expect? That's what I would expect, but I'd be interested in hearing your thoughts.

gfleetwood commented 4 years ago

Yes, we are seeing the same images, and yes, the legend "TRUE (0%) FALSE (100%) NA" is the one I'd expect.
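One way to get a legend like that even when no cell is TRUE is to tabulate the results against fixed TRUE/FALSE levels, so that a category with zero occurrences is still reported. The following is only a sketch of that idea in base R, not a description of how visdat builds its legend; the `results` vector is written out by hand for the second dataframe and the label format is an assumption modelled on the legend text quoted above.

```r
# Results of `.x == ""` on the second dataframe, written out by hand:
# six FALSE cells and two NA cells.
results <- c(FALSE, FALSE, FALSE, NA, FALSE, FALSE, FALSE, NA)

# Fixing the factor levels keeps TRUE in the table even though it never
# occurs; table() drops NA by default, so the counts cover non-missing cells.
counts <- table(factor(results, levels = c(TRUE, FALSE)))
pct <- 100 * as.vector(counts) / sum(counts)

# Hypothetical label format: "<level> (<percent>%)".
legend_labels <- sprintf("%s (%.0f%%)", names(counts), pct)
legend_labels  # "TRUE (0%)" "FALSE (100%)"
```

Without the fixed levels, a plain `table(results)` would list only FALSE, which is the kind of dropped category that makes the two legends look inconsistent.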

njtierney commented 4 years ago

OK fantastic, thanks for confirming that! :)

I won't be able to get to this for a while, but now I know what the expected output is I can work it out.

Thank you again for taking the time to report the issue and make a minimal example!

gfleetwood commented 4 years ago

No problem. Thanks for looking into it.