vegawidget / virgo

A DSL for layered grammar of interactive graphics in R
https://vegawidget.github.io/virgo/

Missing values #24

Closed earowang closed 3 years ago

earowang commented 3 years ago

Should we keep handling missing values the way we do now, or handle them differently?

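library(dplyr)
library(virgo)
# Read the movies data; both rating columns contain missing values
movies <- jsonlite::read_json(
  "https://vega.github.io/vega-editor/app/data/movies.json",
  simplifyVector = TRUE
)
# Flag rows where either rating is missing
movies <- movies %>%
  mutate(missing = is.na(IMDB_Rating) | is.na(Rotten_Tomatoes_Rating))
# invalid = NULL corresponds to Vega-Lite's `invalid: null`, which keeps
# rows with missing values instead of filtering them out
movies %>%
  vega(enc(IMDB_Rating, Rotten_Tomatoes_Rating, colour = missing)) %>%
  mark_point() %>%
  config(mark = list(invalid = NULL))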
sa-lee commented 3 years ago

The default is to filter, which I think is reasonable. Another problem happens in the JSON serialisation when we create the spec from vegawidget. Here's the example from #27: the current serialisation doesn't include the body mass column for a row where it's NA in R:

[Screenshot: Screen Shot 2021-01-29 at 11 01 21 am]

Because the missing value isn't explicit, Vega gives NaNs for the density estimate for Adelie / Gentoo. If we change that to an explicit null for Adelie, its area shows up:

[Screenshot: Screen Shot 2021-01-29 at 11 01 04 am]
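
For reference, the difference between an omitted field and an explicit null can be reproduced with jsonlite on its own. This is only a sketch: it assumes the spec data goes through something like jsonlite::toJSON() with default settings, and penguins_toy is a made-up stand-in for the data in #27.

```r
library(jsonlite)

# Toy stand-in for the penguins data: the Adelie row has a missing body mass.
penguins_toy <- data.frame(
  species = c("Adelie", "Gentoo"),
  body_mass_g = c(NA, 5000)
)

# Default row-wise serialisation: the NA field is left out of the Adelie
# record, so the missing value is not explicit in the JSON.
json_default <- toJSON(penguins_toy)

# na = "null": the field is kept and written as an explicit JSON null.
json_null <- toJSON(penguins_toy, na = "null")

print(json_default)
print(json_null)
```

Whether vegawidget should pass na = "null" itself, or virgo should make the nulls explicit before handing the data over, is a separate design question.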
earowang commented 3 years ago

A new argument, na.rm = TRUE, has been added to mark_*(); it gives a warning when missing values are present.
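
As a rough illustration of how an na.rm argument can drive that kind of warning, here is a minimal base R sketch. drop_missing() is a hypothetical helper, not virgo's implementation, and it assumes the ggplot2 convention (na.rm = FALSE removes missing rows with a warning, na.rm = TRUE removes them silently), which may differ from what mark_*() actually does.

```r
# Hypothetical helper, not part of virgo: drop rows that have missing values
# in the encoded columns, warning about the removal unless na.rm = TRUE.
drop_missing <- function(data, vars, na.rm = FALSE) {
  # TRUE for rows with no NA in any of the selected columns
  keep <- stats::complete.cases(data[, vars, drop = FALSE])
  n_missing <- sum(!keep)
  if (n_missing > 0 && !na.rm) {
    warning(
      "Removed ", n_missing, " rows containing missing values.",
      call. = FALSE
    )
  }
  data[keep, , drop = FALSE]
}

ratings <- data.frame(x = c(1, NA, 3), y = c(1, 2, NA))
drop_missing(ratings, c("x", "y"), na.rm = TRUE)  # silent
```

Calling drop_missing(ratings, c("x", "y")) without na.rm = TRUE returns the same rows but raises the warning.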