MUCollective / multiverse

R package for creating explorable multiverse analysis
https://mucollective.github.io/multiverse/
GNU General Public License v3.0

I got excited and drafted something (mostly to see if I could use the API) --- #42

Closed ntaback closed 5 years ago

ntaback commented 5 years ago

I got excited and drafted something (mostly to see if I could use the API) ---

M = multiverse()

inside(M, {
  # using tidyeval to do this with `!!!` though I don't think we should allow it
  # see issue #36
  set.seed(branch(seed, !!!1:100))

  x1 <- rnorm(100)
  x2 <- rnorm(100)
  y <- x1 + x2 + runif(100)

  m <- lm(y ~ x1 + x2)

  intervals <- broom::tidy(m, conf.int = TRUE)
})

execute_multiverse(M)

Each universe now contains (amongst everything else) a three-row table called intervals, whose columns include term, estimate, conf.low, and conf.high. So if we unnest those tables from the universes, we get a long-format table where the seed parameter indexes the universe, and we can plot them all:

M %>%
  multiverse_table() %>%
  unnest(map(.results, "intervals"), .drop = FALSE) %>%
  unnest(seed) %>%  # won't be necessary once issue #34 is fixed
  ggplot(aes(x = seed, y = estimate, ymin = conf.low, ymax = conf.high)) +
  geom_pointrange() +
  facet_grid(. ~ term) +
  labs(x = "Universe (i.e. the seed parameter)", y = "Model Parameter Values", title = "Estimates with 95% confidence intervals")

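[image: point-range plot titled "Estimates with 95% confidence intervals", one panel per term, with the seed parameter on the x axis]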
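To make the unnesting step concrete without depending on the package API, here is a minimal base R sketch of the same idea. It is not the package's implementation: the names `seeds`, `results`, and `long` are hypothetical, and `confint()` stands in for `broom::tidy(m, conf.int = TRUE)`.

```r
# One small table of estimates per seed, mimicking the per-universe `intervals`.
seeds <- 1:3
results <- lapply(seeds, function(s) {
  set.seed(s)
  x1 <- rnorm(100)
  x2 <- rnorm(100)
  y <- x1 + x2 + runif(100)
  m <- lm(y ~ x1 + x2)
  ci <- confint(m)  # 95% intervals by default
  data.frame(
    term = rownames(ci),
    estimate = unname(coef(m)),
    conf.low = unname(ci[, 1]),
    conf.high = unname(ci[, 2]),
    stringsAsFactors = FALSE
  )
})

# "Unnest": attach each seed to every row of its table, then stack the tables.
long <- do.call(
  rbind,
  Map(function(s, tbl) cbind(seed = s, tbl), seeds, results)
)
```

The result has one row per (seed, term) pair, which is the shape the ggplot call above expects.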

Some issues cropped up:

  1. I used a sort of hack with tidy-eval and set.seed to get different universes. I'm not sure this approach is the right one --- I think we shouldn't be tidy-eval-ing the universe code by default since I'm pretty sure it will break other people's tidy-eval code if we do it (see #36). So perhaps we need a syntax for letting people pass options as a list? That would make the tidy-eval usage here unnecessary.

  2. This definitely convinces me that parameter values in the multiverse table should not be list columns, because I had to add an extra unnest call to get this to work that I don't think should need to be there (see #34).

  3. I happened to do this in a way where each universe gets a unique number determined by the seed column, but this vis would have been harder to construct if that weren't the case (and with multiverses with more than one parameter, that generally would not be the case). This suggests it might be useful for us to add something like a ".universe" column to the multiverse table that always has a unique identifier for each universe in it.
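A rough illustration of the third point, in base R rather than the package itself: with more than one parameter, no single parameter column identifies a universe, so an explicit index column helps. The grid below and the parameter names `seed` and `model` are hypothetical, and `.universe` is only the column name proposed above, not an existing feature.

```r
# Hypothetical two-parameter grid standing in for a multiverse table.
params <- expand.grid(
  seed = 1:3,
  model = c("lm", "rlm"),
  stringsAsFactors = FALSE
)

# `seed` alone now repeats across rows, so it cannot index universes.
seed_is_unique <- anyDuplicated(params$seed) == 0

# Add one identifier per row (i.e. per universe).
params$.universe <- seq_len(nrow(params))
```

A plot could then map `.universe` to the x axis regardless of how many parameters the multiverse has.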

Originally posted by @mjskay in https://github.com/MUCollective/multidy/issues/33#issuecomment-513941426

ntaback commented 5 years ago

If I run the code above but forget to load the tidyverse library, I get this cryptic error instead of something like Error in XX : "could not find function".

[screenshot of the error message, 2019-08-01]

abhsarma commented 5 years ago

Ah, that's a documentation problem. Turns out I had forgotten to import the filter function into the NAMESPACE. I wonder how it passed Travis. I'll fix it in my next commit.
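For reference, a fix of this kind is typically a one-line roxygen2 tag, sketched below. This assumes the package generates its NAMESPACE with roxygen2 and that the missing function is `dplyr::filter`; the thread confirms neither.

```r
# In any file under R/ of the package (assumption: roxygen2 manages NAMESPACE).
# Running devtools::document() then writes `importFrom(dplyr, filter)` to NAMESPACE.
#' @importFrom dplyr filter
NULL
```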

ntaback commented 5 years ago

Weird that it passed Travis?

abhsarma commented 5 years ago

Yeah, since it is supposed to run R CMD check, which identifies all the missing NAMESPACE dependencies.

mjskay commented 5 years ago

Looks like those namespace issues are only reported as NOTEs in the R CMD check output (under "Checking package"): https://travis-ci.com/MUCollective/multidy

Travis builds only fail on ERRORs and WARNINGs. As a result, you pretty much have to check the Travis output periodically and clear out NOTEs as well (unless they are things that can wait to be fixed later).
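One way to avoid checking by hand is to make the check itself fail on NOTEs. The following is a sketch of a CI step, not something this repository is known to use: it assumes the rcmdcheck package is installed and that the working directory is the package root.

```r
# Fail the build on NOTEs as well as WARNINGs and ERRORs.
# Assumption: run from the package root in a CI script step.
rcmdcheck::rcmdcheck(
  path = ".",
  args = "--no-manual",
  error_on = "note"
)
```

The trade-off is that NOTEs which can wait would also break the build, so this suits packages that aim to keep the check output completely clean.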