I wanted a variant of tea.R where a more causal interpretation could be offered (there is a kind of causation going on here, as discussed below). The code here is a bit more general (and a bit shorter), as I put the guesses as a vector in one column.
library(tidyverse)
library(utils) # For combn function
n_cups <- 8L
# Can choose any n_cups/2 cups to have "milk first"
# The code in the book assumes (wlog) that these are 1, 2, 3, 4.
milk_first <- sample(n_cups, n_cups/2L)
milk_first
#> [1] 3 6 2 8
n_guesses <- length(milk_first)
guesses <-
tibble(guesses = combn(n_cups, n_guesses, simplify = FALSE)) %>%
rowwise() %>%
mutate(correct = setequal(guesses, milk_first))
# Look at typical guesses. These are vectors.
guesses$guesses[1]
#> [[1]]
#> [1] 1 2 3 4
guesses$guesses[67]
#> [[1]]
#> [1] 4 5 6 8
sum(guesses$correct)
#> [1] 1
p_value <- mean(guesses$correct)
p_value
#> [1] 0.01428571
guesses %>% filter(correct) %>% select(guesses) %>% pull()
#> [[1]]
#> [1] 2 3 6 8
# Now suppose that protocol is changed. For each cup, Fisher tosses a coin:
# if heads, he puts in milk first; if tails, he puts in tea first.
# Muriel Bristol then sips each of the cups and declares either "milk first" or
# "tea first".
#
# With this protocol, it's easier to see a causal interpretation. Label the
# milk-first cases as treatment; tea-first cases as controls. The question is
# whether treatment leads to the outcome of "milk first". One has a sample of
# n_cups independent observations and can use the regular framework. But here,
# we just assume that Bristol guessed all correctly and calculate exact p-value
# for that.
#
# (Note that one can apply a similar causal interpretation to the original
# protocol, but one couldn't apply the standard framework to it;
# no SUTVA for one, as Bristol will declear four and only four cases as
# "tea first".
#
# Alternatively, if Fisher did not tell Bristol that four
# cups are milk-first and four are tea-first, then he could use the same
# standard framework, as the guesses would become independent again under
# the sharp null that putting tea or milk first does not affect Bristol's
# guess.)
milk_first <- which(sample(c(TRUE, FALSE), n_cups, replace=TRUE))
milk_first
#> [1] 2 7
guess_n <- function(x) combn(n_cups, x, simplify = FALSE)
guesses <-
tibble(guesses = unlist(lapply(0:n_cups, guess_n), recursive = FALSE)) %>%
rowwise() %>%
mutate(correct = setequal(guesses, milk_first))
sum(guesses$correct)
#> [1] 1
p_value <- mean(guesses$correct)
p_value
#> [1] 0.00390625
guesses %>% filter(correct) %>% select(guesses) %>% pull()
#> [[1]]
#> [1] 2 7
This is just FYI.
I wanted a variant of
tea.R
where a more causal interpretation could be offered (there is a kind of causation going on here, as discussed below). The code here is a bit more general (and a bit shorter), as I put the guesses as a vector in one column.Created on 2021-02-27 by the reprex package (v1.0.0)