I have used randomisation for my data set however I am unsure how to calculate the correct p-value.

ROWLEY-12 commented 4 years ago

@bradduthie

Issue description

I have collected morphology data from two different plant species (Carpobrotus acinaciformis & C. edulis). I have used randomisation to test whether the measurements between species are different.

I have no prior expectation of which species should be larger/smaller and so wish to complete a two-tailed test (Nulll: Size does not differ between species)

I struggled when I came to the output. Is the method for calculating the p-value the same for one-sided and two-sided tests? When I compare the output from my code it differs from the output of a two-sided t-test of the same data.

a <- read.csv("~/Figsgalore/Morphology/Morphology (1).csv")

View(a)

#orig diff between mean fruit diameter from excel
0.254464 

#Looping with a

iter    <- 9999;    
diff    <- NULL;          
N       <- dim(a)[1];
for(i in 1:iter){   
  a_samp   <- sample(x = a[,1], size = N, replace = FALSE);
  samp_pink  <- which(a_samp == "Pink");
  samp_yellow   <- which(a_samp == "Yellow");
  mn_samp_p   <- mean(a[samp_pink, 2]);
  mn_samp_y   <- mean(a[samp_yellow, 2]);
  diff[i]     <- mn_samp_p - mn_samp_y;
}

View(diff)

hist(diff)

#adding arrow

obs_pink <- which(a[,1] == "Pink");
obs_yellow  <- which(a[,1] == "Yellow");
obs_diff  <- mean(a[obs_yellow,2]) - mean(a[obs_pink,2]); 
arrows(x0 = obs_diff, x1 = obs_diff, y0 = 500, y1 = 10, lwd = 3, length = 0.1);

#ISSUE - the below code refers to a 1 tailed t-test using "greater" as the equivalent in t.test
#        is this resolved by multiplying it by 2? 
#        proof my this idea below with "testers" etc. 
#pvalue

less_or_equal_obs <- sum(diff <= obs_diff) + 1;
total_generated   <- length(diff) + 1;
new_p_value       <- less_or_equal_obs / total_generated;

print(new_p_value)
#[1] 0.2667

#using t.test test 

t.test(a[,2]~a[,1],alternative = "two.sided")
#p-value = 0.5177

t.test(a[,2]~a[,1],alternative = "less")
#p-value = 0.7412

t.test(a[,2]~a[,1],alternative = "greater")
#p-value = 0.2588

bradduthie commented 4 years ago

Hi @ROWLEY-12 -- this looks okay to me overall. You said that you have no prior expectation of which species should be larger or smaller, and correctly intend to apply a two-tailed test. But I think you're missing something from the line assigning less_or_equal_obs.

less_or_equal_obs <- sum(diff <= obs_diff) + 1;

In the above, diff is calculated as the difference of Pink minus Yellow fruit diameters (mn_samp_p - mn_samp_y in the loop), but you have defined the observed difference as Yellow minus Pink (obs_diff <- mean(a[obs_yellow,2]) - mean(a[obs_pink,2])). To keep things consistent, it would be good to make it the same for both the observed and randomised differences (so just switch one of the two around, e.g., diff[i] <- mn_samp_y - mn_samp_p instead of diff[i] <- mn_samp_p - mn_samp_y).

My above suggestion does not technically matter though for a two-tailed test, since you don't care which direction the difference is in (i.e., whether pink or yellow is bigger). But the reason that there is an issue still is that to get the P-value, you have taken the sum of differences less than the observed difference rather than the sum of the absolute value of differences greater than the observed difference (sum(diff <= obs_diff) + 1). I think what you want is below.

less_or_equal_obs <- sum(abs(diff) >= abs(obs_diff)) + 1;

The abs function just returns the absolute value of all elements in diff. Before, you were testing a one-tailed hypothesis: that the difference Pink - Yellow was not greater than expected due to chance (i.e., that Pink is not greater than Yellow). Once you change the code a bit (could change to greater_or_equal_obs to reflect the different hypothesis -- see below), I think the P-values should make a bit more sense and will probably be somewhere around the value of the parametric two-tailed t-test (ca 0.5177).

Note that the P-values won't be identical, nor will they be exactly the same if you run the randomisation test multiple times!

To answer your question regarding p-value calculation more conceptually, here are ways that might help think about all of the one- and two-tailed tests you could do:

Null Hypothesis is that Pink is not greater than Yellow -- one-tailed test: The observed difference (obs_diff) is pink - yellow. What you want to know is if this difference is significantly higher than what you would expect if you were to recalculate pink - yellow each time you randomly shuffled flower colour identities (9999 times). The number of times which the randomised pink - yellow is greater than or equal to the observed difference (add one for the actual observation), divided by the total number of differences that you have (10000 = 9999 + 1) is the probability that pink - yellow is as large as it is in your observation obs_diff if there really is no difference between pink and yellow; i.e., your P-value.
Null Hypothesis is that Yellow is not greater than Pink -- one-tailed test: Repeat everything in 1, but swap 'Pink' and 'Yellow'.
Null Hypothesis is that the difference between Pink and Yellow is not greater than what you would expect by chance -- two-tailed test. The difference here is that we don't care about which is larger, so we only care about the magnitude (i.e., the absolute value) of the difference. Is this absolute difference greater than would be expected by chance. What you then need to do is look at the number of times the absolute value of the randomised pink - yellow (or yellow - pink; since we're taking the absolute value, we'll get the same answer either way) is greater than or equal to the observed absolute value (abs(obs_diff)).

bradduthie commented 4 years ago

Ah, one more thing @ROWLEY-12 -- note that in the notes, the hypothesis was that the mean of alive minus dead birds (alive - dead) was not significantly less than zero (i.e., dead birds aren't bigger than living birds), hence the reason for the code:

less_or_equal_obs <- sum(diff <= obs_diff) + 1;

It's looking at the total number of differences for which the randomised value is less than the observed value. Whereas I think your hypothesis is different (whether the difference between two flower measurements is greater than expected by chance, hence the inequality flip).

ROWLEY-12 commented 4 years ago

@bradduthie Thank you!

StirlingCodingClub / studyGroup

I have used randomisation for my data set however I am unsure how to calculate the correct p-value. #32

Issue description