StirlingCodingClub / studyGroup

Gather together a group to skill-share, co-work, and create community
http://StirlingCodingClub.github.io/studyGroup/

Randomise pairwise elements of a matrix #24

Open SamPaplauskas opened 5 years ago

SamPaplauskas commented 5 years ago

Title

Randomise pairwise elements of a matrix

Issue description

I have a matrix of values which is a table of pairwise differences.

I want to compare these observed (obs) values to a set of randomly generated null values.

This tests whether my obs values are significant, rather than occurring by chance.

I want to keep the identity of the values and randomise the pairwise comparisons which produced them.

What I have tried

I tried the sample function, but I am worried that this creates a nonsensical set of null values.

Reproduce the problem

Create a matrix of pairwise differences, like my obs data matrix:

m.obs <- matrix(round(runif(16 * 16), digits = 2), ncol = 16, nrow = 16)
m.obs[lower.tri(m.obs)] <- m.obs[upper.tri(m.obs)]
diag(m.obs) <- NA

Only the lower triangle is needed, so this m.obs is effectively the same as:

m.obs[upper.tri(m.obs)] <- NA

Desired outcome

x <- 1:16

m.null <- m.obs

x.perm <- sample(x, replace = FALSE)

rownames(m.null) <- x.perm
colnames(m.null) <- x.perm

m.null

I want this, but with the values then reordered to match the randomised row and column names.
I tried the sort and order functions, but could not get them to do this.
bradduthie commented 5 years ago

Hi @SamPaplauskas -- I've recreated the problem with the code below, as you stated above

m.obs                   <- matrix( round( x = runif(16 * 16), digits = 2), 
                                   ncol = 16, nrow = 16 );
m.obs[lower.tri(m.obs)] <- m.obs[upper.tri(m.obs)];
diag(m.obs)             <- NA;
m.obs[upper.tri(m.obs)] <- NA;
x                       <- 1:16;
m.null                  <- m.obs;
x.perm                  <- sample(x, replace = FALSE);
rownames(m.null)        <- x.perm;
colnames(m.null)        <- x.perm;
print(m.null)

I think that the issue with the above is that the rownames and colnames functions only relabel the rows and columns; R does not actually reorder the matrix by those labels. Maybe try something like the below instead after the line defining x.perm.

m.null <- m.null[x.perm, x.perm];
print(m.null);
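
To see the difference between relabelling and reindexing on something small enough to read at a glance, here is a minimal sketch on a 3 x 3 matrix. The values and the fixed permutation `perm` are made up for illustration and are not taken from the data above.

```r
# Toy 3 x 3 matrix with a filled lower triangle (made-up values).
m <- matrix(NA_real_, nrow = 3, ncol = 3)
m[lower.tri(m)] <- c(0.1, 0.2, 0.3)   # fills [2,1], [3,1], [3,2]
perm <- c(3, 1, 2)                    # a fixed permutation, for illustration only

# Relabelling only: the names change, the values stay where they were.
renamed <- m
rownames(renamed) <- colnames(renamed) <- perm

# Reindexing: rows and columns are actually rearranged.
reindexed <- m[perm, perm]

print(renamed)
print(reindexed)
```

In `reindexed`, two of the three values end up above the diagonal, which is the situation dealt with next.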

Now you will see that some values get moved into the upper triangle. This isn't a problem; you just need to check whether a value has ended up in the upper triangle and, if so, swap its row and column indices. This can be done with a couple of for loops.

for(col in 1:16){
  for(row in 1:16){
    if(col < row && is.na(m.null[row, col]) == TRUE){
        m.null[row, col] <- m.null[col, row];
        m.null[col, row] <- NA;
    }
  }
}
print(m.null)

Verbally, what's going on with the for loop above is as follows. The code goes through each column (outer for loop) and, within each column, through each row (inner for loop), thereby looking at each individual element of the $16 \times 16$ matrix m.null. If the element is in the lower triangle (col < row) and equals NA (is.na(m.null[row, col]) == TRUE), then the value at the mirrored position in the upper triangle is copied into it (m.null[row, col] <- m.null[col, row];) and that upper-triangle position is set to NA (m.null[col, row] <- NA;). This should leave you with numbers in only the lower triangle, where rows and columns have been swapped from the original m.obs.
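
As a cross-check on the loop, here is a sketch of an alternative that should give the same result without loops: mirror the lower triangle into the upper one first, permute, then blank the upper triangle again. The names `m.loop`, `m.full`, `m.vec`, and `n`, and the seed value, are only for this illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable
n <- 16

# Build a lower-triangle matrix like m.obs (upper triangle and diagonal NA).
m.obs <- matrix(round(runif(n * n), digits = 2), nrow = n, ncol = n)
m.obs[!lower.tri(m.obs)] <- NA

x.perm <- sample(n)

# Loop approach described above.
m.loop <- m.obs[x.perm, x.perm]
for (col in 1:n) {
  for (row in 1:n) {
    if (col < row && is.na(m.loop[row, col])) {
      m.loop[row, col] <- m.loop[col, row]
      m.loop[col, row] <- NA
    }
  }
}

# Alternative: make the matrix symmetric, permute, then blank the upper triangle.
m.full <- m.obs
m.full[upper.tri(m.full)] <- t(m.obs)[upper.tri(m.obs)]
m.vec <- m.full[x.perm, x.perm]
m.vec[upper.tri(m.vec)] <- NA

# Both routes should agree, and the set of values should be unchanged.
print(identical(m.loop, m.vec))
```

Either way, the permuted matrix holds exactly the same values as m.obs; only the pairs they are attached to have changed.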

I hope I've understood this correctly?

SamPaplauskas commented 5 years ago

Thanks so much - I will try this out and let you know if it does the trick...

:)

jmcvw commented 5 years ago

Here's a slightly different take - it just does the same as Brad's, but I think is different enough that it might be interesting. I just wrapped it into a function.

The function


m_reorder <- function(m) {

  # --------------------------------------------------
  # this bit assigns random row / col names when the
  # input matrix has none; the if block can be deleted
  # if the input matrix will always have names
  x <- ncol(m)
  x.perm <- as.numeric(colnames(m))

  if (!length(x.perm)) {
    x.perm <- sample(x)
    rownames(m) <- colnames(m) <- x.perm
  }
  # --------------------------------------------------

  #### The business part ####

  # reorder by row / col names
  m <- m[order(x.perm), order(x.perm)]

  # find NA in lower triangle
  lower_na    <- which(lower.tri(m) & is.na(m), arr.ind = TRUE)

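  # put non-NAs from upper triangle into lower triangle NA spaces
  # (drop = FALSE keeps the index a two-column matrix even for a single row)
  m[lower_na] <- m[lower_na[, 2:1, drop = FALSE]]

  # fill the upper triangle and the diagonal with NAs
  m[!lower.tri(m)] <- NA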

  m
}

To use it

# matrix size
mdim <- 16

# create matrix
m.null <- matrix(round(runif(mdim^2), 2), mdim)

# run function
set.seed(1) # because the row / colnames are random and set within
m_reorder(m.null)
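
For the case in the original question, where the matrix already has a blanked upper triangle and shuffled names, a usage sketch might look like the one below. It repeats the function so that it runs on its own, with `drop = FALSE` added as a defensive tweak. The 6 x 6 size and the name `m.sorted` are assumptions for illustration.

```r
# Copy of m_reorder so this sketch is self-contained.
m_reorder <- function(m) {
  x <- ncol(m)
  x.perm <- as.numeric(colnames(m))
  if (!length(x.perm)) {
    x.perm <- sample(x)
    rownames(m) <- colnames(m) <- x.perm
  }
  # reorder by row / col names
  m <- m[order(x.perm), order(x.perm)]
  # move values that landed in the upper triangle into the lower triangle
  lower_na <- which(lower.tri(m) & is.na(m), arr.ind = TRUE)
  m[lower_na] <- m[lower_na[, 2:1, drop = FALSE]]
  # blank the upper triangle and the diagonal
  m[!lower.tri(m)] <- NA
  m
}

set.seed(1)
n <- 6  # small size, just so the printed result is easy to read

# Lower-triangle matrix of made-up differences.
m.obs <- matrix(round(runif(n * n), 2), n)
m.obs[!lower.tri(m.obs)] <- NA

# Attach shuffled names, as in the question.
m.null <- m.obs
x.perm <- sample(n)
rownames(m.null) <- colnames(m.null) <- x.perm

# Sort by the shuffled names and fold everything back into the lower triangle.
m.sorted <- m_reorder(m.null)
print(m.sorted)
```

The result should have names running 1 to 6, a full lower triangle, and the same set of values as m.obs.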