tylermorganwall / skpr

Generates and evaluates D, I, A, Alias, E, T, G, and custom optimal designs. Supports generation and evaluation of mixture and split/split-split/N-split plot designs. Includes parametric and Monte Carlo power evaluation functions. Provides a framework to evaluate power using functions provided in other packages or written by the user.
https://tylermorganwall.github.io/skpr/
GNU General Public License v3.0
115 stars 15 forks source link

Reordering CSV design results in difference in power. #62

Closed comatrion closed 3 years ago

comatrion commented 3 years ago

Reordering a blocked design that has been saved without row names results in a difference in power.

Here I generate a design, specifying to include a Block1 column. I write it out to CSV (without row names), read in and evaluate.

Next, I re-order the design, write to CSV (again without row names), read in and evaluate.

The evaluated power differs.

I was expecting the run order to not matter at all, as long as I kept the Block1 data together with the conditions.

library(skpr)

# Generating candidate set:
candidateset = expand.grid(X1 = seq(-1,1, length.out = 3), 
                           X2 = seq(-1,1, length.out = 3))

# Generating design for hard-to-change factors: 
design_htc = gen_design(candidateset = candidateset, 
                        model = ~X2, 
                        trials = 4)

# Generating design:
design_1 = gen_design(candidateset = candidateset, 
                    model = ~(X1 + X2), 
                    trials = 12, 
                    splitplotdesign = design_htc, 
                    blocksizes = 3,
                    add_blocking_columns=TRUE)

# Write out the design and read back, omitting row names since blocking structure is specified in the Block1 column
write.csv(design_1, "file_1.csv", row.names = FALSE)
design_1 <- read.csv(file = 'file_1.csv', head = TRUE, sep = ',')

# Evaluating Design:
ret_1 = eval_design(design = design_1,
                    model = ~(X1 + X2), 
                    alpha = 0.05, 
                    blocking = TRUE)
print(design_1)
print(ret_1)

# Randomise the run order
set.seed(1)
rows <- sample(nrow(design_1))
design_2 <- design_1[rows, ]

# Write out the design and read back, omitting row names since blocking structure is specified in the Block1 column
write.csv(design_2, "file_2.csv", row.names = FALSE)
design_2 <- read.csv(file = 'file_2.csv', head = TRUE, sep = ',')

# Evaluating Design:
ret_2 = eval_design(design = design_2,
                    model = ~(X1 + X2), 
                    alpha = 0.05, 
                    blocking = TRUE)
print(design_2)
print(ret_2)

output

Loading required package: shiny
   Block1 X2 X1
1       1  1 -1
2       1  1  1
3       1  1  1
4       2 -1 -1
5       2 -1 -1
6       2 -1  1
7       3 -1  1
8       3 -1  1
9       3 -1 -1
10      4  1 -1
11      4  1 -1
12      4  1  1
    parameter            type     power
1 (Intercept)    effect.power 0.1792554
2          X1    effect.power 0.8108324
3          X2    effect.power 0.1792554
4 (Intercept) parameter.power 0.1792554
5          X1 parameter.power 0.8108324
6          X2 parameter.power 0.1792554
============Evaluation Info============
• Alpha = 0.05 • Trials = 12 • Blocked = TRUE
• Evaluating Model = ~X1 + X2
• Anticipated Coefficients = c(1, 1, 1)
• Number of Blocks = Level 1: 4
• Variance Ratios  = Level 1: 1
   Block1 X2 X1
1       3 -1 -1
2       2 -1 -1
3       3 -1  1
4       1  1 -1
5       1  1  1
6       2 -1 -1
7       1  1  1
8       3 -1  1
9       2 -1  1
10      4  1 -1
11      4  1  1
12      4  1 -1
    parameter            type     power
1 (Intercept)    effect.power 0.2433037
2          X1    effect.power 0.3635765
3          X2    effect.power 0.4616802
4 (Intercept) parameter.power 0.2433037
5          X1 parameter.power 0.3635765
6          X2 parameter.power 0.4616802
============Evaluation Info============
• Alpha = 0.05 • Trials = 12 • Blocked = TRUE
• Evaluating Model = ~X1 + X2
• Anticipated Coefficients = c(1, 1, 1)
• Number of Blocks = Level 1: 4
• Variance Ratios  = Level 1: 1
GeorgeMKhoury commented 3 years ago

I can replicate.

Our code assumes that the runs are ordered by block (*), which to me makes sense, because the point of blocking is to indicate runs that are "together", usually by run order.

(*): assigning rownames works properly with any run ordering, but the zlist construction assumes block order.

Easiest way to handle this is in the documentation, to indicate the requirement for run ordering. Or, I suppose we could sort the runmatrix for the user before processing.

comatrion commented 3 years ago

Great, thanks. I also wondered if it could error if the blocks are out of order, wherever that matters and is assumed.

tylermorganwall commented 3 years ago

I've added a pull request that should fix this by sorting the blocks before analyzing the power.

comatrion commented 3 years ago

Thanks!