DidierMurilloF / FielDHub

FielDHub is an R Shiny design of experiments (DOE) app that aids in the creation of traditional, unreplicated, augmented and partially replicated (p-rep) designs applied to agriculture, plant breeding, forestry, animal and biological sciences.
https://didiermurillof.github.io/FielDHub/

order of replicates in partially_replicated #37

Closed GregorDall closed 1 year ago

GregorDall commented 1 year ago

Dear Didier,

I am trying to generate a number of designs using FielDHub::partially_replicated. I would like to control where in the genotype list the replicated entries appear, i.e. have them at the end or in the middle of the list, but the function seems to order the genotype IDs by replication and put the higher replications first. Do you see a possibility to circumvent this?

Best Gregor

# replicated entries first
mydes1 <- partially_replicated(nrows = 6, ncols = 6, repGens = c(6,24), repUnits = c(2,1))
mydes1$dataEntry

# replicated entries last
mydes2 <- partially_replicated(nrows = 6, ncols = 6, repGens = c(24,6), repUnits = c(1,2))
mydes2$dataEntry

DidierMurilloF commented 1 year ago

Gregor,

I think the easiest solution here is to pass a data frame with the entry list through the argument data. The entry list can be an Excel file that you read into R, or a data frame created directly in R. When a data frame is passed through the argument data, the function partially_replicated() uses the entries in exactly the order they appear in the data.

For example, let us create a data frame where we want to replicate the first six entries:

# Replicated entries first
data_first <- data.frame(
    ENTRY = 1:30, 
    NAME = paste0("G", 1:30), 
    REPS = c(rep(c(2,1),c(6,24)))
    )

Then, create the randomization by passing the data frame via data = data_first:

library(FielDHub)
mydes1 <- partially_replicated(nrows = 6, ncols = 6,
                               repGens = c(6,24),
                               repUnits = c(2,1), 
                               data = data_first)
mydes1$dataEntry
plot(mydes1)

Now, similar to the first case, suppose we want to replicate the last six entries:

# Replicated entries last
data_last <- data.frame(
    ENTRY = 1:30, 
    NAME = paste0("G", 1:30), 
    REPS = c(rep(c(1,2),c(24,6)))
    )

Finally, get the randomization by passing data = data_last:

mydes2 <- partially_replicated(nrows = 6, ncols = 6,
                               repGens = c(24,6),
                               repUnits = c(1,2), 
                               data = data_last)
mydes2$dataEntry
plot(mydes2)

Even though this approach works well, it becomes tedious when many different replication settings are needed. I will take a look at the code that generates the entry list and make sure it follows the replication order the user specifies.
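The entry list for any replication layout can also be generated programmatically instead of being typed by hand. The sketch below uses only base R. Note that make_entry_list() is a hypothetical helper, not part of FielDHub, and it assumes the ENTRY/NAME/REPS column layout used in the examples above.

```r
# Hypothetical helper (not part of FielDHub): build an entry list whose
# replication pattern follows the order given in rep_gens / rep_units.
make_entry_list <- function(rep_gens, rep_units, prefix = "G") {
  stopifnot(length(rep_gens) == length(rep_units))
  n <- sum(rep_gens)
  data.frame(
    ENTRY = seq_len(n),
    NAME  = paste0(prefix, seq_len(n)),
    REPS  = rep(rep_units, times = rep_gens)
  )
}

# Replicated entries in the middle: G19-G24 get two plots each
data_middle <- make_entry_list(rep_gens = c(18, 6, 6), rep_units = c(1, 2, 1))
data_middle[17:26, ]

# The result could then be passed on as in the examples above, e.g.
# partially_replicated(nrows = 6, ncols = 6, data = data_middle)
```

With a helper like this, changing the position of the replicated block only means changing the two vectors.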

I hope this works for you.

Thanks for using FielDHub

Best, Didier

GregorDall commented 1 year ago

Dear Didier,

this seems to work for me. Do I still have to pass the arguments repGens and repUnits if I supply this information via the data argument? The results seem to be the same if I omit them.

I will go forward with this solution and let you know if any issues come up. Thank you for the quick response.

Best, Gregor

DidierMurilloF commented 1 year ago

Gregor,

If you supply that information via data, you no longer need to pass the arguments repGens and repUnits.
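One way to see why the two arguments become redundant is that the REPS column already encodes the same information. The following is a minimal base R sketch, assuming the REPS column layout from the examples in this thread. It does not show how FielDHub reads the data internally, and the variable names are illustrative.

```r
# Entry list with the last six entries replicated twice
data_last <- data.frame(
  ENTRY = 1:30,
  NAME  = paste0("G", 1:30),
  REPS  = rep(c(1, 2), c(24, 6))
)

# Run-length encoding of REPS recovers the two vectors
runs <- rle(data_last$REPS)
rep_gens  <- runs$lengths  # number of entries in each consecutive group
rep_units <- runs$values   # plots per entry in each group

# Total number of plots implied by the entry list
total_plots <- sum(data_last$REPS)
```

Here rep_gens is 24, 6 and rep_units is 1, 2, and total_plots is 36, which matches the 6 x 6 field used in the examples.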

library(FielDHub)
# Replicated entries first
data_first <- data.frame(
    ENTRY = 1:30, 
    NAME = paste0("G", 1:30), 
    REPS = c(rep(c(2,1),c(6,24)))
    )
mydes1 <- partially_replicated(nrows = 6, 
                               ncols = 6,
                               data = data_first, 
                               seed = 17)
mydes1$dataEntry
plot(mydes1)

# Replicated entries last
data_last <- data.frame(
    ENTRY = 1:30, 
    NAME = paste0("G", 1:30), 
    REPS = c(rep(c(1,2),c(24,6)))
    )
mydes2 <- partially_replicated(nrows = 6, 
                               ncols = 6,
                               data = data_last, 
                               seed = 35)
mydes2$dataEntry
plot(mydes2)

GregorDall commented 1 year ago

Hi Didier,

this fails to work if I extend to multilocation:

gens <- 30
locs <- 5

# p-rep multi location
mydata <- data.frame(
    LOCATION = rep(c(1:locs),c(rep(gens,locs))),
    ENTRY = rep(1:gens,locs),
    NAME = rep(paste0("G", 1:gens),locs), # paste(paste0("G", 1:(gens*locs)),
    REPS = c(rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)))
)

mydes3 <- partially_replicated(nrows = 6, ncols = 6, data = mydata, multiLocationData = TRUE)

Best Gregor

DidierMurilloF commented 1 year ago

I got it. It is a small issue with some of the input checks, so I will push a fixed version to GitHub.

DidierMurilloF commented 1 year ago

Now the function partially_replicated() generates the entry list following the order given in the arguments repGens and repUnits.

Some examples:

  1. Replicate the first six entries (G1-G6)

    library(FielDHub)
    # Replicate the first six entries (G1-G6)
    mydes1 <- partially_replicated(nrows = 6,
                                   ncols = 6,
                                   repGens = c(6,24),
                                   repUnits = c(2,1),
                                   seed = 7)
    mydes1$dataEntry
    plot(mydes1)

  2. Replicate entries in the middle (G19-G24)

    # Replicate entries in the middle (G19-G24)
    mydes2 <- partially_replicated(nrows = 6,
                                   ncols = 6,
                                   repGens = c(18,6,6),
                                   repUnits = c(1,2,1),
                                   seed = 17)
    mydes2$dataEntry
    plot(mydes2)

  3. Replicate the last six entries (G25-G30)

    # Replicate the last six entries (G25-G30)
    mydes3 <- partially_replicated(nrows = 6,
                                   ncols = 6,
                                   repGens = c(24,6),
                                   repUnits = c(1,2),
                                   seed = 35)
    mydes3$dataEntry
    plot(mydes3)
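Which entries should end up replicated for a given pair of arguments can be worked out by hand in base R. The sketch below only mirrors the ordering described above. It does not call FielDHub, and the variable names are illustrative.

```r
rep_gens  <- c(18, 6, 6)
rep_units <- c(1, 2, 1)

# One REPS value per entry, in entry order
reps <- rep(rep_units, times = rep_gens)

# Entries expected to appear twice in the field
replicated <- paste0("G", which(reps == 2))
replicated
```

For the middle case this gives G19 through G24, which can be compared against the dataEntry element of the design.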

Until the new version is on CRAN, you can install the GitHub version with the following code:

remotes::install_github("DidierMurilloF/FielDHub")

DidierMurilloF commented 1 year ago

I fixed the issue.

library(FielDHub)
gens <- 30 
locs <- 5
mydata <- data.frame(
    LOCATION = rep(c(1:locs),c(rep(gens,locs))),
    ENTRY = rep(1:gens,locs),
    NAME = rep(paste0("G", 1:gens),locs),
    REPS = c(rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)),
             rep(c(2,1,1,1,1),rep(gens/locs,locs)))
)

mydes3 <- partially_replicated(nrows = 6, 
                               ncols = 6,
                               l = 5,
                               data = mydata,
                               multiLocationData = TRUE, 
                               seed = 7)

print(mydes3)
plot(mydes3)
plot(mydes3, l = 2)
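The multi-location entry list can be assembled with less repetition by building one location's list and stacking a copy per location. The following is a base R sketch, assuming the same LOCATION/ENTRY/NAME/REPS columns as above. The check of 36 plots per location reflects the 6 x 6 field in this example.

```r
gens <- 30
locs <- 5

# Replication pattern for a single location: first 6 entries twice
reps_one_loc <- rep(c(2, 1), times = c(6, gens - 6))

# Stack one copy of the entry list per location
mydata <- do.call(rbind, lapply(seq_len(locs), function(loc) {
  data.frame(
    LOCATION = loc,
    ENTRY    = seq_len(gens),
    NAME     = paste0("G", seq_len(gens)),
    REPS     = reps_one_loc
  )
}))

# Plots required at each location
plots_per_loc <- tapply(mydata$REPS, mydata$LOCATION, sum)
plots_per_loc
```

Because each location gets its own data frame inside lapply(), a different replication pattern per location only requires changing the REPS vector for that location.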

Until the new version is on CRAN, you can install the GitHub version with the following code:

remotes::install_github("DidierMurilloF/FielDHub")

Thanks for catching these bugs.

GregorDall commented 1 year ago

Thanks for the quick fix!