famuvie / breedR

Statistical methods for forest genetic resources analysts
http://famuvie.github.io/breedR/
GNU General Public License v3.0
31 stars 24 forks source link

multivariate analyses #89

Closed leosanrod closed 6 years ago

leosanrod commented 6 years ago

Hi,

No idea whether this is an issue or not. With multiple traits analysis, I can not add more than 6 variables to the model. It seems it is not related to the length of the names, as shortening variables to let say v1, v2, etc does not solve the problem.

This is the model:

additive_model_MV <- list(model = 'add_animal', pedigree = sel_data[,c("self","dad","mum")], id = 'self', var.ini = initial_covs$genetic) reml.multivar.model <- remlf90(fixed = cbind(v1, ... v9) ~ population, genetic = additive_model_MV, data = sel_data, var.ini = list(residual = initial_covs$residual), method = 'em')

With any 6 out of the 9 it works, with more than that it stops.

Thanks for any help.

Leo

famuvie commented 6 years ago

Leo, thanks for reporting. Could not reproduce. It worked fine for this toy example:

ncol <- 9
testdat <- as.data.frame(replicate(ncol, rnorm(1e3)))
names(testdat) <- paste0("y", 1:ncol)

library(breedR)
#> Loading required package: sp
res <- remlf90(
  as.matrix(testdat) ~ 1,
  dat = testdat
)
#> Using default initial variances given by default_initial_variance()
#> See ?breedR.getOption.

so it is not a problem of the number of variables. Any messages?

leosanrod commented 6 years ago

Thanks Facu! OK, it seems to work now. I was compiling the response variables by subsetting the desired columns (resp_var) in the data frame and then converting the result as matrix before getting the bunch into the model: y_resp <- as.matrix(sel_data[,resp_var]) ... which worked for any combination of a maximum of 6 traits. Adding the extra intermediate step of as.data.frame of the subset made things work for any number of traits up to the maximum in my dataset of 9. Somehow column names are not completely passed over when names are many. Leo

famuvie commented 6 years ago

Cool. Although I still don't see where the problem was. If sel_data was originally a data.frame, then subsetting for more than one variable returns another data.frame. Thus, I don't see why adding as.data.frame would make any difference. Maybe if sel_data was a tibble... don't know. In any case, glad it works.