Closed jpiaskowski closed 2 years ago
1.
|should we order this section alphabetically? |
I guess that would be OK. I'm wishing there were some more principled ordering (can we identify clusters within these topics?) but alphabetical is a reasonable fallback.
2.
|for the pedigree models (of which there are considerably more than what is listed) should we reference the ag task view instead? I would also call this "kinship/pedigree models". |
OK (are there a couple of dominant/core packages here?)
3.
|Can we change "penalised models" to "regularized models" since glmm also can be considered penalised? |
Don't know. Maybe "penalized/regularized"? This doesn't strike me as a likely cause of confusion.
4.
|I'd like to remove mention of MICE and reference the missing data task view. It seems like the focus should be packages whose primary purpose is mixed models. |
OK. The reason I referenced MICE is that it probably is the dominant way of handling missing values in mixed models (i.e. I don't think there are commonly used packages that are specifically geared towards mixed models)
5.
|I don't think the section for "large data sets" is needed unless we want to establish clear criteria for what belongs to that. |
OK. (This is similar to handling missing data, in that it's a fairly common "how do I ... with mixed models?" question.)
6.
|I'd like to rename "longitudinal data" to "repeated measures". It seems to me that many packages have functions for this (too many to list?) so maybe focus on the packages that have more options (e.g. 'nlme') |
OK. I also intended to add something to the 'scope' statement at the top to indicate that the task view did not deal generally with longitudinal models that incorporated latent variables (e.g. packages for Kalman filtering, dynamic linear models, etc.)
7.
|can we make lavaan a core package since that is *the* package for SEM? |
Fine with me
8.
|I'd like to move lmeNB to "generalized linear models" since it runs a negative binomial for its primary functionality - sound okay? |
Fine with me
Missing data strikes me as a different topic, although agreed that MICE is widely used and valuable. While relevant to model fitting, it's outside the scope. These views are challenging to maintain, so keeping the scope tight will be a benefit in the long run.
Here are the kinship/mlm packages from the 'agriculture' task view:
GWAS (Genome Wide Association Studies)
There are many GWAS packages on Bioconductor.
GWAS can be conducted using a stepwise mixed linear model for multilocus data with r pkg("mlmm.gwas") or r github("Gregor-Mendel-Institute/MultLocMixMod") (use library(mlmm) to load the package in R). The package r pkg("statgenGWAS") can fit GWAS models using the EMMAX algorithm.
GWAS models for a very large number of SNPs and/or observations can be estimated with r pkg("rMVP") and r github("deruncie/megaLMM"). Functions for conducting GWAS in autotetraploids are provided by r github("jendelman/GWASpoly"), and these functions also work in diploid species. Variable selection for ultra-large dimensional GWAS data sets can be done with r pkg("bravo"), which implements the Bayesian algorithm SVEN, selection of variables with embedded screening.
r github("jendelman/StageWise") provides functions to conduct a 2-stage GWAS when the phenotypic data are from multiple field trials.
For polyploids, r github("jendelman/polyBreedR") provides convenience functions to facilitate the use of genome-wide markers for breeding autotetraploid species, and its functionality also extends to diploids.
Genomic prediction
r pkg("GSelection") implements genomic selection integrating additive and non-additive models.
r pkg("pedmod") provides linear modelling functions integrating kinship for categorical traits.
r pkg("coxme") can fit Cox proportional hazards models containing both fixed and random effects with a kinship matrix.
r pkg("GSMX"), multivariate genomic selection, estimates trait heritability and handles overfitting through cross validation.
r pkg("TSDFGS") can estimate the optimal training population size and composition for genomic selection.
It's a big list, clearly. I'm not sure what constitutes a major package from a mixed model perspective.
I think I'm OK leaving most of these out/referring to the agriculture task view (the GBLUP category + coxme + brms + MCMCglmm seem like the only relevant bits). It's interesting that there isn't a "bioinformatics" view, although I guess most of the interesting stuff in that area is on Bioconductor rather than CRAN.
pez, phyr are in this category)
There was a plan to have an "Omics" task view, but there has been no big progress in this direction so far.
I'll leave this open pending what happens with the Agriculture task view, but otherwise, it sounds like we are in agreement. I am psyched to learn about coxme, which solves some challenges my clients have experienced.
On 2022-08-03 4:11 p.m., Julia Piaskowski wrote:
I'm working on the specialized models section and here are some proposed changes. Please weigh in.
There are a few other changes (minor edits, packages to add), but it would be easier for you to review those after I add them.
View it on GitHub: https://github.com/bbolker/mixedmodels-taskview/issues/2
-- Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster University Director, School of Computational Science and Engineering (Acting) Graduate chair, Mathematics & Statistics