McMinds-Lab / analysis_templates

Basic data analysis scripts to be modified for each project

Develop ATOMM-like Stan model to accept an arbitrary number of interacting subsets #13

Open rmcminds opened 2 years ago

rmcminds commented 2 years ago

Ideally, I would like to extend the concept beyond just the host-microbe interaction and allow for a bit more nuance. For example, you could input a single genome and test whether linkage groups interact with each other, to detect epistasis.

rmcminds commented 2 years ago

https://github.com/McMinds-Lab/analysis_templates/blob/main/gwas/epistatic/two_interacting_subsets_with_self.stan

rmcminds commented 2 years ago

I imagine one way to do this is to have a single SNP matrix and treat it like a predictor matrix for which more than one variance component is estimated, with covariance calculated after variance estimation. But make the simplifying assumption that not every SNP gets its own variance; instead, pre-defined blocks and interactions between blocks get variance components (winding up with the same model as the two_interacting_subsets.stan model). The tough part will be deciding on the most efficient way to do this, coding the combinatorics to calculate the total number of variance components, and creating indices to assign them to the right places. It might wind up being better to do as the existing model does and pre-calculate crossprods for every subset and combination of subsets?
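A rough sketch of the combinatorics, in Python rather than Stan so the indexing is easy to check. It assumes one main-effect variance component per block plus one per unordered pair of blocks, optionally including a block paired with itself (as the "with_self" model name suggests). The function names and the example block sizes are hypothetical, not taken from the repository.

```python
from itertools import combinations_with_replacement


def enumerate_components(n_blocks, include_self=True):
    """List variance components as tuples of block indices.

    Assumption: one main-effect component per block, plus one interaction
    component per unordered pair of blocks (self-pairs optional).
    """
    mains = [(b,) for b in range(n_blocks)]
    pairs = [
        (a, b)
        for a, b in combinations_with_replacement(range(n_blocks), 2)
        if include_self or a != b
    ]
    return mains + pairs


def snp_to_block(block_sizes):
    """Map each SNP column to its block, assuming blocks are contiguous."""
    return [b for b, size in enumerate(block_sizes) for _ in range(size)]


# Hypothetical example: three pre-defined blocks of 3, 2, and 4 SNPs.
block_sizes = [3, 2, 4]
components = enumerate_components(len(block_sizes))
index = {comp: i for i, comp in enumerate(components)}
blocks = snp_to_block(block_sizes)


def interaction_component(snp_i, snp_j):
    """Index of the variance component governing the snp_i x snp_j interaction."""
    key = tuple(sorted((blocks[snp_i], blocks[snp_j])))
    return index[key]


n_components = len(components)  # K mains + K(K+1)/2 pairs when self-pairs are included
```

With K blocks this gives K + K(K+1)/2 components when self-interactions are included, or K + K(K-1)/2 without them. The same `components` list could equally be used to decide which crossproducts to pre-calculate, one per tuple.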

rmcminds commented 2 years ago

http://www.stat.columbia.edu/~gelman/research/unpublished/2011.04829.pdf

rmcminds commented 2 years ago

Introducing GEMMA-like sparsity likely requires a compromise. One idea might be to fit a non-sparse model, identify outlier coefficients that seem to be pushing the bounds of their normal priors, and then fit another model that gives just those coefficients large-effect estimates.
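A minimal sketch of the flagging step between the two fits, assuming the first model yields a posterior mean per coefficient and a single shared prior scale. The function name, the cutoff of 3 prior standard deviations, and the example numbers are all placeholders for illustration, not a tested criterion.

```python
def flag_large_effects(coef_means, prior_sd, threshold=3.0):
    """Return indices of coefficients that sit far out in their normal prior.

    Assumption: a coefficient is "pushing the bounds" of its prior when its
    posterior mean exceeds `threshold` prior standard deviations in magnitude.
    """
    return [
        i for i, b in enumerate(coef_means)
        if abs(b) / prior_sd > threshold
    ]


# Made-up posterior means from a hypothetical non-sparse first fit.
coef_means = [0.02, -0.05, 0.41, 0.01, -0.38, 0.03]
prior_sd = 0.1

flagged = flag_large_effects(coef_means, prior_sd)
# The second model would then give only the flagged coefficients
# their own large-effect terms.
```

In practice the cutoff would probably need to account for posterior uncertainty as well as the point estimate, so this is only a starting point.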