Open rmcminds opened 2 years ago
I imagine one way to do this is to have a single SNP matrix and treat it like the predictor matrix where more than one variance component is estimated, with covariance calculated after variance estimation. But make the simplifying assumption that not every SNP gets its own variance; instead, pre-defined blocks and interactions between blocks get variance components (winding up with the same model as the two_interacting_subsets.stan model). The tough part will be deciding on the most efficient way to do this, coding the combinatorics to calculate the total number of variance components, and creating indices to assign them in the right places. It might wind up being better to do it like the existing model and pre-calculate crossprods for every subset and combination of subsets?
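A rough Python sketch of the combinatorics and the pre-calculated crossprods, just to make the idea concrete. Everything here is hypothetical: the function names, the `block_of_snp` index vector, the cap at pairwise interactions, and especially the use of an elementwise product of block crossprods as the interaction term, which is one common choice and not necessarily what two_interacting_subsets.stan does.

```python
from itertools import combinations

import numpy as np


def variance_components(n_blocks, max_order=2):
    """List every variance component as a tuple of block ids.

    Order 1 gives one component per block; order 2 adds one per pair of
    blocks. With max_order=2 the total is B + B*(B-1)/2.
    """
    comps = []
    for order in range(1, max_order + 1):
        comps.extend(combinations(range(n_blocks), order))
    return comps


def block_kernels(X, block_of_snp, comps):
    """Pre-calculate one samples-by-samples matrix per variance component.

    X            : samples x SNPs matrix
    block_of_snp : integer block id for each SNP column (hypothetical input)
    comps        : output of variance_components()
    """
    n = X.shape[0]
    main = {}
    for b in np.unique(block_of_snp):
        Xb = X[:, block_of_snp == b]
        # Crossprod for this block, scaled by the number of SNPs in it.
        main[int(b)] = Xb @ Xb.T / Xb.shape[1]

    kernels = []
    for comp in comps:
        # ASSUMPTION: an interaction is the elementwise product of the
        # main-effect crossprods of the blocks involved.
        K = np.ones((n, n))
        for b in comp:
            K = K * main[b]
        kernels.append(K)
    return kernels


# Toy example: 5 samples, 7 SNPs in 3 pre-defined blocks.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 7))
block_of_snp = np.array([0, 0, 1, 1, 1, 2, 2])

comps = variance_components(n_blocks=3)
kernels = block_kernels(X, block_of_snp, comps)
```

The position of a tuple in `comps` doubles as the index of its variance parameter, so the same list could be passed to the model as the lookup for which variance goes where. With two blocks this gives three components (two main, one interaction).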
Introducing GEMMA-like sparsity likely requires a compromise. One idea might be to fit a non-sparse model, identify outlier coefficients that seem to be pushing the bounds of their normal priors, and then fit another model that gives just those coefficients large-effect estimates.
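A minimal sketch of the "identify outliers" step between the two fits, assuming the first fit yields a posterior mean per coefficient and the fitted prior standard deviation. The function name and the cutoff of 3 prior SDs are placeholders, not a recommendation.

```python
import numpy as np


def flag_large_effects(beta_mean, prior_sd, z_cutoff=3.0):
    """Return indices of coefficients that sit far out in their normal prior.

    beta_mean : posterior means from the non-sparse fit
    prior_sd  : fitted prior SD (scalar, or one value per coefficient)
    z_cutoff  : ASSUMPTION, an arbitrary threshold in units of prior SD
    """
    z = np.abs(np.asarray(beta_mean, dtype=float)) / prior_sd
    return np.flatnonzero(z > z_cutoff)


# Toy example: only the third coefficient is pushing against its prior.
beta_mean = np.array([0.1, -0.2, 5.0, 0.05])
large = flag_large_effects(beta_mean, prior_sd=1.0)
```

The flagged indices would then get their own large-effect terms in the second model, while the rest stay under the shared variance components.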
Ideally, I would like to extend the concept beyond just the host-microbe interaction and allow for a bit more nuance; e.g., you could input a single genome and see whether linkage groups were interacting with each other, to detect epistasis.