privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/

Simulating multiple genetically correlated phenotypes #419

Open fmorgante opened 1 year ago

fmorgante commented 1 year ago

Hi Florian, thank you for a great package! I think it would be very useful if snp_simuPheno could be extended to simulate multiple genetically correlated phenotypes. The framework could be made as flexible as one wants, but a great first step would be to draw the effects from a multivariate normal distribution with mean 0 and a user-provided covariance matrix across traits, with residuals that are independent across traits.
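To make the proposal concrete, the model could be written roughly as follows. The notation is an assumption for illustration only (it is not taken from bigsnpr): $y_{ik}$ is the phenotype of individual $i$ for trait $k$, $G_{ij}$ the (standardized) genotype at variant $j$, $C$ the set of causal variants shared by all $K$ traits, and $\Sigma$ the user-provided covariance matrix of effects across traits.

```latex
\[
y_{ik} = \sum_{j \in C} G_{ij}\,\beta_{jk} + \varepsilon_{ik},
\qquad
(\beta_{j1}, \dots, \beta_{jK})^\top \sim \mathcal{N}_K(0, \Sigma),
\qquad
\varepsilon_{ik} \sim \mathcal{N}(0, \sigma_k^2)
\ \text{independently across } i \text{ and } k .
\]
```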

Do you think this is something you are interested in implementing?

privefl commented 1 year ago

Yes, this is definitely something interesting to have. E.g. see https://github.com/privefl/bigsnpr/blob/master/tmp-tests/test-gen-cor.R#L39-L53.

I guess it becomes a bit tricky if you consider that the two phenotypes can have different causal variants.

fmorgante commented 1 year ago

Yes, absolutely. I guess you could start by imposing the same causal variants across traits, but allowing an arbitrary number of traits. The user would then provide the desired genetic correlation matrix.

privefl commented 1 year ago

Yes, that should be doable. Something like providing a matrix of genetic correlations plus a vector of heritabilities might be easier than providing the covariance matrix directly.
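A minimal sketch of that idea in base R, under stated assumptions: `simu_multi_pheno()` and its arguments are hypothetical (this is not an existing bigsnpr function), the genotypes are a plain in-memory numeric matrix with no monomorphic causal variants, and all traits share the same causal variants. The genetic covariance is rebuilt from the correlation matrix and heritabilities as `D %*% rg %*% D` with `D = diag(sqrt(h2))`, and effects are drawn through a Cholesky factor.

```r
# Hypothetical helper for illustration; NOT part of bigsnpr.
# G:          numeric genotype matrix (individuals x variants)
# h2:         vector of heritabilities, one per trait (each in (0, 1))
# rg:         genetic correlation matrix (traits x traits), positive definite
# ind_causal: column indices of the causal variants, shared by all traits
simu_multi_pheno <- function(G, h2, rg, ind_causal) {
  K <- length(h2)
  M <- length(ind_causal)
  N <- nrow(G)
  stopifnot(all(dim(rg) == K), all(h2 > 0 & h2 < 1))

  # Genetic covariance implied by rg and h2: cov_g[k, l] = rg[k, l] * sqrt(h2[k] * h2[l])
  cov_g <- rg * tcrossprod(sqrt(h2))

  # Rows of beta are draws from N(0, cov_g / M), since t(U) %*% U = cov_g / M for U = chol(.)
  Z <- matrix(rnorm(M * K), M, K)
  beta <- Z %*% chol(cov_g / M)

  # Genetic components from standardized genotypes,
  # rescaled column-wise so that their sample variances equal h2
  gen <- scale(G[, ind_causal, drop = FALSE]) %*% beta
  gen <- sweep(gen, 2, sqrt(h2) / apply(gen, 2, sd), "*")

  # Residuals independent across traits, with variance 1 - h2
  env <- sweep(matrix(rnorm(N * K), N, K), 2, sqrt(1 - h2), "*")

  list(pheno = gen + env, gen = gen, beta = beta, cov_g = cov_g)
}

# Toy usage with simulated genotypes (made-up sizes and parameters)
set.seed(1)
N <- 500; P <- 200
G <- matrix(rbinom(N * P, size = 2, prob = 0.3), N, P)
rg <- matrix(c(1, 0.5, 0.5, 1), 2)
simu <- simu_multi_pheno(G, h2 = c(0.5, 0.3), rg = rg, ind_causal = 1:100)
cor(simu$gen)  # realized genetic correlation, close to rg only on average
```

The column-wise rescaling fixes each trait's genetic variance at its heritability without changing the correlations between genetic components, so the realized genetic correlation still varies around `rg` from one simulation to the next.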

fmorgante commented 1 year ago

That sounds great!

privefl commented 1 year ago

Do you want to try to implement it?

fmorgante commented 1 year ago

I do not have the bandwidth at the moment, but I may be able to get to it later in the summer (if you or someone else has not done it by then). I suggest we keep this issue open until someone implements this. What do you think?

privefl commented 1 year ago

Same. Sounds like a good idea.