Closed fmoradi1365 closed 5 years ago
I am not entirely certain what your question is: genetic variant effects are based either on SNP data that you provide or on simulated independent, bi-allelic variants with a defined allele frequency. Non-causal SNPs are not included in the genetic variant effects, precisely because they are not causal. Maybe you are referring to the genetic variants used for computing a genetic relationship matrix, which forms the basis for the infinitesimal genetic effect?
Example 2 in this vignette shows the step-by-step simulation procedure and the combined approach via runSimulation: https://hannahvmeyer.github.io/PhenotypeSimulator/articles/PhenotypeSimulator.html
Please check these examples, and if you still have questions, specify in more detail what you mean by simulation of non-causal SNPs.
I am closing this now, as this question might also have been clarified in #14 and #16. Feel free to re-open if the question remains.
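For readers who want the two ingredients mentioned in this reply made concrete, here is a minimal sketch of (a) simulating independent, bi-allelic SNPs with a defined allele frequency and (b) building a genetic relationship matrix from them. It is written in plain Python and does not use PhenotypeSimulator. The function names, the allele frequency of 0.3, and the standardisation by the empirical allele frequency are assumptions made for illustration and may differ from what the package does internally.

```python
import math
import random


def simulate_genotypes(n_samples, n_snps, freq, rng):
    """Independent bi-allelic SNPs: each genotype is a minor-allele count in {0, 1, 2}."""
    return [[int(rng.random() < freq) + int(rng.random() < freq)
             for _ in range(n_snps)]
            for _ in range(n_samples)]


def standardise(geno):
    """Centre and scale each SNP using its empirical allele frequency (assumed scheme)."""
    n = len(geno)
    columns = []
    for j in range(len(geno[0])):
        col = [row[j] for row in geno]
        p = sum(col) / (2 * n)
        if p in (0.0, 1.0):
            continue  # monomorphic SNP: no variation, so it is skipped
        sd = math.sqrt(2 * p * (1 - p))
        columns.append([(g - 2 * p) / sd for g in col])
    # Transpose back to a samples x SNPs layout.
    return [list(row) for row in zip(*columns)]


def relationship_matrix(std_geno):
    """Relationship matrix K = X X^T / S for standardised genotypes X with S SNPs."""
    n, s = len(std_geno), len(std_geno[0])
    return [[sum(a * b for a, b in zip(std_geno[i], std_geno[k])) / s
             for k in range(n)]
            for i in range(n)]


rng = random.Random(42)
geno = simulate_genotypes(n_samples=50, n_snps=500, freq=0.3, rng=rng)
K = relationship_matrix(standardise(geno))
mean_diag = sum(K[i][i] for i in range(len(K))) / len(K)
print("mean diagonal of K:", round(mean_diag, 3))
```

For unrelated samples the diagonal of K should sit close to 1 and the off-diagonal entries close to 0. Every SNP contributes to K, whether or not it is later chosen as causal.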
Hi Hannah,
Thanks for the reply and sorry for answering late.
Let me clarify my question: let's say I give PhenotypeSimulator a genotype input file composed of 40 SNPs and set 5 of them as causal SNPs. PhenotypeSimulator generates a phenotype matrix of N*P, in which N is the number of samples and P is the number of traits (phenotypes). Now I want to use this generated phenotype matrix in a very popular multivariate association test. This test needs one genotype matrix and one phenotype matrix as inputs and outputs a set of p-values. When I use those 5 causal SNPs as the genotype file and the above-mentioned simulated phenotypes as the phenotype file, I expect the majority of the 5 output p-values to be significant, but they are not. Similarly, when I use the remaining 35 non-causal SNPs as the genotype file and the same simulated phenotypes as the phenotype file, the majority of the output p-values are unexpectedly significant. Did I misunderstand the usage of your package?
Can you provide data where this happens? Are the genotypes you provide independent or correlated? I have shown in the paper that PhenotypeSimulator can use genome-wide data and only simulate causal effects for the number of selected SNPs and variants in LD. As for the causal SNPs not being significant: how many samples did you simulate? How big was the beta you simulated? I cannot troubleshoot any of this without more information, i.e. at least your R code. Please provide it, and ideally also the genotype matrix (I will keep it confidential).
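The questions about sample size and beta matter because, for a fixed number of samples, the effect size largely determines how often a causal SNP reaches significance. The sketch below illustrates this with the numbers from this thread (40 independent SNPs, 5 causal, 350 samples). It is plain Python and does not use PhenotypeSimulator. It simulates a single trait rather than P traits, draws one shared beta for all causal SNPs, and tests each SNP with a Pearson correlation and a Fisher z approximation. All of these are simplifying assumptions, as are the function names and the allele frequency of 0.3.

```python
import math
import random


def simulate_genotypes(n_samples, n_snps, freq, rng):
    """Independent bi-allelic SNPs coded as minor-allele counts 0, 1 or 2."""
    return [[int(rng.random() < freq) + int(rng.random() < freq)
             for _ in range(n_snps)]
            for _ in range(n_samples)]


def correlation_p_value(x, y):
    """Two-sided p-value for a Pearson correlation (Fisher z, normal approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # no variation, so no evidence of association
    r = max(min(sxy / math.sqrt(sxx * syy), 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def association_p_values(n_samples=350, n_snps=40, n_causal=5,
                         beta=0.5, freq=0.3, seed=1):
    """Simulate one trait from the first n_causal SNPs and test every SNP."""
    rng = random.Random(seed)
    geno = simulate_genotypes(n_samples, n_snps, freq, rng)
    # Phenotype = additive effect of the causal SNPs + standard normal noise.
    pheno = [beta * sum(row[:n_causal]) + rng.gauss(0, 1) for row in geno]
    return [correlation_p_value([row[j] for row in geno], pheno)
            for j in range(n_snps)]


for beta in (0.5, 0.05):
    pvals = association_p_values(beta=beta)
    causal_hits = sum(p < 0.05 for p in pvals[:5])
    null_hits = sum(p < 0.05 for p in pvals[5:])
    print(f"beta={beta}: {causal_hits}/5 causal and {null_hits}/35 non-causal SNPs with p < 0.05")
```

With independent genotypes, roughly 5% of the non-causal SNPs are expected to fall below 0.05 by chance, regardless of beta. A much larger fraction would point to correlated genotypes, a mismatch between sample order in the genotype and phenotype files, or a mix-up of which SNPs were set as causal.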
Many thanks for being willing to help. Our genotypes are independent, and we have simulated 350 samples. We discussed the problem in our group again, and it seems that we have a solution. I will contact you in case we need more help.
Hello,
I am using your PhenotypeSimulator package in my research and have a few questions:
I am interested in simulating phenotypes, based on external genotype data, for both causal and non-causal SNPs, so that I can compute the type I error and power of some association tests. I followed the examples in the manual and vignette of the package, but I did not understand how that R code leads to simulating phenotypes separately for causal and non-causal SNPs. Would you mind sharing an example of R code, if you have one, that shows how traits for both causal and non-causal SNPs can be simulated from a given genotype input matrix?
Many thanks for your time.