WenjianBI / SPAGE

SaddlePoint Approximation implementation of GxE analysis

impute.mu() causes an error when exp.eta1 is Inf, and the "fixed" impute method uses non-doubled MAF #2

Closed by shinichinamba 4 years ago

shinichinamba commented 4 years ago

Hello, I'm very excited to use SPAGE. This issue raises two points: one about impute.mu() and one about the "fixed" impute method.

  1. My data triggered an error. After debugging, I found that impute.mu() can fail when a genotype vector has very low variance and is correlated with the phenotype vector. A toy example:

    N = 1000
    Data.ls = data.simu.null(N = N, nSNP = 10, nCov = 2, maf = 0.3, prev = 0.01)
    subjectID = paste0("ID",1:N)
    Phen.mtx = Data.ls$Phen.mtx
    obj.null = SPAGE_Null_Model(y ~ Cov1 + Cov2, subjectID = subjectID, data = Phen.mtx, out_type = "D")
    Envn.mtx = as.matrix(Phen.mtx)[,"Cov1",drop=FALSE]
    Geno.mtx = Data.ls$Geno.mtx
    Geno.mtx[, 1] <- rep(1, 1000)
    Geno.mtx[Phen.mtx$y == max(Phen.mtx$y), 1] <- 1 + 0.0001
    Geno.mtx[Phen.mtx$y == min(Phen.mtx$y), 1] <- 1 - 0.0001
    rownames(Geno.mtx) = rownames(Envn.mtx) = subjectID
    SPAGE.one.SNP(Geno.mtx[, 1], obj.null, Envn.mtx, impute.method = "bestguess")

In this case, exp.eta1 can be Inf, so mean(abs(d.eta)) becomes NaN.
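To illustrate how an infinite exponential can turn into NaN downstream, here is a minimal sketch in base R. It assumes a logistic-type mean of the form exp(eta) / (1 + exp(eta)); that form, and the variable names below, are assumptions for illustration and are not taken from the SPAGE source.

```r
# Assumption: the mean is computed as exp(eta) / (1 + exp(eta)).
# All names here are hypothetical and do not come from SPAGE.
eta <- c(0.5, 800)                    # 800 is large enough to overflow exp()
exp.eta <- exp(eta)                   # second element overflows to Inf

mu.naive <- exp.eta / (1 + exp.eta)   # Inf / Inf evaluates to NaN
mu.stable <- plogis(eta)              # base R logistic, saturates at 1 instead

d.naive <- mu.naive - 0.5             # any arithmetic on NaN stays NaN
mean(abs(d.naive))                    # NaN propagates through mean()
```

plogis() is part of base R's stats package and avoids the overflow, which may or may not be appropriate for the actual computation inside impute.mu().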

  2. Additionally, while debugging I noticed that the "fixed" impute method uses MAF rather than 2 * MAF.
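The reason 2 * MAF is the natural fill-in value: genotypes are coded as allele counts in 0..2, so the expected dosage is twice the allele frequency. A minimal sketch, assuming missing genotypes are coded as NA; impute_fixed is a hypothetical helper and not SPAGE's implementation.

```r
# Hypothetical helper, not SPAGE's code.
# Assumption: genotypes are allele dosages in [0, 2] and missing values are NA.
impute_fixed <- function(g) {
  maf <- mean(g, na.rm = TRUE) / 2   # allele frequency from observed dosages
  g[is.na(g)] <- 2 * maf             # expected dosage is 2 * MAF, not MAF
  g
}

g <- c(0, 1, 2, NA, 1, 0)
g.imputed <- impute_fixed(g)
```

Filling with 2 * MAF leaves the mean dosage unchanged, whereas filling with MAF alone would pull imputed samples toward zero.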

Best,
Shinichi Namba

WenjianBI commented 4 years ago

Hi Shinichi,

Thanks for pointing this out. For the second issue, I have changed it to 2*MAF. For the first issue, I have updated the package to avoid this error: if d.eta is NA, the GxE p-value is now set to NA. Please see https://github.com/WenjianBI/SPAGE/commit/ed8ab83d81958c9fbb1f4dba8cc01a3302ffe7fa for details.

As you mentioned, this issue happens very rarely, so I think the update should be OK. If you have further questions, please feel free to let me know.
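The guard described above could look roughly like the following. This is a sketch of the general pattern only, not the code in the linked commit; gxe_p_guarded and compute_p are hypothetical names.

```r
# Hypothetical sketch of an NA guard; not the code from the linked commit.
# compute_p stands in for whatever function produces the GxE p-value.
gxe_p_guarded <- function(d.eta, compute_p) {
  if (anyNA(d.eta)) {
    return(NA_real_)   # anyNA() is TRUE for both NA and NaN
  }
  compute_p(d.eta)
}

# Usage with a placeholder p-value function
p.bad <- gxe_p_guarded(c(0.1, NaN), function(d) 0.05)
p.ok  <- gxe_p_guarded(c(0.1, 0.2), function(d) 0.05)
```

Returning NA rather than stopping lets a genome-wide scan continue past the rare problematic variant.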

Thanks, Wenjian