cumc / xqtl-protocol

Molecular QTL analysis protocol developed by ADSP Functional Genomics Consortium
https://cumc.github.io/xqtl-protocol/
MIT License

BMIQ is not necessary for SeSAMe-processed methylation data #453

Open yuqimiao opened 1 year ago

yuqimiao commented 1 year ago
library(readr)  # provides read_delim()

# Beta values after BMIQ normalization, and the SeSAMe output before BMIQ
BMIQ_path = "/mnt/vast/hpc/csg/ROSMAP_methy_QTL_beta/raw_data/ROSMAP_arrayMethylation_covariates.sesame.methyl.beta.translated.bed_BMIQ.bed.gz"
seasame_path = "/mnt/vast/hpc/csg/xqtl_workflow_testing/methyl/sesame_2/ROSMAP_arrayMethylation_covariates.sesame.methyl.beta.bed.gz"
bmiq = read_delim(BMIQ_path, delim = "\t")
seasame = read_delim(seasame_path, delim = "\t")

# R^2 of BMIQ vs. SeSAMe values, one regression per column (columns 5 to 100)
lm_r2 = lapply(5:100, function(i){
  summary(lm(bmiq[[i]]~seasame[[i]]))$r.squared
})

> min(unlist(lm_r2))
[1] 0.9963013

SeSAMe preprocessing already accounts for probe bias, and BMIQ normalization makes almost no difference: across columns 5 to 100, the minimum R² between the values before and after BMIQ is 0.996. Is it necessary to add a BMIQ step to the meQTL pipeline?
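The per-column comparison above can be tried on synthetic data as a quick sanity check. This is only a minimal sketch, not the pipeline's code: `sesame_sim` and `bmiq_sim` are simulated stand-ins (hypothetical names, not the ROSMAP matrices), and the noise level is an arbitrary assumption. It also uses the fact that, for a simple linear regression, R² equals the squared Pearson correlation, which avoids fitting a model per column.

```r
set.seed(1)
n_probes <- 1000
n_samples <- 10

# Simulated beta values in (0, 1); stand-in for SeSAMe output (assumption)
sesame_sim <- matrix(rbeta(n_probes * n_samples, 0.5, 0.5), nrow = n_probes)

# Stand-in for BMIQ output: near-linear transform plus small noise (assumption;
# values are not clipped to [0, 1] because only the fit quality matters here)
bmiq_sim <- -0.025 + 1.06 * sesame_sim +
  matrix(rnorm(n_probes * n_samples, sd = 0.01), nrow = n_probes)

# R^2 per column from lm(), as in the comparison above
r2_lm <- vapply(seq_len(n_samples), function(i) {
  summary(lm(bmiq_sim[, i] ~ sesame_sim[, i]))$r.squared
}, numeric(1))

# Same quantity via the squared Pearson correlation
r2_cor <- vapply(seq_len(n_samples), function(i) {
  cor(bmiq_sim[, i], sesame_sim[, i])^2
}, numeric(1))

all.equal(r2_lm, r2_cor)
min(r2_lm)
```

On real data with missing values, `cor(..., use = "complete.obs")` would be needed to match the rows that `lm()` keeps by default.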

hsun3163 commented 1 year ago

I have also looked at the regression coefficients between the two; basically:

-0.025 + 1.06*sesame_result = bmiq 

Therefore, it probably makes no sense to include BMIQ.
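To make the coefficient check concrete, here is a minimal sketch on simulated values. The intercept and slope used to generate the data are taken from the relationship quoted above; everything else (`sesame_sim`, `bmiq_sim`, the sample size, the noise level) is a hypothetical stand-in for the real data.

```r
set.seed(2)

# Simulated SeSAMe-like beta values (assumption, not the ROSMAP data)
sesame_sim <- rbeta(5000, 0.5, 0.5)

# Simulated BMIQ-like values generated from the quoted linear relationship
bmiq_sim <- -0.025 + 1.06 * sesame_sim + rnorm(5000, sd = 0.01)

# Regress BMIQ on SeSAMe and extract intercept and slope
fit <- lm(bmiq_sim ~ sesame_sim)
coefs <- coef(fit)
round(coefs, 3)
```

With the real matrices, the same call on a pair of matching columns would show how close the fit is to the intercept and slope reported above.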

Considering Wanding's comment:

The two I would indeed recommend are dye bias correction (the sesame version is also better, since the high-signal end and the mid-range signal behave differently) and background subtraction (noob would be a good method). Even BMIQ assumes that there are three modes in the methylation level distribution, which might not hold if one has global loss of methylation (seen more in cancer, but still a dangerous assumption for perfectionists).

Perhaps we don't need to include the BMIQ module.

gaow commented 1 year ago

I guess my primary question is still whether, and why, our mQTL results are very different from published mQTL results using ROSMAP. We can verify this using AD genes before we formally compute pi1.
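For reference, pi1 is commonly estimated as 1 - pi0, where pi0 is the estimated proportion of true nulls among a set of p-values (here, presumably our p-values for mQTL reported as significant in the published study). The sketch below applies Storey's estimator at a single fixed threshold `lambda = 0.5` to simulated p-values. The function name `estimate_pi1` and the simulated data are hypothetical; in practice the Bioconductor `qvalue` package would likely be used instead.

```r
# Storey's pi0 estimate at a fixed lambda: #{p > lambda} / (m * (1 - lambda)),
# capped at 1; pi1 is then 1 - pi0
estimate_pi1 <- function(p, lambda = 0.5) {
  p <- p[!is.na(p)]
  pi0 <- mean(p > lambda) / (1 - lambda)
  pi0 <- min(pi0, 1)
  1 - pi0
}

set.seed(3)
# Simulated replication p-values: 70% true signals, 30% nulls (assumption)
p_signal <- rbeta(7000, 0.1, 5)  # concentrated near 0
p_null <- runif(3000)            # uniform under the null

pi1_hat <- estimate_pi1(c(p_signal, p_null))
pi1_hat
```

A pi1 close to 1 would indicate that most published mQTL replicate in our results, while a low value would point to a systematic difference worth investigating.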