yuanzhongshang / GIFT

GNU General Public License v3.0

load the estimated correlated matrix of gene expressions #6

Closed (shengxindaniu closed 5 months ago)

shengxindaniu commented 7 months ago

Hello, while exploring this software, I found that it requires a correlation matrix of gene expression levels. Is this necessary? I ask because some public eQTL databases do not provide gene expression levels.

yuanzhongshang commented 7 months ago

Hi,

Thank you for your interest. We have provided the correlation matrix among gene expressions for each chromosome from the GEUVADIS data. You can access it here. In case you need to use another dataset, below is code to obtain an approximate estimate from summary statistics. The example uses our uploaded GEUVADIS summary statistics.

library(data.table)
#load the genome-wide association results
chr=5
setwd(paste0("E:/Dropbox (University of Michigan)/GEUVADIS/chr",chr))
gene=c("ENSG00000134480.9","ENSG00000127184.6","ENSG00000145715.9","ENSG00000164180.8")
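z=NULL  #test statistics, one column per gene
p=NULL  #p-values, one column per gene
#note: cbind() assumes every file lists the same SNPs in the same order
for(i in seq_along(gene)){
  tmp=fread(paste0(gene[i],".tsv.gz"))
  z=cbind(z,tmp$T)
  p=cbind(p,tmp$P)
}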

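#extract the index of null SNPs for all genes in one region
#(SNPs whose p-value exceeds 1e-5 for every gene)
idx=which(rowSums(p > 1e-5) == ncol(p))
#the correlation of the test statistics across null SNPs gives the approximate estimate
cor(z[idx,])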

Best, Zhongshang
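
For readers who want to see why the null-SNP approach in the reply above can work, here is a minimal simulated sketch in base R. It is not part of the GIFT package or the GEUVADIS files. The sample size, SNP count, effect sizes, and all variable names (`G`, `E`, `R_true`, `R_hat`) are hypothetical. The sketch assumes independent SNPs and that all genes are measured in the same individuals. Under those assumptions, the correlation between two genes' test statistics at SNPs associated with neither gene should roughly track the correlation between the two expression levels.

```r
set.seed(1)
n <- 500    # individuals (hypothetical)
m <- 2000   # SNPs (hypothetical)

# Correlation used to generate three simulated gene expressions
R_true <- matrix(c(1.0, 0.6, 0.3,
                   0.6, 1.0, 0.2,
                   0.3, 0.2, 1.0), nrow = 3)

# Independent, standardized genotypes
G <- scale(matrix(rbinom(n * m, 2, 0.3), nrow = n))

# Correlated expression; the first 5 SNPs act as eQTLs for every gene
E <- matrix(rnorm(n * 3), nrow = n) %*% chol(R_true)
E <- E + as.vector(G[, 1:5] %*% rep(0.5, 5))

# Marginal association statistics for every SNP-gene pair,
# equivalent to the t statistic from a simple linear regression
r <- cor(G, E)
z <- r * sqrt((n - 2) / (1 - r^2))
p <- 2 * pt(-abs(z), df = n - 2)

# Same two steps as in the reply: keep SNPs that are null for all genes,
# then correlate the test statistics across those SNPs
idx <- which(rowSums(p > 1e-5) == ncol(p))
R_hat <- cor(z[idx, ])

# Compare the estimate with the correlation of the simulated expression
print(round(R_hat, 2))
print(round(cor(E), 2))
```

The comparison target is `cor(E)` rather than `R_true`, because the shared eQTL effects also contribute to the correlation among the simulated expression levels. With real summary statistics, linkage disequilibrium between SNPs and differing sample sets across genes may affect how well the approximation holds.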