yuanzhongshang / GIFT

GNU General Public License v3.0

How to translate the FUSION weight data to the required format #2

Closed 1667857557 closed 4 months ago

1667857557 commented 6 months ago

Hello Zhongshang,

Thank you for your excellent tool. We would like to conduct the analyses depicted in the paper using the precalculated weight file in FUSION (top 1 prediction models). Could you provide information on how to translate the FUSION weight data to the required format?

Thanks in advance!

Sincerely, Yu-Feng Huang

yuanzhongshang commented 6 months ago

Hi Yu-Feng,

Thanks for your interest! The two-stage version of GIFT can take pre-trained weights and summary statistics as input. Please use the code below to extract the top 1 model from FUSION.

# Step 1: Load the GWAS summary statistics. The table must include SNPid (or CHR_POS), ref, and alt,
# and its alleles must match those of the reference panel.
GWAS <- read.table("GWAS_sum.txt", header = TRUE, stringsAsFactors = FALSE)

# Step 2: Construct a list of weights, one entry per gene
# Directory of FUSION weights
setwd("./weight")
# After setwd(), the weight files are in the current directory
file <- list.files(".", pattern = "\\.wgt\\.RDat$")
weightlist <- list()

for (i in seq_along(file)) {
  load(file[i])  # loads wgt.matrix and snps
  weight <- wgt.matrix[, "top1"]

  # Find the GWAS row for each FUSION SNP: by SNPid if available, otherwise by CHR_POS
  if ("SNPid" %in% colnames(GWAS)) {
    idx <- match(snps$V2, GWAS$SNPid)
  } else {
    idx <- match(paste0(snps$V1, "_", snps$V4), GWAS$CHR_POS)
  }

  # Flip the sign where the FUSION allele (snps$V5) differs from the GWAS alt allele;
  # SNPs not found in the GWAS (NA in idx) are left unchanged
  flip <- which(as.character(snps$V5) != GWAS$alt[idx])
  weight[flip] <- (-1) * weight[flip]

  weightlist[[i]] <- weight
  names(weightlist)[i] <- sub(".wgt.RDat", "", file[i], fixed = TRUE)
}

betax <- weightconvert(weightlist)
gene <- names(weightlist)
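
The allele-alignment step can be checked in isolation on toy data. The sketch below is only an illustration: the `snps`, `weight`, and `GWAS` objects are invented stand-ins rather than real FUSION or GWAS files, and the `snps` column layout (V2 = SNP id, V5 = effect allele) is assumed to follow the PLINK .bim convention.

```r
# Toy stand-in for the `snps` table loaded from a FUSION .wgt.RDat file (assumed layout)
snps <- data.frame(V1 = c(1, 1, 1),
                   V2 = c("rs1", "rs2", "rs3"),
                   V4 = c(100, 200, 300),
                   V5 = c("A", "C", "G"),
                   stringsAsFactors = FALSE)

# Toy top1 weights, one per row of `snps`
weight <- c(rs1 = 0.5, rs2 = -0.2, rs3 = 0.1)

# Toy GWAS table: different row order, rs2 has a different alt allele, rs3 is absent
GWAS <- data.frame(SNPid = c("rs2", "rs1", "rs9"),
                   alt   = c("T", "A", "G"),
                   stringsAsFactors = FALSE)

# match() returns, for each weight SNP, its row in GWAS (NA if absent)
idx <- match(snps$V2, GWAS$SNPid)

# Flip the sign only where both alleles are known and disagree
flip <- which(snps$V5 != GWAS$alt[idx])
weight[flip] <- -weight[flip]

print(weight)
```

Here only rs2 changes sign. SNPs missing from the GWAS keep their original weight in this sketch; whether they should instead be dropped depends on the downstream analysis.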

Then, you can follow the steps in the tutorial to finish the analysis.

Please let me know if you have any other questions.

Best, Zhongshang

1667857557 commented 6 months ago

Hello Zhongshang,

Thanks for your reply. We ran the above code but encountered an error at this step:

> result<-GIFT_two_stage_summ(betax, betay, se_betay, Sigma, n, gene, in_sample_LD = F)
The numbers of rows in betax, betay, se_betay, and Sigma (rows and columns) are not matched. 

We used cat("Number of rows in betax:", nrow(betax), "\n") to check betax and found that its number of rows differs from that of the other inputs. We also noticed that the weightlist generated from the FUSION weights differs from the one in the example dataset. Could you give us some advice on this? (Three screenshots attached.)
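
One way to narrow down a mismatch like this is to print the size of every input and then compare SNP names across them. The sketch below uses invented toy objects and assumes, hypothetically, that betax and Sigma carry SNP ids as row names and that betay and se_betay are named vectors; the objects actually produced by weightconvert and expected by GIFT_two_stage_summ may be structured differently, so the GIFT tutorial remains the reference.

```r
# Toy inputs with deliberately mismatched SNP sets (all values invented)
betax <- matrix(c(0.5, 0.2, 0.1, 0.0, 0.3, 0.4), nrow = 3,
                dimnames = list(c("rs1", "rs2", "rs3"), c("geneA", "geneB")))
betay <- c(rs1 = 0.01, rs2 = -0.02)
se_betay <- c(rs1 = 0.005, rs2 = 0.004)
Sigma <- diag(2)
dimnames(Sigma) <- list(c("rs1", "rs2"), c("rs1", "rs2"))

# Report the size of each input
cat("rows in betax:", nrow(betax), "\n")
cat("length of betay:", length(betay), "\n")
cat("length of se_betay:", length(se_betay), "\n")
cat("dim of Sigma:", dim(Sigma), "\n")

# SNPs present in every input
common <- Reduce(intersect, list(rownames(betax), names(betay),
                                 names(se_betay), rownames(Sigma)))

# Restrict every input to the shared SNPs, in the same order
betax_c <- betax[common, , drop = FALSE]
betay_c <- betay[common]
se_betay_c <- se_betay[common]
Sigma_c <- Sigma[common, common, drop = FALSE]
```

After the restriction, all four objects describe the same SNPs in the same order, which is the condition the error message appears to be checking.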

yuanzhongshang commented 6 months ago

Hi Yu-Feng,

Our eQTL data are from GEUVADIS; they were not downloaded from the website. In addition, our example dataset is simulated, not real data.

Best, Zhongshang