ajaynadig / bhr

Suite of heritability and genetic correlation estimation tools for exome-sequencing data
MIT License
31 stars 6 forks source link

The question about all heritability and the heritability of different mutation types. #21

Open 24junwei opened 4 months ago

24junwei commented 4 months ago

Dear BHR team, Thank you for your research on burden heritability. I have truly benefited from it, and now I have some questions I'd like to consult you on.

This is my input file, and it actually contains four variants type:LoF (Loss of function), missense mutation, nonsynonymous mutation (The probability predicted by the machine learning model is represented by a value between 0 and 1 for mpc.), nonsynonymous mutation.

      SNP         gene_id   consequence ac_case ac_ctrl      locus   mpc

1:69095:T:G ENSG00000186092 nonsynonymous 0 1 chr1:69095 0.398 1:69144:C:T ENSG00000186092 synonymous 0 1 chr1:69144 NA 1:69149:T:A ENSG00000186092 nonsynonymous 0 18 chr1:69149 0.490 1:69173:A:T ENSG00000186092 nonsynonymous 0 2 chr1:69173 0.278 1:69202:A:G ENSG00000186092 nonsynonymous 0 2 chr1:69202 0.078 1:69217:G:A ENSG00000186092 nonsynonymous 0 1 chr1:69217 0.010

Result: all Burden Heritability: 0.086 (0.0105) #n snp= 6391349 LoF Burden Heritability: 0.3585 (0.0145) #n snp= 372406 other_missense Burden Heritability: 0.385 (0.0108) #n snp= 91370 nonsynonymous_mpc>=0.6 Burden Heritability: 0.1883 (0.0097) #n snp= 482213 nonsynonymous_mpc<0.6 Burden Heritability: 0.0594 (0.0105) #n snp= 3538392 synonymous Burden Heritability: 0.0311 (0.0074) #n snp= 1906968

Our question: Compared to your examples, when we performed burden heritability analysis using our data, the final results showed very high heritability for loF, other missense and nonsynonymous mutation type. Is this reasonable? Could there be mistakes in my code? We sincerely appreciate your contribution and look forward to your response

Our code: `bp_variantlevel_bipex <- bigreadr::fread2("rare.input.xls") data = subset(bp_variantlevel_bipex, select = -SNP) baseline_model <- read.table("ms_baseline_oe5.txt") #from bhr/ reference_files/ms_baseline_oe5.txt

bp_sumstats_all = wrangle_sumstats(data, 1149, # case 177928) #control

bp_sumstats_LoF = wrangle_sumstats(data, 1149, 177928, data$consequence == "LoF")

bp_sumstats_missense = wrangle_sumstats(data, 1149, 177928, data$consequence == "other_missense")

bp_sumstats_nonsynonymous_mpc_big_0.6 = wrangle_sumstats(data[!is.na(data$mpc),], 1149, 177928, data$consequence[!is.na(data$mpc)] %in% c("nonsynonymous") & data$mpc[!is.na(data$mpc)] >= 0.6)

bp_sumstats_nonsynonymous_mpc_small_0.6 = wrangle_sumstats(data[!is.na(data$mpc),], 1149, 177928, data$consequence[!is.na(data$mpc)] %in% c("nonsynonymous") & data$mpc[!is.na(data$mpc)] < 0.6)

bp_sumstats_synonymous = wrangle_sumstats(data, 1149, 177928, data$consequence == "synonymous")

bp_all_bhr <- BHR(trait1_sumstats = bp_sumstats_all, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_LoF_bhr <- BHR(trait1_sumstats = bp_sumstats_LoF, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_missense_bhr <- BHR(trait1_sumstats = bp_sumstats_missense, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_nonsynonymous_bhr_mpc_big_0.6 <- BHR(trait1_sumstats = bp_sumstats_nonsynonymous_mpc_big_0.6, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_nonsynonymous_bhr_mpc_small_0.6 <- BHR(trait1_sumstats = bp_sumstats_nonsynonymous_mpc_small_0.6, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_synonymous_bhr <- BHR(trait1_sumstats = bp_sumstats_synonymous, annotations = list(baseline_model), #baseline model including constraint annotations num_blocks = 100, #number of blocks for jackknife mode = "univariate") #run in univariate mode to compute burden h2

bp_sample_prevalence = 1149/(1149 + 177928)

population_prevalence_bp = 0.00005 #disease prevalence

obs2lia_factor <- function(K, P){ X <- qnorm(K,lower.tail=FALSE) z <- (1/sqrt(2pi))(exp(-(X*2)/2)) factor <- (K(1-K)K(1-K))/(P(1-P)(z**2)) return(factor) }

bp_scalingfactor = obs2lia_factor(population_prevalence_bp,bp_sample_prevalence)

print(paste0("all Burden Heritability: ", round(bp_all_bhr$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_all_bhr$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))

print(paste0("LoF Burden Heritability: ", round(bp_LoF_bhr$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_LoF_bhr$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))

print(paste0("other_missense Burden Heritability: ", round(bp_missense_bhr$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_missense_bhr$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))

print(paste0("nonsynonymous_mpc>=0.6 Burden Heritability: ", round(bp_nonsynonymous_bhr_mpc_big_0.6$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_nonsynonymous_bhr_mpc_big_0.6$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))

print(paste0("nonsynonymous_mpc<0.6 Burden Heritability: ", round(bp_nonsynonymous_bhr_mpc_small_0.6$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_nonsynonymous_bhr_mpc_small_0.6$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))

print(paste0("synonymous Burden Heritability: ", round(bp_synonymous_bhr$mixed_model$heritabilities[1,5] bp_scalingfactor, 4), " ", "(", round(bp_synonymous_bhr$mixed_model$heritabilities[2,5] bp_scalingfactor, 4), ")"))`