n-mounier / MRlap

R package to perform two-sample Mendelian Randomisation (MR) analyses using (potentially) overlapping samples
GNU General Public License v2.0

Why is the uncorrected result from MRlap different from the IVW method, and where is the p-value for the LDSC results? #20

Open laleoarrow opened 1 month ago

laleoarrow commented 1 month ago

Update: This can be solved by merging the input with the hm3 SNP list in advance. However, the IVW-MR results from MRlap are slightly different from the IVW results from the tsmr pipeline using TwoSampleMR (I provided MRlap with the same IVs used for tsmr by adding the argument user_SNPsToKeep = dat_h_mrlap$SNP). Any thoughts are appreciated!

Original Q: Hi, all! I am trying to use MRlap to support my IVW results, but I'm having problems running it. Maybe I have misunderstood something. Here is my code, written as instructed. It runs much more slowly than the demo example, on a MacBook Pro (2023) with an M3 Max and 128 GB RAM.

A <- MRlap(exposure = as.data.frame(exposure_data), # full summary statistics with 61,965,710 SNPs
           exposure_name = exposure_name,
           outcome = as.data.frame(outcome_data),   # full summary statistics with 9,361,375 SNPs
           outcome_name = outcome_name,
           ld = ld_path,
           hm3 = hm3_path)

Should I merge the full summary data with the w_hm3.noMHC.snplist in advance? If so, does that affect the IV selection when performing an MRlap-based MR analysis? Any thoughts are appreciated!

Here is my log

> Checking parameters 
The p-value threshold used for selecting MR instruments is: 5e-08 
The distance used for pruning MR instruments is:  500 Kb 
Distance-based pruning will be used for MR instruments 
> Processing exposure (T1D_Chiou) summary statistics... 
# Preparation of the data... 
The data.frame used as input is: "as.data.frame(exposure_data)".  
   SNPID column, ok - CHR column, ok - POS column, ok - ALT column, ok - REF column, ok - BETA column, ok - SE column, ok - N column, ok 
> Processing outcome (Iridocyclitis_FinnGen) summary statistics... 
# Preparation of the data... 
The data.frame used as input is: "as.data.frame(outcome_data)".  
   SNPID column, ok - CHR column, ok - POS column, ok - ALT column, ok - REF column, ok - BETA column, ok - SE column, ok - N column, ok 
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 
<<< Performing cross-trait LDSC >>>  
> Munging exposure data... 
> Munging outcome data... 
> Running cross-trait LDSC... 
  Please consider saving the log files and checking them to ensure that all columns were interpreted correctly and no warnings were issued for any of the summary statistics files
> Cleaning temporary files... 
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> 
<<< Running IVW-MR >>>  
Error in `dplyr::inner_join()`:
! This join would result in more rows than dplyr can handle.
ℹ 1621359391766 rows would be returned. 2147483647 rows is the maximum number allowed.
ℹ Double check your join keys. This error commonly occurs due to a missing join key, or an improperly specified join condition.
Run `rlang::last_trace()` to see where the error occurred.
Called from: signal_abort(cnd, .file)
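
The dplyr message above points at the join keys. One possible cause (an assumption, not confirmed in this thread) is that some SNP identifiers occur many times in the input, for example a placeholder ID shared by many variants, so that a join on that column multiplies rows. A minimal sketch for spotting repeated identifiers, using a made-up toy data frame (`exposure_toy` and its values are hypothetical):

```r
# Toy data standing in for real summary statistics (values are made up).
exposure_toy <- data.frame(
  SNP  = c("rs1", "rs2", "rs2", ".", "."),
  BETA = c(0.10, 0.05, 0.05, 0.02, -0.01)
)

# Count how often each SNP identifier occurs and keep the repeated ones.
id_counts <- table(exposure_toy$SNP)
repeated_ids <- id_counts[id_counts > 1]
print(repeated_ids)

# Drop every row whose identifier is not unique before passing the data on.
exposure_unique <- exposure_toy[!exposure_toy$SNP %in% names(repeated_ids), ]
print(nrow(exposure_unique))
```

Whether dropping or renaming such rows is appropriate depends on the data; the sketch only shows how to detect them.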

laleoarrow commented 1 month ago

Hi,

I have managed to reduce the MRlap input and restrict the IVs to those used in my IVW analysis with the following code, and it runs smoothly now. However, I have a few questions:

1. Why is A$MRcorrection$observed_effect different from the results obtained by IVW? Since I used the same IVs, shouldn't the uncorrected effect be the same as the IVW result?
2. Which element of the output holds the p-value for the LDSC rg?

library(data.table)  # fread()
library(dplyr)       # %>%, select(), filter()

# Keep only the HapMap3 SNPs plus the IVs used in the TwoSampleMR analysis
hm3_snp.list <- fread(hm3_path) %>% select(SNP)
iv_snp.list <- dat_h_mrlap %>% select(SNP)
snp.list <- rbind(hm3_snp.list, iv_snp.list) %>% unique()
exposure_data <- exposure_data %>% filter(SNP %in% snp.list$SNP) %>% select(-Neff)
outcome_data <- outcome_data %>% filter(SNP %in% snp.list$SNP) %>% select(-Neff)

# MR-lap
A <- MRlap(exposure = exposure_data,
           exposure_name = exposure_name,
           outcome = outcome_data,
           outcome_name = outcome_name,
           ld = ld_path,
           hm3 = hm3_path,
           do_pruning = FALSE,
           user_SNPsToKeep = dat_h_mrlap$SNP)
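
For question 1, computing the IVW estimate by hand may help narrow down where the two pipelines diverge. The sketch below uses made-up numbers (none are taken from this analysis) and shows the standard closed-form IVW estimate, a check against a weighted regression through the origin, and how the estimate moves when the exposure effects are rescaled. Whether MRlap and TwoSampleMR use effect sizes on the same scale is an assumption to verify against the package documentation, not something established in this thread.

```r
# Toy harmonised data (made-up numbers, not taken from this analysis).
b_exp  <- c(0.12, 0.08, -0.10, 0.15)      # SNP-exposure effects
b_out  <- c(0.030, 0.018, -0.022, 0.041)  # SNP-outcome effects
se_out <- c(0.010, 0.012, 0.011, 0.015)   # SNP-outcome standard errors

# Inverse-variance weighted estimate: weighted regression of b_out on b_exp
# through the origin, with weights 1 / se_out^2.
ivw <- function(bx, by, sy) sum(bx * by / sy^2) / sum(bx^2 / sy^2)

ivw_raw <- ivw(b_exp, b_out, se_out)

# The same estimate via lm(), to confirm the closed form.
ivw_lm <- unname(coef(lm(b_out ~ 0 + b_exp, weights = 1 / se_out^2)))

# Rescaling the exposure effects by a constant (as a change of units or a
# standardisation would) divides the estimate by that constant.
scale_exp <- 2
ivw_rescaled <- ivw(b_exp * scale_exp, b_out, se_out)

print(c(raw = ivw_raw, lm = ivw_lm, rescaled = ivw_rescaled))
```

If the hand-computed value on the raw betas matches the TwoSampleMR result but not the MRlap one, comparing the effect sizes each tool actually uses for the IVs would be a natural next step.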

Thank you for any assistance!