Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

impute_beta appears to be not an argument #135

Closed irisjansen closed 1 year ago

irisjansen commented 1 year ago

Hi! I love this R package, thank you for designing it. I am using MungeSumstats_1.2.4 and I would like to impute beta from n, z and frq, as there is no beta available in my summary statistics. The 'Getting-Started' manual describes an argument that should make this possible. However, this argument seems to be missing from the format_sumstats function. I think I have the most up-to-date version, so what am I doing wrong?

In addition, the manual describes that, if the impute_beta argument is present, it calculates beta as log(OR) or Z x SE. I do not have OR or SE, so does the argument also calculate beta from n, z and frq? If not, I would just want format_sumstats to ignore the fact that there is no beta available. I also tried this with effect_columns_nonzero=TRUE, but then format_sumstats still will not run. So if beta cannot be imputed, how do I munge sumstats that do not have beta, OR or SE?

So, the missing impute_beta argument:

args(format_sumstats)
function (path, ref_genome = NULL, convert_ref_genome = NULL,
    convert_small_p = TRUE, compute_z = FALSE, force_new_z = FALSE,
    compute_n = 0L, convert_n_int = TRUE, analysis_trait = NULL,
    INFO_filter = 0.9, FRQ_filter = 0, pos_se = TRUE,
    effect_columns_nonzero = FALSE, N_std = 5, N_dropNA = TRUE,
    rmv_chr = c("X", "Y", "MT"), rmv_chrPrefix = TRUE,
    on_ref_genome = TRUE, strand_ambig_filter = FALSE,
    allele_flip_check = TRUE, allele_flip_drop = TRUE,
    allele_flip_z = TRUE, allele_flip_frq = TRUE,
    bi_allelic_filter = TRUE, snp_ids_are_rs_ids = TRUE,
    remove_multi_rs_snp = FALSE, frq_is_maf = TRUE,
    sort_coordinates = TRUE, nThread = 1,
    save_path = tempfile(fileext = ".tsv.gz"), write_vcf = FALSE,
    tabix_index = FALSE, return_data = FALSE,
    return_format = "data.table", ldsc_format = FALSE,
    log_folder_ind = FALSE, log_mungesumstats_msgs = FALSE,
    log_folder = tempdir(), imputation_ind = FALSE, force_new = FALSE,
    mapping_file = sumstatsColHeaders)

Thanks a lot in advance! Iris

Al-Murphy commented 1 year ago

Hey! Great to hear you like the package. I believe the issues you are having are caused by the version you are using. I strongly advise you to update R to 4.2 and Bioconductor to 3.16, and then install MSS >= 1.6.0. There have been substantial changes to MSS between 1.2 and 1.6, so this should resolve these issues. If they persist after you update, let me know.
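For anyone hitting the same problem, here is a minimal sketch of how to confirm which version is installed and whether `format_sumstats()` exposes `impute_beta`. The helper `has_arg()` is a hypothetical name introduced here for illustration, not part of MungeSumstats. The update commands are left commented out because they change the local installation; they assume R 4.2 is already installed.

```r
# Hypothetical helper (not part of MungeSumstats): does `fun` take `arg`?
has_arg <- function(fun, arg) {
  arg %in% names(formals(fun))
}

# Only query the package if it is actually installed.
if (requireNamespace("MungeSumstats", quietly = TRUE)) {
  # Installed version, e.g. 1.2.4 in the report above.
  print(utils::packageVersion("MungeSumstats"))
  # FALSE on the old version reported above; expected TRUE after updating.
  print(has_arg(MungeSumstats::format_sumstats, "impute_beta"))
}

# To update (run manually):
# install.packages("BiocManager")
# BiocManager::install(version = "3.16")
# BiocManager::install("MungeSumstats")
```

If the second value printed is `FALSE`, the installed release predates the argument and updating as advised above is the first thing to try.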

Thanks, Alan

irisjansen commented 1 year ago

Great, I will do. Thanks a lot for your fast response.

Does the newest version also impute beta based on n, freq and z?

Al-Murphy commented 1 year ago

Great! You can impute it in two ways (log odds ratio, or z-score multiplied by SE), as described below:

#' @param impute_beta Binary, whether BETA should be imputed using other effect
#' data if it isn't present in the sumstats. Note that this imputation is an 
#' approximation (for Z & SE approach) so could have an effect on downstream 
#' analysis. Use with caution. The different methods MungeSumstats will try and 
#' impute beta (in this order or priority) are: 
#' 1. log(OR)  2. Z x SE
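
To make the documented priority order concrete, here is a rough sketch of the two rules in base R. This is not the package's own implementation: `impute_beta_sketch()` is a hypothetical function, and the column names `BETA`, `OR`, `Z` and `SE` are assumed to be already standardised. It only illustrates "try log(OR) first, then Z x SE".

```r
# Hypothetical illustration of the documented order; not MungeSumstats code.
impute_beta_sketch <- function(df) {
  # Nothing to do if BETA is already present.
  if ("BETA" %in% names(df)) {
    return(df)
  }
  if ("OR" %in% names(df)) {
    # 1. log odds ratio
    df$BETA <- log(df$OR)
  } else if (all(c("Z", "SE") %in% names(df))) {
    # 2. z-score times standard error (an approximation, as noted above)
    df$BETA <- df$Z * df$SE
  } else {
    stop("Cannot impute BETA: need OR, or both Z and SE")
  }
  df
}

# Example input with only Z and SE available.
example <- impute_beta_sketch(data.frame(Z = c(2, -1), SE = c(0.5, 0.1)))
print(example)
```

Note that neither rule uses N or FRQ, so in this sketch a file with only N, Z and FRQ falls through to the error branch.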

Al-Murphy commented 1 year ago

Closing this for now but feel free to reopen if this did not resolve your issue!