Al-Murphy / MungeSumstats

Rapid standardisation and quality control of GWAS or QTL summary statistics
https://doi.org/doi:10.18129/B9.bioc.MungeSumstats

save_format error #151

Closed adannasusan closed 1 year ago

adannasusan commented 1 year ago

1. Bug description

Whenever I use save_format='LDSC' in format_sumstats(), it always returns an error.

Console output


Error in MungeSumstats::format_sumstats(path = file, ref_genome = "GRCh37", : unused argument (save_format = "LDSC")

Expected behaviour

I expected it to save my sumstats in LDSC format.

2. Reproducible example

Code


library(tidyverse)
library(rio)
library(MungeSumstats)

# Prep summary statistics using MungeSumstats.
# Decided to prep with LDSC instead, to get the correct format for FUSION.
# The save_format = "LDSC" option in MungeSumstats was not working.

inputdir <- "/work/kylab/adanna/dPUFAcPUFA/mergedGEM/combineALL/"
setwd(inputdir)

phenos <- c("w3FA_NMR", "w3FA_NMR_TFAP", "w6FA_NMR", "w6FA_NMR_TFAP",
            "w6_w3_ratio_NMR", "DHA_NMR","DHA_NMR_TFAP", "LA_NMR",
            "LA_NMR_TFAP", "PUFA_NMR", "PUFA_NMR_TFAP", "MUFA_NMR", 
            "MUFA_NMR_TFAP", "PUFA_MUFA_ratio_NMR")

for (i in seq_along(phenos)) {
  # Read the summary statistics for this phenotype
  file <- as_tibble(read.table(paste0(inputdir, phenos[i], "fishOilALL.txt"),
                               header = TRUE, stringsAsFactors = FALSE))

  # Keep only the columns needed for munging
  file <- file %>% select(c('RSID', 'CHR', 'POS', 'Effect_Allele', 'Non_Effect_Allele',
                            'Beta_G.Final_Status', 'robust_SE_Beta_G.Final_Status',
                            'robust_P_Value_Interaction'))
  #file <- file %>% select(-SNPID)
  names(file) <- c("SNP", 'CHR', 'POS', 'Non_Effect_Allele', 'Effect_Allele', 'Beta',
                   'SE', 'P')

  #names(file)[names(file) == 'robust_P_Value_Interaction'] <- 'P'

  # Save each munged file under mungedFiles/ in the input directory
  munged_file <- MungeSumstats::format_sumstats(path = file, ref_genome = "GRCh37", compute_z = TRUE,
                                                save_path = paste0(inputdir, "mungedFiles/",
                                                                   phenos[i], ".tsv"),
                                                save_format = 'LDSC')
}


3. Session info


R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /apps/eb/OpenBLAS/0.3.12-GCC-10.2.0/lib/libopenblas_zenp-r0.3.12.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] MungeSumstats_1.4.5 rio_0.5.29 forcats_0.5.1
[4] stringr_1.5.0 dplyr_1.1.2 purrr_1.0.1
[7] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
[10] ggplot2_3.4.2 tidyverse_1.3.2

loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.62.0
[3] fs_1.5.2 bit64_4.0.5
[5] lubridate_1.8.0 filelock_1.0.2
[7] progress_1.2.2 httr_1.4.6
[9] googleAuthR_2.0.1 GenomeInfoDb_1.32.2
[11] tools_4.2.1 backports_1.4.1
[13] utf8_1.2.2 R6_2.5.1
[15] DBI_1.1.3 BiocGenerics_0.42.0
[17] colorspace_2.0-3 withr_2.5.0
[19] prettyunits_1.1.1 tidyselect_1.2.0
[21] bit_4.0.5 curl_5.0.1
[23] compiler_4.2.1 cli_3.6.1
[25] rvest_1.0.3 Biobase_2.56.0
[27] xml2_1.3.3 DelayedArray_0.22.0
[29] rtracklayer_1.56.1 scales_1.2.1
[31] rappdirs_0.3.3 digest_0.6.31
[33] Rsamtools_2.12.0 foreign_0.8-82
[35] R.utils_2.12.0 XVector_0.36.0
[37] pkgconfig_2.0.3 MatrixGenerics_1.8.1
[39] fastmap_1.1.0 dbplyr_2.2.1
[41] BSgenome_1.64.0 rlang_1.1.1
[43] readxl_1.4.0 RSQLite_2.2.15
[45] BiocIO_1.6.0 generics_0.1.3
[47] jsonlite_1.8.5 BiocParallel_1.30.3
[49] zip_2.2.0 R.oo_1.25.0
[51] googlesheets4_1.0.0 VariantAnnotation_1.42.1
[53] RCurl_1.98-1.7 magrittr_2.0.3
[55] GenomeInfoDbData_1.2.8 Matrix_1.4-1
[57] Rcpp_1.0.10 munsell_0.5.0
[59] S4Vectors_0.34.0 fansi_1.0.3
[61] lifecycle_1.0.3 R.methodsS3_1.8.2
[63] stringi_1.7.12 yaml_2.3.7
[65] SummarizedExperiment_1.26.1 zlibbioc_1.42.0
[67] BiocFileCache_2.4.0 blob_1.2.3
[69] grid_4.2.1 parallel_4.2.1
[71] crayon_1.5.2 lattice_0.20-45
[73] Biostrings_2.64.0 haven_2.5.0
[75] GenomicFeatures_1.48.3 KEGGREST_1.36.3
[77] hms_1.1.3 pillar_1.9.0
[79] GenomicRanges_1.48.0 rjson_0.2.21
[81] biomaRt_2.52.0 codetools_0.2-18
[83] stats4_4.2.1 reprex_2.0.1
[85] XML_3.99-0.10 glue_1.6.2
[87] data.table_1.14.8 modelr_0.1.8
[89] png_0.1-7 vctrs_0.6.2
[91] tzdb_0.3.0 cellranger_1.1.0
[93] gtable_0.3.3 assertthat_0.2.1
[95] cachem_1.0.6 openxlsx_4.2.5
[97] broom_1.0.0 restfulr_0.0.15
[99] googledrive_2.0.0 gargle_1.2.0
[101] AnnotationDbi_1.58.0 memoise_2.0.1
[103] GenomicAlignments_1.32.0 IRanges_2.30.0

Al-Murphy commented 1 year ago

Hey, I see you have quite an old version of MSS installed (1.4.5); the current release is v1.8.0. Can you please update to at least this version and see if the error persists? If it does, please also attach a dataset that gives the same error so I can work to debug it (the save LDSC option works on our sample data with v1.8.0). Cheers, Alan.
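
For anyone hitting the same error, a quick way to check whether the installed version is the likely cause is sketched below. This is only an illustration: `needs_update()` is a hypothetical helper, not part of MungeSumstats, and the 1.8.0 threshold is simply the release mentioned above. An "unused argument" error generally means the argument is not among the function's formals in the installed version, so the sketch also inspects `formals()` when the package is available.

```r
# Hypothetical helper (not part of MungeSumstats): TRUE when the installed
# version string is older than the minimum version we want.
needs_update <- function(installed, minimum = "1.8.0") {
  package_version(installed) < package_version(minimum)
}

needs_update("1.4.5")  # the version listed in the session info
needs_update("1.8.0")  # the release mentioned in this comment

# Only inspect the package if it is actually installed
if (requireNamespace("MungeSumstats", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("MungeSumstats"))
  message("Installed MungeSumstats: ", installed)

  # Check whether this version of format_sumstats() has a save_format argument
  has_arg <- "save_format" %in% names(formals(MungeSumstats::format_sumstats))
  message("format_sumstats() accepts save_format: ", has_arg)

  if (needs_update(installed)) {
    message("Consider updating, e.g. BiocManager::install(\"MungeSumstats\")")
  }
}
```

Which release `BiocManager::install()` offers depends on the R and Bioconductor versions in use, so a newer R may be needed before the update takes effect.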

Al-Murphy commented 1 year ago

Closing because of inactivity; feel free to reopen if this doesn't clear things up.