RajLabMSSM / echolocatoR

Automated statistical and functional fine-mapping pipeline with extensive API access to datasets.
https://rajlabmssm.github.io/echolocatoR
MIT License
30 stars 11 forks source link

Conditional SNP error #124

Open vermaa1 opened 1 year ago

vermaa1 commented 1 year ago

1. Bug description

finemap_loci() errors out with conditioned_snps must have a length equal to 1 or the number of loci

2. Reproducible example

Code

columnnames<-echodata::construct_colmap(
       munged = FALSE,
       CHR = "chrom",
       POS = "pos",
       SNP = "rsid",
       P = "pval",
       Effect = "beta",
       StdErr = "sebeta",
       Freq = "af",
       MAF = "maf",
       A1 = "ref",
       A2 = "alt",
       N_cases = "num_cases.META",
       N_controls = "num_controls.META",
       proportion_cases = "calculate",
       N = "num_samples.META",
       verbose = TRUE
     )

fullSS_path_file<-"summary.txt.gz"

finemap_loci(
       loci = c("TCF7L2"),
       fullSS_path=fullSS_path_file,
       #data_gene,
       fullSS_genome_build = "hg19",
       results_dir = file.path(tempdir(), "results"),
       dataset_name = "dataset_name",
       dataset_type = "GWAS",
       topSNPs = top_snps_1,
       force_new_subset = TRUE,
       force_new_LD = FALSE,
       force_new_finemap = TRUE,
       finemap_methods = c("ABF", "FINEMAP", "SUSIE"),
       finemap_args = NULL,
       n_causal = 5,
       credset_thresh = 0.95,
       consensus_thresh = 2,
       fillNA = 0,
       conditioned_snps = "auto",
       priors_col = NULL,
       munged = FALSE,
       colmap = columnnames,
       compute_n = "ldsc",
       LD_reference = "1KGphase3",
       LD_genome_build = "hg19",
       leadSNP_LD_block = FALSE,
       superpopulation = "EUR",
       download_method = "axel",
       bp_distance = 5e+05,
       min_POS = NA,
       max_POS = NA,
       min_MAF = NA,
       trim_gene_limits = FALSE,
       max_snps = NULL,
       min_r2 = 0,
       remove_variants = FALSE,
       remove_correlates = FALSE,
       query_by = "tabix",
       case_control = TRUE,
       qtl_suffixes = NULL,

       return_all = TRUE,
       use_tryCatch = TRUE,
       seed = 2022,
       nThread = 1,
       verbose = TRUE)

Console output

┌──────────────────────────────────────────┐                                                                                                                   │·······················································
│                                          │                                                                                                                   │·······················································
│   )))> 🦇 TCF7L2 [locus 1 / 1] 🦇 <(((   │                                                                                                                   │·······················································
│                                          │                                                                                                                   │·······················································
└──────────────────────────────────────────┘                                                                                                                   │·······················································
conditioned_snps must have a length equal to 1 or the number of lociLocus TCF7L2 complete in: 0 min                                                            │·······················································
                                                                                                                                                               │·······················································
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────│·······················································
                                                                                                                                                               │·······················································
── Step 6 ▶▶▶ Postprocess data 🎁 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────│·······················································
                                                                                                                                                               │·······················································
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────│·······················································
Returning results as nested list.                                                                                                                              │·······················································
All loci done in: 0 min                                                                                                                                        │·······················································
$TCF7L2                                                                                                                                                        │·······················································
NULL                                                                                                                                                           │·······················································
                                                                                                                                                               │·······················································
$merged_dat                                                                                                                                                    │·······················································
Null data.table (0 rows and 0 cols)  

Data

Header of lead SNP dataframe

head(top_snp_1)

    Locus   Gene CHR        BP        SNP             P  Effect                                                                                                │·······················································
1: TCF7L2 TCF7L2  10 112994312 rs34872471 9.881313e-24 -0.2316                                                                                                │·······················································
2: TCF7L2 TCF7L2  10 112990398 rs11196182  5.800000e-18  0.1347 

3. Session info

(Add output of the R function utils::sessionInfo() below. This helps us assess version/OS conflicts which could be causing bugs.)

``` > utils::sessionInfo() │··························· R version 4.1.3 (2022-03-10) │··························· Platform: x86_64-conda-linux-gnu (64-bit) │··························· Running under: Red Hat Enterprise Linux 8.6 (Ootpa) │··························· │··························· Matrix products: default │··························· BLAS/LAPACK: /autofs/nccs-svm1_home1/anuragv/.conda/envs/echoverse/lib/libopenblasp-r0.3.21.so │··························· │··························· locale: │··························· [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C │··························· [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 │··························· [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 │··························· [7] LC_PAPER=en_US.UTF-8 LC_NAME=C │··························· [9] LC_ADDRESS=C LC_TELEPHONE=C │··························· [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C │··························· │··························· attached base packages: │··························· [1] stats4 stats graphics grDevices utils datasets methods │··························· [8] base │··························· │··························· other attached packages: │··························· [1] BSgenome.Hsapiens.1000genomes.hs37d5_0.99.1 │··························· [2] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20 │··························· [3] BSgenome_1.62.0 │··························· [4] rtracklayer_1.54.0 │··························· [5] Biostrings_2.62.0 │··························· [6] XVector_0.34.0 │··························· [7] GenomicRanges_1.46.1 │··························· [8] GenomeInfoDb_1.30.1 │······················································· [9] IRanges_2.28.0 │······················································· [10] S4Vectors_0.32.4 │······················································· [11] BiocGenerics_0.40.0 │······················································· [12] echolocatoR_2.0.1 │······················································· [13] data.table_1.14.4 │······················································· [14] forcats_0.5.2 │······················································· [15] stringr_1.4.1 │······················································· [16] dplyr_1.0.10 [31] dbplyr_2.2.1 readxl_1.4.1 [0/1811]│··························· [33] Rgraphviz_2.38.0 igraph_1.3.5 │··························· [35] DBI_1.1.3 htmlwidgets_1.5.4 │··························· [37] reshape_0.8.9 downloadR_0.99.4 │··························· [39] googledrive_2.0.0 ellipsis_0.3.2 │··························· [41] backports_1.4.1 biomaRt_2.50.3 │··························· [43] deldir_1.0-6 MatrixGenerics_1.6.0 │··························· [45] MungeSumstats_1.2.4 vctrs_0.4.2 │··························· [47] Biobase_2.54.0 ensembldb_2.18.4 │··························· [49] cachem_1.0.6 withr_2.5.0 │··························· [51] checkmate_2.1.0 vroom_1.6.0 │··························· [53] GenomicAlignments_1.30.0 prettyunits_1.1.1 │··························· [55] cluster_2.1.4 ape_5.6-2 │··························· [57] dir.expiry_1.2.0 lazyeval_0.2.2 │··························· [59] crayon_1.5.2 basilisk.utils_1.6.0 │··························· [61] crul_1.3 pkgconfig_2.0.3 │··························· [63] nlme_3.1-160 ProtGenerics_1.26.0 │··························· [65] XGR_1.1.8 nnet_7.3-18 │··························· [67] gitcreds_0.1.2 rlang_1.0.6 │··························· [69] lifecycle_1.0.3 filelock_1.0.2 │··························· [71] httpcode_0.3.0 BiocFileCache_2.2.1 │··························· [73] modelr_0.1.9 echotabix_0.99.8 │··························· [75] dichromat_2.0-0.1 cellranger_1.1.0 │··························· [77] coloc_5.1.0.1 matrixStats_0.62.0 │··························· [79] graph_1.72.0 Matrix_1.5-1 │··························· [81] osfr_0.2.9 boot_1.3-28 │··························· [83] reprex_2.0.2 base64enc_0.1-3 │··························· [85] googlesheets4_1.0.1 png_0.1-7 │··························· [87] viridisLite_0.4.1 rjson_0.2.21 │··························· [89] rootSolve_1.8.2.3 bitops_1.0-7 │··························· [91] R.oo_1.25.0 ggnetwork_0.5.10 │··························· [93] blob_1.2.3 mixsqp_0.3-43 │··························· [95] echoplot_0.99.5 dnet_1.1.7 │··························· [97] jpeg_0.1-9 echodata_0.99.14 │··························· [99] scales_1.2.1 memoise_2.0.1 │······················································· [101] magrittr_2.0.3 plyr_1.8.7 │······················································· [103] hexbin_1.28.2 zlibbioc_1.40.0 │······················································· [105] compiler_4.1.3 echoconda_0.99.7 │······················································· [107] BiocIO_1.4.0 RColorBrewer_1.1-3 │······················································· [109] catalogueR_1.0.0 Rsamtools_2.10.0 │······················································· [111] cli_3.4.1 echoannot_0.99.7 │······················································· [113] patchwork_1.1.2 htmlTable_2.4.1 │······················································· [115] Formula_1.2-4 MASS_7.3-58.1 │······················································· [117] tidyselect_1.2.0 stringi_1.7.8 │······················································· [119] yaml_2.3.5 supraHex_1.32.0 │······················································· [121] latticeExtra_0.6-30 ggrepel_0.9.1 │······················································· [123] grid_4.1.3 VariantAnnotation_1.40.0 │······················································· [125] tools_4.1.3 lmom_2.9 │······················································· [127] parallel_4.1.3 rstudioapi_0.14 │······················································· [129] foreign_0.8-83 piggyback_0.1.4 │······················································· [131] gridExtra_2.3 gld_2.6.5 │······················································· [133] digest_0.6.29 snpStats_1.44.0 │······················································· [135] BiocManager_1.30.18 Rcpp_1.0.9 │······················································· [137] broom_1.0.1 OrganismDbi_1.36.0 │······················································· [139] httr_1.4.4 AnnotationDbi_1.56.2 │······················································· [141] RCircos_1.2.2 ggbio_1.42.0 │······················································· [143] biovizBase_1.42.0 colorspace_2.0-3 │······················································· [145] rvest_1.0.3 XML_3.99-0.11 │······················································· [147] fs_1.5.2 reticulate_1.26 │······················································· [149] splines_4.1.3 RBGL_1.70.0 │······················································· [151] expm_0.999-6 gh_1.3.1 │······················································· [153] echofinemap_0.99.3 basilisk_1.6.0 │······················································· [155] Exact_3.2 jsonlite_1.8.2 │······················································· [157] susieR_0.12.27 R6_2.5.1 │······················································· [159] Hmisc_4.7-1 pillar_1.8.1 │······················································· [161] htmltools_0.5.3 glue_1.6.2 │······················································· [163] fastmap_1.1.0 DT_0.25 │······················································· [165] BiocParallel_1.28.3 class_7.3-20 │······················································· [167] mvtnorm_1.1-3 utf8_1.2.2 │······················································· [169] lattice_0.20-45 curl_4.3.3 │······················································· [171] DescTools_0.99.46 zip_2.2.1 │······················································· [173] openxlsx_4.2.5 interp_1.1-3 │······················································· [175] survival_3.4-0 googleAuthR_2.0.0 │······················································· [177] munsell_0.5.0 e1071_1.7-11 │······················································· [179] GenomeInfoDbData_1.2.7 haven_2.5.1 │······················································· [181] reshape2_1.4.4 gtable_0.3.1 ```
vermaa1 commented 1 year ago

Please let me know if you need other details to troubleshoot the issue.

vermaa1 commented 1 year ago

@bschilder - Do you have any recommendations on how to resolve this issue?

bschilder commented 1 year ago

haven't had a chance to look work on this project for a while but will let you know when i do

bschilder commented 1 year ago

Hi @vermaa1, sorry for the delay, finally have some time to look at this now.

It looks like you may have sent just a portion of the console output. Would you mind sending the full output? Actually, I think it's just abbreviated bc it exited early.

bschilder commented 1 year ago

Ok, actually I think I see what's happening here. The conditioned SNPs are being extracted automatically from the topSNPs dataframe, which contains multiple rows for the same locus. However you specify to finemap only one locus (loci = c("TCF7L2") so there's a mismatch between the number of proposed conditioned SNPs and the number of loci actually being analyzed. But in practice, users should be able to specify however many conditioned SNPs they want per Locus.

I've just added some more flexible handling of conditioned SNPs (with warning messages about how conditioned_snps will be used) to avoid this. If you reinstall, and rerun can you let me know if this works for you?

vermaa1 commented 1 year ago

Okay. I will give it a try.

bschilder commented 1 year ago

Did this work for you @vermaa1 ?