Closed AMCalejandro closed 2 years ago
Once RajLabMSSM/echofinemap#11 is sorted out this is the log of the COJO run ( Note I am attaching the full log from the finemap_loci() run)
nohup: ignoring input Registered S3 method overwritten by 'GGally': method from +.gg ggplot2 Bioconductor version '3.12' is out-of-date; the current release version '3.14' is available with R version '4.1'; see https://bioconductor.org/install ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ── ✔ ggplot2 3.3.5 ✔ purrr 0.3.4 ✔ tibble 3.1.6 ✔ dplyr 1.0.7 ✔ tidyr 1.2.0 ✔ stringr 1.4.0 ✔ readr 2.1.2 ✔ forcats 0.5.1 ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ── ✖ dplyr::filter() masks stats::filter() ✖ dplyr::lag() masks stats::lag() Attaching package: 'data.table' The following objects are masked from 'package:dplyr': between, first, last The following object is masked from 'package:purrr': transpose [1] "+ CONDA:: Activating conda env 'echoR'" [1] "Checking for tabix installation..." [1] "Checking for bcftools installation..." ) ) ) ))))))}}}}}}}} {{{{{{{{{(((((( ( ( ( MAD1L1 (1 / 1) ) ) ) ))))))}}}}}}}} {{{{{{{{{(((((( ( ( ( [1] "Determining chrom type from file header" [1] "LD:: Standardizing summary statistics subset." [1] "++ Preparing Gene col" [1] "++ Preparing A1,A1 cols" [1] "++ Preparing MAF,Freq cols" [1] "++ Inferring MAF from frequency column..." [1] "++ Preparing N_cases,N_controls cols" [1] "++ Preparing `proportion_cases` col" [1] "++ 'proportion_cases' not included in data subset." [1] "++ Preparing N col" [1] "+ Preparing sample_size (N) column" [1] "++ Using `sample_size = 3572` TRUE" [1] "++ Preparing t-stat col" [1] "+ Calculating t-statistic from Effect and StdErr..." [1] "++ Assigning lead SNP" [1] "++ Ensuring Effect, StdErr, P are numeric" [1] "++ Ensuring 1 SNP per row" [1] "++ Removing extra whitespace" [1] "++ Saving subset ==> /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/MAD1L1_axial_TryingCOJO_subset.tsv.gz" [1] "LD:: Using UK Biobank LD reference panel." [1] "+ UKB LD file name: chr7_1000001_4000001" [1] "+ LD:: Downloading full .gz/.npz UKB files and saving to disk." [1] "+ Overwriting pre-existing file." [1] "+ CONDA:: Identified axel executable in echoR env." [1] "+ Overwriting pre-existing file." [1] "+ CONDA:: Identified axel executable in echoR env." [1] "+ LD:: load_ld() python function input: /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/chr7_1000001_4000001" [1] "+ LD:: Reading LD matrix into memory. This could take some time..." Processed URL: /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/chr7_1000001_4000001 Some other message at the end [1] "+ Full UKB LD matrix: 30690 x 30690" [1] "+ Full UKB LD SNP data.table: 30690 x 5" [1] "+ LD:: Saving LD matrix ==> /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/MAD1L1.UKB_LD.RDS" [1] "5872 x 5872 LD_matrix (sparse)" vvvvv-- SUSIE --vvvvv ✅ All required columns present. ✅ All suggested columns present. vvvvv-- COJO --vvvvv ✅ All required columns present. ✅ All suggested columns present. +++ Multi-finemap:: SUSIE +++ For large R or large XtX, consider installing the Rfast package for better performance. +++ Multi-finemap:: COJO +++ [1] "+ COJO:: conditional analysis -- Conditioning on: rs4721411" ******************************************************************* * Genome-wide Complex Trait Analysis (GCTA) * version 1.93.2 beta Linux * (C) 2010-present, Jian Yang, The University of Queensland * Please report bugs to Jian Yang <jian.yang.qt@gmail.com> ******************************************************************* Analysis started at 15:47:24 UTC on Wed Feb 23 2022. Hostname: kronos Error: --bfile /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/plink.fam not found An error occurs, please check the options or data [1] "+ COJO:: Processing results..." Error in data.table::fread(file.path(cojo_dir, "cojo.jma.cojo")) : File '/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/COJO/cojo.jma.cojo' does not exist or is non-readable. getwd()=='/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/SCRIPTS' In addition: Warning message: In susie_func(bhat = subset_DT$Effect, shat = subset_DT$StdErr, : IBSS algorithm did not converge in 100 iterations! Please check consistency between summary statistics and LD matrix. See https://stephenslab.github.io/susieR/articles/susierss_diagnostic.html Error in `[.data.table`(x, r, vars, with = FALSE) : column(s) not found: Conditioned_Effect Fine-mapping complete in: Time difference of 1.5 hours Error in `[.data.table`(x, r, vars, with = FALSE) : column(s) not found: SNP /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/chr7_1000001_4000001.gz /mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022/mixedmodels_GWAS/axial_TryingCOJO/MAD1L1/LD/chr7_1000001_4000001.npz
And this is the code calling for calling the function
#!usr/bin/Rscript
library(echolocatoR)
library(tidyverse)
library(data.table)
fullSS_path <- "/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/for_echolocatoR_axialOutcome_3.tsv"
fullRS_path <- "/mnt/rreal/RDS/RDS/acarrasco/ANALYSES_WORKSPACE/EARLY_PD/POST_GWAS/ECHOLOCATOR/RESULTS_23.2.2022"
top_SNPs = fread("../topSNPs_axialMotorSymptom.txt")
top_SNPs = top_SNPs[6, ]
earlymotor_PD = finemap_loci(top_SNPs = top_SNPs,
loci = top_SNPs$Locus,
dataset_name = "axial_TryingCOJO",
dataset_type = "mixedmodels_GWAS",
force_new_subset = T,
force_new_LD = T,
force_new_finemap = T,
remove_tmps = F,
# SUMMARY STATS ARGUMENTS
fullSS_path = fullSS_path,
results_dir = fullRS_path,
query_by = "tabix",
chrom_col = "CHR", position_col = "POS", snp_col = "#MarkerName",
pval_col = "Pval", effect_col = "Effect", stderr_col = "StdErr",
freq_col = "medianFreq", MAF_col = "calculate",
A1_col = "Allele1",
A2_col = "Allele2",
#N_cases_col = "TotalSampleSize",
#N_controls = 0,
# FILTERING ARGUMENTS
## It's often desirable to use a larger window size
## (e.g. 2Mb which is bp_distance=500000*2),
## but we use a small window here to speed up the process.
bp_distance = 500000*2,
min_MAF = 0.001,
trim_gene_limits = F,
# FINE-MAPPING ARGUMENTS
## General
finemap_methods = c("SUSIE","COJO"),
n_causal = 5,
PP_threshold = .95,
# LD ARGUMENTS
LD_reference = "UKB",
superpopulation = "EUR",
download_method = "axel",
# Additional arguments - My arguments
case_control = F,
nThread = 15,
sample_size = 3572,
#COJO PARAMS
conditioned_snps = "rs4721411",
# PLOT ARGUMENTS
## general
plot.types=c("fancy"),
## Generate multiple plots of different window sizes;
### all SNPs, 4x zoomed-in, and a 50000bp window
plot.zoom = c("all","4x","10x"),
## XGR
# plot.XGR_libnames=c("ENCODE_TFBS_ClusteredV3_CellTypes"),
## Roadmap
plot.Roadmap = F,
plot.Roadmap_query = NULL,
# Nott et al. (2019)
plot.Nott_epigenome = T,
plot.Nott_show_placseq = T,
verbose = F
)
As well as the session info
R version 4.0.3 (2020-10-10) Platform: x86_64-conda-linux-gnu (64-bit) Running under: Ubuntu 20.04.3 LTS Matrix products: default BLAS/LAPACK: /home/acarrasco/.conda/envs/echoR/lib/libopenblasp-r0.3.12.so locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8 [8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] data.table_1.14.2 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 echolocatoR_0.1.2 loaded via a namespace (and not attached): [1] readxl_1.3.1 backports_1.4.1 Hmisc_4.5-0 BiocFileCache_1.14.0 plyr_1.8.6 lazyeval_0.2.2 splines_4.0.3 [8] BiocParallel_1.24.1 GenomeInfoDb_1.26.7 digest_0.6.29 ensembldb_2.14.1 htmltools_0.5.2 fansi_1.0.2 magrittr_2.0.2 [15] checkmate_2.0.0 memoise_2.0.1 BSgenome_1.58.0 cluster_2.1.1 tzdb_0.2.0 Biostrings_2.58.0 modelr_0.1.8 [22] matrixStats_0.61.0 ggbio_1.38.0 askpass_1.1 prettyunits_1.1.1 jpeg_0.1-8.1 colorspace_2.0-2 rvest_1.0.2 [29] blob_1.2.2 rappdirs_0.3.3 haven_2.4.3 xfun_0.29 crayon_1.4.2 RCurl_1.98-1.5 jsonlite_1.7.3 [36] graph_1.68.0 Exact_2.1 survival_3.2-11 VariantAnnotation_1.36.0 glue_1.6.1 gtable_0.3.0 zlibbioc_1.36.0 [43] XVector_0.30.0 DelayedArray_0.16.3 BiocGenerics_0.36.1 scales_1.1.1 mvtnorm_1.1-1 DBI_1.1.2 GGally_2.1.1 [50] Rcpp_1.0.8 progress_1.2.2 htmlTable_2.1.0 foreign_0.8-81 bit_4.0.4 proxy_0.4-25 OrganismDbi_1.32.0 [57] Formula_1.2-4 stats4_4.0.3 htmlwidgets_1.5.3 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.2 pkgconfig_2.0.3 [64] reshape_0.8.8 XML_3.99-0.6 nnet_7.3-15 dbplyr_2.1.1 utf8_1.2.2 tidyselect_1.1.1 rlang_1.0.1 [71] reshape2_1.4.4 AnnotationDbi_1.52.0 munsell_0.5.0 cellranger_1.1.0 tools_4.0.3 cachem_1.0.6 cli_3.1.1 [78] generics_0.1.2 RSQLite_2.2.9 broom_0.7.12 fastmap_1.1.0 fs_1.5.2 knitr_1.37 bit64_4.0.5 [85] AnnotationFilter_1.14.0 rootSolve_1.8.2.1 RBGL_1.66.0 xml2_1.3.3 biomaRt_2.46.3 compiler_4.0.3 rstudioapi_0.13 [92] curl_4.3 png_0.1-7 e1071_1.7-6 reprex_2.0.1 DescTools_0.99.41 stringi_1.7.6 GenomicFeatures_1.42.3 [99] lattice_0.20-41 ProtGenerics_1.22.0 Matrix_1.3-2 vctrs_0.3.8 pillar_1.7.0 lifecycle_1.0.1 BiocManager_1.30.12 [106] bitops_1.0-7 lmom_2.8 rtracklayer_1.50.0 GenomicRanges_1.42.0 R6_2.5.1 latticeExtra_0.6-29 gridExtra_2.3 [113] IRanges_2.24.1 gld_2.6.2 dichromat_2.0-0 boot_1.3-27 MASS_7.3-53.1 assertthat_0.2.1 SummarizedExperiment_1.20.0 [120] openssl_1.4.6 withr_2.4.3 GenomicAlignments_1.26.0 Rsamtools_2.6.0 S4Vectors_0.28.1 GenomeInfoDbData_1.2.4 expm_0.999-6 [127] parallel_4.0.3 hms_1.1.1 grid_4.0.3 rpart_4.1-15 class_7.3-18 MatrixGenerics_1.2.1 biovizBase_1.38.0 [134] Biobase_2.50.0 lubridate_1.8.0 base64enc_0.1-3 r$>
Thanks for letting me know about this! I will look into this along with RajLabMSSM/echofinemap#11. It's possible this relates to version differences in COJO. WHen I wrote this code I was using: gcta_1.92.1beta5_mac
@AMCalejandro I've rewritten and just pushed much of the COJO code to make it easier to run and understand.
Try reinstalling echofinemap
and check out the documentation ?echofinemap::COJO
.
It also automatically installs the GCTA-COJO executable (v1.93.2) via genetics.binaRies
, so you don't have to worry about version differences.
Hi,
I got the following error when trying to run COJO
I was looking at the code but I am not sure in which step "cojo.jma.cojo" file is generated. Does it come from running COJO? In that case, it may be solved with RajLabMSSM/echofinemap#11