RajLabMSSM / echofinemap

echoverse module: Statistical and functional fine-mapping functions.

finemap_loci failing #17

Open AMCalejandro opened 1 year ago


1. Bug description

Hi, I am unable to run finemap_loci properly with a quantitative-trait GWAS.

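A few things stand out in the console output below: ABF is skipped because N, MAF, and proportion_cases are missing; FINEMAP fails because the master file has no n_samples entry; and SUSIE stops with sample_size=NULL. All three loci are returned as NULL.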

2. Reproducible example

Code

columnsnames <- echodata::construct_colmap(
  munged = FALSE,
  CHR = "CHR", POS = "POS",
  SNP = "SNP", P = "P",
  Effect = "BETA", StdErr = "SE",
  A1 = "A1", A2 = "A2",
  N_cases = "N_CAS", MAF = "FREQ",
  tstat = NULL, N_controls = NULL,
  proportion_cases = NULL
)

finemap_loci(
  # GENERAL ARGUMENTS
  topSNPs = topSNPs,
  results_dir = fullRS_path,
  loci = topSNPs$Locus,
  dataset_name = "LID_COX",
  dataset_type = "GWAS",
  force_new_subset = TRUE,
  force_new_LD = FALSE,
  force_new_finemap = TRUE,
  remove_tmps = FALSE,

  finemap_methods = c("ABF", "FINEMAP", "SUSIE", "POLYFUN_SUSIE"),

  # Munge full sumstats first
  munged = FALSE,
  colmap = columnsnames,

  # SUMMARY STATS ARGUMENTS
  fullSS_path = newSS_name,
  fullSS_genome_build = "hg19",
  query_by = "tabix",

  bp_distance = 10000, # 500000*2,
  min_MAF = 0.001,
  trim_gene_limits = FALSE,

  case_control = FALSE,

  # FINE-MAPPING ARGUMENTS
  ## General
  n_causal = 5,
  credset_thresh = .95,
  consensus_thresh = 2,

  # LD ARGUMENTS
  LD_reference = "1KGphase3", # "UKB",
  superpopulation = "EUR",
  download_method = "axel",
  LD_genome_build = "hg19",
  leadSNP_LD_block = FALSE,

  #### Plotting args ####
  plot_types = c("simple"),
  show_plot = TRUE,
  zoom = "1x",
  tx_biotypes = NULL,
  nott_epigenome = FALSE,
  nott_show_placseq = FALSE,
  nott_binwidth = 200,
  nott_bigwig_dir = NULL,
  xgr_libnames = NULL,
  roadmap = FALSE,
  roadmap_query = NULL,

  #### General args ####
  seed = 2022,
  nThread = 20,
  verbose = TRUE
)
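
One thing that may be worth ruling out before the call: the column names passed to construct_colmap() include POS, while the head(data_2) output in the Data section shows a BP column instead. The log shows the tabix query succeeding, so the file on disk may well differ from data_2; the sketch below only illustrates a pre-flight check under the assumption that data_2 reflects the file header. `unmatched_columns()` is a hypothetical helper written for this illustration and is not part of echodata or echolocatoR.

```r
# Header names as shown in head(data_2) in this report.
file_header <- c("CHR", "BP", "SNP", "A1", "A2", "FREQ",
                 "BETA", "SE", "P", "N_CAS")

# Non-NULL names passed to construct_colmap() in the example above:
# standardized name -> header expected in the file.
mapped <- c(CHR = "CHR", POS = "POS", SNP = "SNP", P = "P",
            Effect = "BETA", StdErr = "SE", A1 = "A1", A2 = "A2",
            N_cases = "N_CAS", MAF = "FREQ")

# Hypothetical helper: return the mapped names that are absent from the header.
unmatched_columns <- function(mapped, header) {
  mapped[!mapped %in% header]
}

unmatched <- unmatched_columns(mapped, file_header)
print(unmatched)
```

If the check returns a non-empty vector, the corresponding colmap entries point at headers that do not exist in that table.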

Console output


PolyFun submodule already installed.
┌─────────────────────────────────────────────────┐
│                                                 │
│   )))> 🦇 RP11-240A16.1 [locus 1 / 3] 🦇 <(((   │
│                                                 │
└─────────────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'CHR'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'CHR' .../QC_V2.txt; grep
    -v ^'CHR' .../QC_V2.txt | sort
    -k1,1n
    -k2,2n ) > .../file2efc11009c2a_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_V2.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 4:32425284-32445284
Adding 'query' column to results.
Retrieved data with 76 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/RP11-240A16.1_LID_COX_subset.tsv.gz
+ Query: 76 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 76 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/RP11-240A16.1_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/LD/RP11-240A16.1.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 76 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 76 SNPs.
+ dat = 76 SNPs.
+ 76 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: N, MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF, N
SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for SUSIE: N
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF, N
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 76 SNPs.
+ dat = 76 SNPs.
+ 76 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../RP11-240A16.1 &&
    .../finemap_v1.4.1_x86_64

    --sss

    --in-files .../master

    --log

    --n-threads 20

    --n-causal-snps 5
Error : Master file '/home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/RP11-240A16.1/FINEMAP/master' is missing an entry in line 2 column 'n_samples'!

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - finemap@christianbenner.com        |
| - matti.pirinen@helsinki.fi          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
sample_size=NULL: must be valid integer.
Locus RP11-240A16.1 complete in: 0.26 min
┌─────────────────────────────────────────┐
│                                         │
│   )))> 🦇 XYLT1 [locus 2 / 3] 🦇 <(((   │
│                                         │
└─────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'CHR'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'CHR' .../QC_V2.txt; grep
    -v ^'CHR' .../QC_V2.txt | sort
    -k1,1n
    -k2,2n ) > .../file2efc3ee606a8_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_V2.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 16:17034975-17054975
Adding 'query' column to results.
Retrieved data with 82 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/XYLT1_LID_COX_subset.tsv.gz
+ Query: 82 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 80 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/XYLT1_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/LD/XYLT1.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 79 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 79 SNPs.
+ dat = 79 SNPs.
+ 79 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: N, MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF, N
SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for SUSIE: N
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF, N
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 79 SNPs.
+ dat = 79 SNPs.
+ 79 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../XYLT1 &&
    .../finemap_v1.4.1_x86_64

    --sss

    --in-files .../master

    --log

    --n-threads 20

    --n-causal-snps 5
Error : Master file '/home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/XYLT1/FINEMAP/master' is missing an entry in line 2 column 'n_samples'!

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - finemap@christianbenner.com        |
| - matti.pirinen@helsinki.fi          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
In addition: Warning message:
In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
sample_size=NULL: must be valid integer.
Locus XYLT1 complete in: 0.3 min
┌────────────────────────────────────────┐
│                                        │
│   )))> 🦇 LRP8 [locus 3 / 3] 🦇 <(((   │
│                                        │
└────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 1 ▶▶▶ Query 🔎 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'CHR'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'CHR' .../QC_V2.txt; grep
    -v ^'CHR' .../QC_V2.txt | sort
    -k1,1n
    -k2,2n ) > .../file2efc33368771_sorted.tsv
Constructing outputs
Using existing bgzipped file: /home/rstudio/echolocatoR/echolocatoR_LID/QC_V2.txt.bgz 
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 1:53768300-53788300
Adding 'query' column to results.
Retrieved data with 52 rows
Saving query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LRP8_LID_COX_subset.tsv.gz
+ Query: 52 SNPs x 10 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Could not infer MAF.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 52 SNPs x 12 columns.
++ Saving standardized query ==> /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LRP8_LID_COX_subset.tsv.gz

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: 1kg.
Previously computed LD_matrix detected. Importing: /home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/LD/LRP8.1KGphase3_LD.RDS
LD_reference identified as: r.
Converting obj to sparseMatrix.
+ FILTER:: Filtering by LD features.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 3 ▶▶▶ Filter SNPs 🚰 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
FILTER:: Filtering by SNP features.
+ FILTER:: Post-filtered data: 51 x 12
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 51 SNPs.
+ dat = 51 SNPs.
+ 51 SNPs in common.
Converting obj to sparseMatrix.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 4 ▶▶▶ Fine-map 🔊 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Gathering method sources.
Gathering method citations.
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
Gathering method sources.
Gathering method citations.
Gathering method sources.
Gathering method citations.
ABF
🚫 Missing required column(s) for ABF [skipping]: N, MAF, proportion_cases
FINEMAP
✅ All required columns present.
⚠ Missing optional column(s) for FINEMAP: MAF, N
SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for SUSIE: N
POLYFUN_SUSIE
✅ All required columns present.
⚠ Missing optional column(s) for POLYFUN_SUSIE: MAF, N
++ Fine-mapping using 3 tool(s): FINEMAP, SUSIE, POLYFUN_SUSIE

+++ Multi-finemap:: FINEMAP +++
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
+ Subsetting LD matrix and dat to common SNPs...
Removing unnamed rows/cols
Replacing NAs with 0
+ LD_matrix = 51 SNPs.
+ dat = 51 SNPs.
+ 51 SNPs in common.
Converting obj to sparseMatrix.
Constructing master file.
Optional MAF col missing. Replacing with all '.1's
Constructing data.z file.
Constructing data.ld file.
FINEMAP path: /home/rstudio/.cache/R/echofinemap/FINEMAP/finemap_v1.4.1_x86_64/finemap_v1.4.1_x86_64
Inferred FINEMAP version: 1.4.1
Running FINEMAP.
cd .../LRP8 &&
    .../finemap_v1.4.1_x86_64

    --sss

    --in-files .../master

    --log

    --n-threads 20

    --n-causal-snps 5
Error : Master file '/home/rstudio/echolocatoR/echolocatoR_LID/RESULTS/GWAS/LID_COX/LRP8/FINEMAP/master' is missing an entry in line 2 column 'n_samples'!

|--------------------------------------|
| Welcome to FINEMAP v1.4.1            |
|                                      |
| (c) 2015-2022 University of Helsinki |
|                                      |
| Help :                               |
| - ./finemap --help                   |
| - www.finemap.me                     |
| - www.christianbenner.com            |
|                                      |
| Contact :                            |
| - finemap@christianbenner.com        |
| - matti.pirinen@helsinki.fi          |
|--------------------------------------|

--------
SETTINGS
--------
- dataset            : all
- corr-config        : 0.95
- n-causal-snps      : 5
- n-configs-top      : 50000
- n-conv-sss         : 100
- n-iter             : 100000
- n-threads          : 20
- prior-k0           : 0
- prior-std          : 0.05 
- prob-conv-sss-tol  : 0.001
- prob-cred-set      : 0.95

+++ Multi-finemap:: SUSIE +++
Loading required namespace: Rfast
Failed with error:  'there is no package called 'Rfast''
In addition: Warning message:
In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')
Preparing sample size column (N).
WARNING: Neff column could not be calculated as the columns N_CAS & N_CON were not found in the datset
+ Mapping colnames from MungeSumstats ==> echolocatoR
sample_size=NULL: must be valid integer.
Locus LRP8 complete in: 0.26 min

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

── Step 6 ▶▶▶ Postprocess data 🎁 ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Returning results as nested list.
All loci done in: 0.81 min
$`RP11-240A16.1`
NULL

$XYLT1
NULL

$LRP8
NULL

$merged_dat
Null data.table (0 rows and 0 cols)

Warning message:
In SUSIE(dat = dat, dataset_type = dataset_type, LD_matrix = LD_matrix,  :
  Install Rfast to speed up susieR even further:
   install.packages('Rfast')
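
For context on the FINEMAP error above: the FINEMAP master file is a semicolon-separated table whose header, per the FINEMAP documentation, is `z;ld;snp;config;cred;log;n_samples`, and the error says the n_samples field of the data row is empty. The sketch below shows what a complete row looks like and how an empty sample size could be caught before FINEMAP is launched. `make_master()` is a hypothetical helper, not the function echofinemap uses, and 2687 is only a placeholder taken from the largest N_CAS value visible in head(data_2), not a confirmed study sample size.

```r
# Hypothetical helper: build the two lines of a FINEMAP master file.
# Column order: z;ld;snp;config;cred;log;n_samples
make_master <- function(prefix, n_samples) {
  # Refuse to write a row with a missing sample size, which is the
  # condition FINEMAP complains about in the log above.
  if (is.null(n_samples) || is.na(n_samples) || n_samples <= 0) {
    stop("n_samples must be a positive number")
  }
  header <- "z;ld;snp;config;cred;log;n_samples"
  fields <- c(
    paste0(prefix, c(".z", ".ld", ".snp", ".config", ".cred", ".log")),
    as.character(as.integer(n_samples))
  )
  c(header, paste(fields, collapse = ";"))
}

# Placeholder sample size (assumption, see the note above).
master <- make_master("data", n_samples = 2687)

# Write to a temporary file and read it back to inspect the result.
path <- tempfile("master_")
writeLines(master, path)
print(readLines(path))
```

Calling `make_master("data", NULL)` stops with an error instead of writing an incomplete row, which surfaces the missing sample size earlier than the FINEMAP run does.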

Data

> head(data_2)
   CHR     BP         SNP A1 A2   FREQ    BETA     SE      P N_CAS
1:   1 731718  rs58276399  t  c 0.8837 -0.1775 0.1583 0.2621  1297
2:   1 731718 rs142557973  t  c 0.8837 -0.1775 0.1583 0.2621  1297
3:   1 734349 rs141242758  t  c 0.8843 -0.1577 0.1593 0.3223  1297
4:   1 753541   rs2073813  a  g 0.1257  0.0721 0.1177 0.5399  2687
5:   1 766007  rs61768174  a  c 0.9005 -0.2559 0.1642 0.1190  1297
6:   1 769223  rs60320384  c  g 0.8749 -0.0772 0.1178 0.5124  2687

> head(topSNPs)
# A tibble: 3 × 7
  Locus         Gene          CHR        POS SNP                     P  BETA
  <chr>         <chr>         <fct>    <int> <chr>               <dbl> <dbl>
1 RP11-240A16.1 RP11-240A16.1 4     32435284 rs189093213 0.00000000167  1.12
2 XYLT1         XYLT1         16    17044975 rs180924818 0.00000000626 -1.14
3 LRP8          LRP8          1     53778300 rs72673189  0.0000000153   1.02
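
Since the log reports N and MAF as missing, one possible workaround is to derive them from the columns already in the table before fine-mapping. This is a minimal sketch in base R using the six rows shown above. It rests on two assumptions: that N_CAS holds the per-SNP total sample size for this quantitative trait, and that FREQ is an allele frequency, so the minor allele frequency is the smaller of FREQ and 1 - FREQ. Whether echolocatoR picks up a precomputed N column, or instead needs a sample size passed directly (as the `sample_size=NULL` message hints), should be checked against the package documentation.

```r
# The six rows shown in head(data_2), restricted to the columns used here.
dat <- data.frame(
  SNP   = c("rs58276399", "rs142557973", "rs141242758",
            "rs2073813", "rs61768174", "rs60320384"),
  FREQ  = c(0.8837, 0.8837, 0.8843, 0.1257, 0.9005, 0.8749),
  BETA  = c(-0.1775, -0.1775, -0.1577, 0.0721, -0.2559, -0.0772),
  SE    = c(0.1583, 0.1583, 0.1593, 0.1177, 0.1642, 0.1178),
  N_CAS = c(1297, 1297, 1297, 2687, 1297, 2687)
)

# Assumption: for a quantitative trait, N_CAS is the total sample size per SNP.
dat$N <- dat$N_CAS

# Minor allele frequency: fold frequencies above 0.5 back below 0.5.
dat$MAF <- pmin(dat$FREQ, 1 - dat$FREQ)

# t-statistic from effect size and standard error, as the log describes
# ("Imputing t-statistic from Effect and StdErr").
dat$tstat <- dat$BETA / dat$SE

print(dat[, c("SNP", "N", "MAF", "tstat")])
```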

3. Session info

```
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] SNPlocs.Hsapiens.dbSNP144.GRCh37_0.99.20 BSgenome_1.65.2 rtracklayer_1.57.0
 [4] Biostrings_2.65.3 XVector_0.37.1 GenomicRanges_1.49.1
 [7] GenomeInfoDb_1.33.5 IRanges_2.31.2 S4Vectors_0.35.3
[10] BiocGenerics_0.43.1 forcats_0.5.2 stringr_1.4.1
[13] dplyr_1.0.10 purrr_0.3.4 readr_2.1.2
[16] tidyr_1.2.0 tibble_3.1.8 ggplot2_3.3.6
[19] tidyverse_1.3.2 data.table_1.14.2 echolocatoR_2.0.1

loaded via a namespace (and not attached):
  [1] Hmisc_4.7-1 class_7.3-20 ps_1.7.1
  [4] Rsamtools_2.13.4 rprojroot_2.0.3 echotabix_0.99.8
  [7] crayon_1.5.1 MASS_7.3-58.1 nlme_3.1-159
 [10] backports_1.4.1 reprex_2.0.2 basilisk_1.9.2
 [13] rlang_1.0.5 readxl_1.4.1 irlba_2.3.5
 [16] nloptr_2.0.3 callr_3.7.2 limma_3.53.6
 [19] filelock_1.0.2 proto_1.0.0 BiocParallel_1.31.12
 [22] rjson_0.2.21 bit64_4.0.5 glue_1.6.2
 [25] mixsqp_0.3-43 parallel_4.2.0 processx_3.7.0
 [28] AnnotationDbi_1.59.1 HGNChelper_0.8.1 haven_2.5.1
 [31] tidyselect_1.1.2 SummarizedExperiment_1.27.2 coloc_5.1.0
 [34] usethis_2.1.6 XML_3.99-0.10 ggpubr_0.4.0
 [37] GenomicAlignments_1.33.1 catalogueR_1.0.0 echoplot_0.99.5
 [40] chron_2.3-57 xtable_1.8-4 ggnetwork_0.5.10
 [43] magrittr_2.0.3 evaluate_0.16 cli_3.3.0
 [46] zlibbioc_1.43.0 rstudioapi_0.14 miniUI_0.1.1.1
 [49] rpart_4.1.16 echoannot_0.99.7 ensembldb_2.21.4
 [52] treeio_1.21.2 shiny_1.7.2 xfun_0.32
 [55] BSgenome.Hsapiens.1000genomes.hs37d5_0.99.1 pkgbuild_1.3.1 cluster_2.1.3
 [58] echoconda_0.99.7 KEGGREST_1.37.3 interactiveDisplayBase_1.35.0
 [61] expm_0.999-6 ggrepel_0.9.1 SNPlocs.Hsapiens.dbSNP155.GRCh37_0.99.22
 [64] biovizBase_1.45.0 ape_5.6-2 echodata_0.99.12
 [67] png_0.1-7 reshape_0.8.9 withr_2.5.0
 [70] bitops_1.0-7 RBGL_1.73.0 plyr_1.8.7
 [73] cellranger_1.1.0 AnnotationFilter_1.21.0 e1071_1.7-11
 [76] pillar_1.8.1 cachem_1.0.6 GenomicFeatures_1.49.6
 [79] fs_1.5.2 googleAuthR_2.0.0 echoLD_0.99.7
 [82] osfr_0.2.8 snpStats_1.47.1 vctrs_0.4.1
 [85] ellipsis_0.3.2 generics_0.1.3 gsubfn_0.7
 [88] devtools_2.4.4 tools_4.2.0 foreign_0.8-82
 [91] munsell_0.5.0 susieR_0.12.27 proxy_0.4-27
 [94] DelayedArray_0.23.1 abind_1.4-5 fastmap_1.1.0
 [97] compiler_4.2.0 pkgload_1.3.0 httpuv_1.6.5
[100] ExperimentHub_2.5.0 sessioninfo_1.2.2 ewceData_1.5.0
[103] plotly_4.10.0 DescTools_0.99.46 GenomeInfoDbData_1.2.8
[106] gridExtra_2.3 lattice_0.20-45 dir.expiry_1.5.0
[109] deldir_1.0-6 utf8_1.2.2 later_1.3.0
[112] BiocFileCache_2.5.0 jsonlite_1.8.0 GGally_2.1.2
[115] scales_1.2.1 gld_2.6.5 graph_1.75.0
[118] tidytree_0.4.0 carData_3.0-5 lazyeval_0.2.2
[121] promises_1.2.0.1 car_3.1-0 RCircos_1.2.2
[124] latticeExtra_0.6-30 R.utils_2.12.0 reticulate_1.26
[127] checkmate_2.1.0 rmarkdown_2.16 openxlsx_4.2.5
[130] dichromat_2.0-0.1 Biobase_2.57.1 igraph_1.3.4
[133] survival_3.3-1 yaml_2.3.5 htmltools_0.5.3
[136] memoise_2.0.1 VariantAnnotation_1.43.3 profvis_0.3.7
[139] BiocIO_1.7.1 supraHex_1.35.0 viridisLite_0.4.1
[142] digest_0.6.29 assertthat_0.2.1 mime_0.12
[145] piggyback_0.1.3 rappdirs_0.3.3 dnet_1.1.7
[148] downloadR_0.99.4 RSQLite_2.2.16 sqldf_0.4-11
[151] yulab.utils_0.0.5 Exact_3.1 remotes_2.4.2
[154] orthogene_1.3.2 urlchecker_1.0.1 blob_1.2.3
[157] R.oo_1.25.0 splines_4.2.0 Formula_1.2-4
[160] googledrive_2.0.0 AnnotationHub_3.5.0 OrganismDbi_1.39.1
[163] ProtGenerics_1.29.0 RCurl_1.98-1.8 broom_1.0.1
[166] hms_1.1.2 gprofiler2_0.2.1 modelr_0.1.9
[169] colorspace_2.0-3 base64enc_0.1-3 BiocManager_1.30.18
[172] aplot_0.1.6 echofinemap_0.99.3 nnet_7.3-17
[175] Rcpp_1.0.9 mvtnorm_1.1-3 fansi_1.0.3
[178] tzdb_0.3.0 brio_1.1.3 R6_2.5.1
[181] grid_4.2.0 crul_1.2.0 lifecycle_1.0.1
[184] rootSolve_1.8.2.3 zip_2.2.0 MungeSumstats_1.5.13
[187] ggsignif_0.6.3 curl_4.3.2 googlesheets4_1.0.1
[190] minqa_1.2.4 testthat_3.1.4 XGR_1.1.8
[193] Matrix_1.4-1 desc_1.4.1 ggbio_1.45.0
[196] RColorBrewer_1.1-3 htmlwidgets_1.5.4 biomaRt_2.53.2
[199] gridGraphics_0.5-1 MAGMA.Celltyping_2.0.7 rvest_1.0.3
[202] lmom_2.9 htmlTable_2.4.1 patchwork_1.1.2
[205] codetools_0.2-18 matrixStats_0.62.0 lubridate_1.8.0
[208] EWCE_1.5.7 prettyunits_1.1.1 SingleCellExperiment_1.19.0
[211] dbplyr_2.2.1 basilisk.utils_1.9.2 R.methodsS3_1.8.2
[214] gtable_0.3.1 DBI_1.1.3 ggfun_0.0.7
[217] httr_1.4.4 stringi_1.7.8 progress_1.2.2
[220] reshape2_1.4.4 viridis_0.6.2 hexbin_1.28.2
[223] Rgraphviz_2.41.1 ggtree_3.5.3 DT_0.24
[226] xml2_1.3.3 ggdendro_0.1.23 boot_1.3-28
[229] lme4_1.1-30 restfulr_0.0.15 RNOmni_1.0.1
[232] interp_1.1-3 ggplotify_0.1.0 homologene_1.4.68.19.3.27
[235] BiocVersion_3.16.0 bit_4.0.4 jpeg_0.1-9
[238] MatrixGenerics_1.9.1 babelgene_22.3 pkgconfig_2.0.3
[241] gargle_1.2.0 rstatix_0.7.0 knitr_1.40
```

bschilder commented 1 year ago

This is happening because Rfast is listed under Suggests in susieR, not Imports (even though it probably should be), so it is not installed automatically alongside susieR. A fix for this would be adding Rfast to the Imports of echofinemap, so that it gets installed automatically. Will implement this now.
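
Until that change is released, a possible workaround is to check for the missing suggested package yourself before calling `finemap_loci`. The snippet below is only a sketch: `missing_pkgs()` is a hypothetical helper (not part of echofinemap or susieR), and it assumes Rfast can be installed from CRAN in your environment.

```r
# Hypothetical helper (not an echofinemap function): return the subset of
# `pkgs` whose namespaces cannot be loaded in the current R library.
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Rfast is only suggested by susieR, so it may be absent after a normal install.
needed <- missing_pkgs("Rfast")
if (length(needed) > 0) {
  message("Missing package(s): ", paste(needed, collapse = ", "),
          ". Try: install.packages(c(",
          paste0('"', needed, '"', collapse = ", "), "))")
}
```

The helper only reports what is missing and leaves the installation to you, which avoids triggering a network install inside a script.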