1. Bug description
echolocatoR fails to find the Python module scipy. I get the error "ModuleNotFoundError: No module named 'scipy'" when finemap_loci() reaches the linkage disequilibrium (LD) extraction step (Step 2 in the console output). I am running the latest version of echolocatoR, v2.0.3. Any advice would be much appreciated!
2. Reproducible example
Code
# Map column names in summary stats
columnsnames <- echodata::construct_colmap(
  munged = FALSE,
  CHR = "CHR", POS = "POS",
  SNP = "SNP", P = "pvalue",
  Effect = "beta", StdErr = "SE",
  A1 = "A1", A2 = "A2", Freq = "freq",
  N = "N")
# Fine mapping
results <- echolocatoR::finemap_loci(
  topSNPs = topSNPs,
  loci = topSNPs$Locus,
  LD_reference = "UKB", # use UK Biobank as the LD reference panel
  dataset_name = "mortality_GWAS",
  fullSS_genome_build = "hg19",
  case_control = FALSE,
  finemap_methods = c("ABF", "SUSIE", "FINEMAP"),
  force_new_subset = TRUE,
  force_new_LD = TRUE,
  force_new_finemap = TRUE,
  # SNP filters
  bp_distance = 1000000, # distance on each side of the lead SNP (+/- 1 Mb, i.e. a 2 Mb window, as in the query range logged below)
  min_MAF = 0.001,
  # Munge full sumstats first
  munged = FALSE,
  fullSS_path = "/Users/manuela/Documents/Work/survival_GWAS/echolocatoR/summary_stats/mortalityGWAS_summaryStats_forMunge.txt",
  colmap = columnsnames,
  # Plot options
  plot_types = c("fancy"), # in addition to the GWAS and fine-mapping tracks, plot annotation tracks (XGR, Roadmap, Nott2019)
  show_plot = TRUE,
  zoom = c("1x", "4x", "10x"))
Console output
[1] "+ Assigning Gene and Locus independently."
Standardising column headers.
First line of summary statistics file:
SNP CHR POS P Effect StdErr Freq A1 A2 N Locus Gene
Returning unmapped column names without making them uppercase.
+ Mapping colnames from MungeSumstats ==> echolocatoR
┌────────────────────────────────────────────┐
│ │
│ )))> 🦇 ANKRD55 [locus 1 / 10] 🦇 <((( │
│ │
└────────────────────────────────────────────┘
──────────────────────────────────────────────────────────────────────────────────
── Step 1 ▶▶▶ Query 🔎 ───────────────────────────────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────
+ Query Method: tabix
Constructing GRanges query using min/max ranges within a single chromosome.
query_dat is already a GRanges object. Returning directly.
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Explicit format: 'table'
Inferring comment_char from tabular header: 'CHR'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with grep.
( grep ^'CHR' .../mortalityGWAS_summaryStats_forMunge.txt; grep
-v ^'CHR' .../mortalityGWAS_summaryStats_forMunge.txt | sort
-k1,1n
-k2,2n ) > .../file16c12683ada49_sorted.tsv
Constructing outputs
Using existing bgzipped file: /Users/manuela/Documents/Work/survival_GWAS/echolocatoR/summary_stats/mortalityGWAS_summaryStats_forMunge.txt.bgz
Set force_new=TRUE to override this.
Tabix-indexing file using: Rsamtools
Data successfully converted to bgzip-compressed, tabix-indexed format.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Converting query results to data.table.
Processing query: 5:54533638-56533638
Adding 'query' column to results.
Retrieved data with 6,058 rows
Saving query ==> /var/folders/gs/pbd9rgqs6jg963j_g70phh3h0000gn/T//RtmpsoB6CR/results/GWAS/mortality_GWAS/ANKRD55/ANKRD55_mortality_GWAS_subset.tsv.gz
+ Query: 6,058 SNPs x 15 columns.
Standardizing summary statistics subset.
Standardizing main column names.
++ Preparing A1,A1 cols
++ Preparing MAF,Freq cols.
++ Inferring MAF from frequency column.
++ Removing SNPs with MAF== 0 | NULL | NA or >1.
++ Preparing N_cases,N_controls cols.
++ Preparing proportion_cases col.
++ proportion_cases not included in data subset.
Preparing sample size column (N).
Using existing 'N' column.
+ Imputing t-statistic from Effect and StdErr.
+ leadSNP missing. Assigning new one by min p-value.
++ Ensuring Effect,StdErr,P are numeric.
++ Ensuring 1 SNP per row and per genomic coordinate.
++ Removing extra whitespace
+ Standardized query: 6,058 SNPs x 18 columns.
++ Saving standardized query ==> /var/folders/gs/pbd9rgqs6jg963j_g70phh3h0000gn/T//RtmpsoB6CR/results/GWAS/mortality_GWAS/ANKRD55/ANKRD55_mortality_GWAS_subset.tsv.gz
──────────────────────────────────────────────────────────────────────────────────
── Step 2 ▶▶▶ Extract Linkage Disequilibrium 🔗 ──────────────────────────────────
──────────────────────────────────────────────────────────────────────────────────
LD_reference identified as: ukb.
Using UK Biobank LD reference panel.
+ UKB LD file name: chr5_54000001_57000001
Downloading full .gz/.npz UKB files and saving to disk.
echoconda:: conda already installed.
Retrieving conda env name from yaml: echoR_mini
echoconda:: Conda environment already exists: echoR_mini
Searching for 1 package(s) across 1 conda environment(s).
Listing all packages in environment: echoR_mini
1 unique package(s) found across 1 conda environment(s).
Downloading with axel [1 thread(s)]: https://data.broadinstitute.org/alkesgroup/UKBB_LD/chr5_54000001_57000001.gz ==> /var/folders/gs/pbd9rgqs6jg963j_g70phh3h0000gn/T//RtmpsoB6CR/results/GWAS/mortality_GWAS/ANKRD55/LD/chr5_54000001_57000001.gz
+ Overwriting pre-existing file.
axel download successful.
Time difference of 4.1 secs
echoconda:: conda already installed.
Retrieving conda env name from yaml: echoR_mini
echoconda:: Conda environment already exists: echoR_mini
Searching for 1 package(s) across 1 conda environment(s).
Listing all packages in environment: echoR_mini
1 unique package(s) found across 1 conda environment(s).
Downloading with axel [1 thread(s)]: https://data.broadinstitute.org/alkesgroup/UKBB_LD/chr5_54000001_57000001.npz ==> /var/folders/gs/pbd9rgqs6jg963j_g70phh3h0000gn/T//RtmpsoB6CR/results/GWAS/mortality_GWAS/ANKRD55/LD/chr5_54000001_57000001.npz
+ Overwriting pre-existing file.
axel download successful.
Time difference of 15.7 secs
ModuleNotFoundError: No module named 'scipy'
Locus ANKRD55 complete in: 1.28 min
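A possible diagnostic, not a confirmed fix: the log above shows the conda environment echoR_mini being used, so one way to narrow this down is to check whether scipy can be imported from that environment. The sketch below assumes that the reticulate package is installed and is the R-to-Python bridge in use; the helper name check_py_module is hypothetical and is not part of echolocatoR.

```r
# Sketch: check whether a Python module is importable from a conda environment.
# Assumptions: 'reticulate' is installed, and "echoR_mini" (the name taken from
# the console log above) is the environment the failing step runs in.
# check_py_module() is a hypothetical helper, not an echolocatoR function.
check_py_module <- function(module = "scipy", envname = "echoR_mini") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("The 'reticulate' package is required for this check.")
  }
  # Bind this R session to the named conda environment. This only takes
  # effect before Python has been initialised, so use a fresh R session.
  reticulate::use_condaenv(envname, required = TRUE)
  # Print which Python binary and libraries are actually in use.
  print(reticulate::py_config())
  # TRUE if the module can be imported from that interpreter.
  reticulate::py_module_available(module)
}

# Usage (in a fresh R session):
# check_py_module("scipy")
```

If the check returns FALSE, installing the module into that environment, for example with reticulate::conda_install(envname = "echoR_mini", packages = "scipy"), may be worth trying; whether that resolves the error in this setup is untested.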
3. Session info