When I ran MultiEWCE, only ~500/6700 phenotypes completed. The rest seem to have errored out.
The ones I inspected manually all seemed to fail with the "must have at least 4 genes" error. This is odd because I explicitly check beforehand which gene lists have enough genes (after taking the intersection with the CTD genes into account). MultiEWCE itself has an additional check for the number of valid gene lists, and even in cases where MultiEWCE determined the gene list was sufficient, EWCE would still return this error.
One of the issues can be traced back to an old version of EWCE:
I've just updated R to 4.3 in my "bioc" env and then updated all downstream packages. Will rerun the enrichment analyses and see if this helps.
Just noticed that this error randomly started appearing locally while I was running the enrichment tests on HPC.
I'm currently downloading and storing the CTD and HPO gene list separately for each job, to try to avoid any issues caused by jobs on multiple nodes reading in resources from the same file at the same time.
However, I think this introduced a second error: hitting the GitHub API download limit, since all the resources are stored on GitHub Releases via piggyback. This also explains why I didn't see this error in the earliest queries I checked manually after the conda fix.
Prioritising gene targets.
Adding term definitions.
Error in `value[[3L]]()`:
! Cannot access release data for repo "neurogenomics/HPOExplorer".
Check that you have provided a `.token` and that the repo is correctly specified.
GitHub API error (403): API rate limit exceeded for user ID 34280215. If you reach out to GitHub Support for help,
please include the request ID E118:AC81:6DA236:6EC564:65453130.
I could:
This could also have been due to some temporary GitHub outages that have been going on over the last hour or so.
Noticed another error on some runs:
Registered S3 method overwritten by 'ggtree':
method from
fortify.igraph ggnetwork
Validating gene lists..
1 / 1 gene lists are valid.
=>> PBS: job killed: mem 21688436kb exceeded limit 20971520kb
Easy fix: bump up the memory requested (currently only using 1 core and 20Gb per job).
Parallelising within R can cause issues because it copies the entire R environment for each thread. I think this is why we're so easily reaching memory limits on some jobs. Allocating 4 cores/job but turning OFF within-R parallelisation via MultiEWCE::gen_results(cores = 1) may help.
Once you've got this sussed, it's worth adding a test to catch related problems in the future.
Making all the updates seems to fix certain cases!
I just ran the following successfully, which was failing previously:
#!/usr/bin/env Rscript
library("optparse")
option_list <- list(
  optparse::make_option(c("-i", "--idx"), type="integer",
                        help="PBS_ARRAY_INDEX", metavar="character"),
  optparse::make_option(c("-n", "--ncpus"), type="integer", default=1,
                        help="Number of CPUs to use.", metavar="character"),
  optparse::make_option(c("-b", "--batches"), type="integer", default=7015,
                        help="Number of total batches.", metavar="character")
)
opt_parser <- optparse::OptionParser(option_list=option_list)
opt <- optparse::parse_args(opt_parser)
root <- "/rds/general/project/neurogenomics-lab/ephemeral/rare_disease"
library(MultiEWCE)
library(data.table)
save_dir <- file.path(root,paste0("BATCH_IDX_",opt$idx))
storage_dir <- file.path(save_dir,"resources")
ctd <- MultiEWCE::load_example_ctd(file = "ctd_DescartesHuman.rds",
save_dir = storage_dir)
#### Load and filter gene data ####
annotLevel <- 2
gene_data <- HPOExplorer::load_phenotype_to_genes(save_dir=storage_dir)
gene_data <- gene_data[gene_symbol %in% rownames(ctd[[annotLevel]]$mean_exp)]
gene_data[,n_gene:=(length(unique(gene_symbol))),by="hpo_id"]
gene_data <- gene_data[n_gene>=4,]
#### Split HPO IDs into N chunks ####
ids <- unique(gene_data$hpo_id)
chunks <- split(ids, cut(seq_along(ids),opt$batches,labels = FALSE))
#### Run enrichment analyses #####
all_results <- MultiEWCE::gen_results(
ctd = ctd,
list_name_column = "hpo_id",
list_names = chunks[[99]],
gene_data = gene_data,
annotLevel = annotLevel,
reps = 100,
cores = opt$ncpus,
force_new = TRUE,
save_dir = save_dir)
Validating gene lists..
1 / 1 gene lists are valid.
Analysing: 'HP:0008186' (1/1): 4 genes.
Computing gene counts.
Saving results ==> /rds/general/project/neurogenomics-lab/ephemeral/rare_disease/BATCH_IDX_/gen_results_2023-11-03_16-39-27.19555.rds
Just launched the PBS script; all 7015 valid phenotypes (those with enough genes) should be done in under an hour or so.
- Using the same resources for all jobs seems to help (see the sketch after this list).
- Bumped up to 4 cores and 90Gb per job, which seems to help, with not much difference in queuing time.
- Avoiding this didn't seem to make a difference in this case.
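A minimal sketch of the shared-resources setup (assumed paths; the loader calls are the same ones used in the batch script above):

```r
## Point every array job at one shared resources directory so the
## piggyback-backed downloads from GitHub Releases happen only once and
## later jobs simply reuse the cached files.
root <- "/rds/general/project/neurogenomics-lab/ephemeral/rare_disease"
storage_dir <- file.path(root, "resources")   # shared across all batch jobs
dir.create(storage_dir, showWarnings = FALSE, recursive = TRUE)
ctd <- MultiEWCE::load_example_ctd(file = "ctd_DescartesHuman.rds",
                                   save_dir = storage_dir)
gene_data <- HPOExplorer::load_phenotype_to_genes(save_dir = storage_dir)
```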
Still 203 phenotypes missing, all seemingly due to insufficient memory.
Going for a brute force approach and increasing memory to the max (before it goes to a different queue). Thanks to @Al-Murphy for sharing these med-bio queue specs.
#PBS -l walltime=36:00:00
#PBS -l select=1:ncpus=40:mem=128gb -q med-bio
#PBS -J 1-7015
module load anaconda3/personal
source activate bioc
cd $PBS_O_WORKDIR
Rscript /rds/general/project/neurogenomics-lab/live/Projects/rare_disease_ewce/pbs/rare_disease_celltyping.R -i $PBS_ARRAY_INDEX -n 1 -b 7015
Something I'm noticing about these 203 missing phenotypes is that they all seem to have fairly large gene lists. I'm not sure exactly where in MultiEWCE/EWCE this causes memory usage to increase so much, but I think it's the underlying reason for the failures:
r$> missing_dat
hpo_id hpo_name ncbi_gene_id gene_symbol disease_id n_gene batch_id
1: HP:0002650 Scoliosis 5290 PIK3CA ORPHA:276280 1047 4
2: HP:0002650 Scoliosis 5290 PIK3CA OMIM:612918 1047 4
3: HP:0002650 Scoliosis 5290 PIK3CA ORPHA:201 1047 4
4: HP:0002650 Scoliosis 5290 PIK3CA OMIM:615108 1047 4
5: HP:0002650 Scoliosis 26235 FBXL4 OMIM:615471 1047 4
---
396184: HP:0000818 Abnormality of the endocrine system 3559 IL2RA OMIM:606367 1365 6904
396185: HP:0000818 Abnormality of the endocrine system 3418 IDH2 ORPHA:163634 1365 6904
396186: HP:0000818 Abnormality of the endocrine system 3417 IDH1 ORPHA:163634 1365 6904
396187: HP:0000818 Abnormality of the endocrine system 54820 NDE1 ORPHA:2177 1365 6904
396188: HP:0000818 Abnormality of the endocrine system 57192 MCOLN1 OMIM:252650 1365 6904
r$> unique(missing_dat[,c("hpo_name","n_gene")])$n_gene|>summary()
Min. 1st Qu. Median Mean 3rd Qu. Max.
736 1073 1378 1648 1957 4887
I've further reduced the memory usage by subsetting the CTD and gene data objects (the two largest objects in each R session) to the bare minimum information.
We'll see how much this helps get results for all phenotypes once HPC finishes the array job (currently 4499/7015 phenotypes complete).
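For illustration, a rough sketch of that kind of subsetting (assumptions: the CTD behaves as a plain list indexed by annotation level, and only hpo_id/gene_symbol are needed downstream; the actual code may differ):

```r
annotLevel <- 2
ctd_small <- ctd[annotLevel]   # a list containing only the level being tested
gene_data_small <- gene_data[
  gene_symbol %in% rownames(ctd[[annotLevel]]$mean_exp),
  c("hpo_id", "gene_symbol")]  # keep only the columns needed downstream
## gen_results() would then be called with ctd = ctd_small and annotLevel = 1
```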
All subjobs are now finished, but 99 phenotypes still errored out due to insufficient memory (even after optimising memory usage and with 128Gb of memory!). Will try using a different high-memory queue for these remaining phenotypes.
I've also added a function to MultiEWCE that lets it query gprofiler via orthogene once and use cached results thereafter (sketched below).
See here for a full discussion on the choice of background genes:
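As a rough illustration of the caching idea (the helper name and cache path are hypothetical, and orthogene::all_genes is used here only as an example of a gprofiler-backed query; this is not the actual MultiEWCE implementation):

```r
## Hypothetical caching wrapper: run the gprofiler-backed query once, save it
## to disk, and reuse the cached copy in every subsequent job instead of
## hitting the API each time.
get_background_cached <- function(species = "human",
                                  cache_file = "background_genes.rds") {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))                  # reuse cached result
  }
  bg <- orthogene::all_genes(species = species)  # queries gprofiler once
  saveRDS(bg, cache_file)
  bg
}
```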
I submitted a PBS job yesterday afternoon requesting the large memory nodes. As of today...it's still queued. @Al-Murphy warned me about this, that requesting large mem nodes on HPC can take forever...
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=64:mem=230gb
#PBS -J 1-4
module load anaconda3/personal
source activate bioc
cd $PBS_O_WORKDIR
Rscript /rds/general/project/neurogenomics-lab/live/Projects/rare_disease_ewce/pbs/rare_disease_celltyping.R -i $PBS_ARRAY_INDEX -n 1 -b 4
In the meantime, I'm going to try running these remaining phenotypes on the Threadripper, as it has 252Gb memory which should be sufficient to complete these final 99 phenotypes with large gene lists.
Side note: for the remaining 99 phenotypes, I wanted to see what levels of the HPO ontology they tended to be at. While there are many high-level terms ("phenotypic abnormality", "Abnormality of the nervous system"), the levels span from 0-16, showing that it's not just the highest-level phenotypes that we're missing.
How much RAM did the previous machines have? Had no idea EWCE could use that much memory
> How much RAM did the previous machines have? Had no idea EWCE could use that much memory
@NathanSkene by "previous machines" do you mean when Bobby ran it? I don't know how he ran it, but I can ask.
I was also surprised by this. I still haven't narrowed down where it's coming from (MultiEWCE, EWCE, my PBS script, or some HPC peculiarity).
I meant, the machines that crashed giving memory errors. Did the same jobs crash consistently?
> I meant, the machines that crashed giving memory errors. Did the same jobs crash consistently?
Yes, it was very consistently the same phenotypes with the same exact error about going over the 128Gb memory limit
ephemeral <- "/rds/general/project/neurogenomics-lab/ephemeral/rare_disease/"
ephemeral_logs <- file.path(dirname(ephemeral),"rare_disease.pbs_output")
missing_dat <- readRDS(here::here("pbs/missing_dat.rds"))
## Grab the most recent PBS job ID from the newest log file names
logs <- system(paste("ls",ephemeral_logs,"-Artlsh | tail -10"), intern = TRUE)
jobID <- gsub("[a-z]","",rev(strsplit(tail(logs,1),"\\.")[[1]])[2])
## Construct the expected error-log path for each missing batch,
## keeping only those that exist on disk
logs_df <- data.table::data.table(batch_id = unique(missing_dat$batch_id))[,
  path := file.path(ephemeral_logs,
                    paste("rare_disease_celltyping.pbs",
                          paste0("e",jobID), batch_id, sep="."))
][file.exists(path),]
## Print the logs for each batch
out <- lapply(logs_df$path, function(x){
  message("\n~~~~~~~~~~", basename(x), "~~~~~~~~~~")
  readLines(x) |> cat(sep = "\n")
})
I have these 99 phenotypes running on the Threadripper right now, parallelised across 20/64 cores. It seems to be working fine there. If for some reason running these in parallel does prove too much (still waiting for it to complete), I can make everything single-threaded to give all 252Gb of memory to each subjob.
On the Threadripper, this comes to 252Gb / 20 jobs ≈ 12.6Gb of memory per subjob. Far less than the 128Gb/subjob I was using on HPC! This leads me to suspect something weird is going on with HPC.
It's possible the memory problems on HPC were due to:
- MultiEWCE/EWCE
Spoke too soon: all the subjobs on the Threadripper crashed after hitting the shared memory limit. Usage was relatively low at first (~113Gb) but then climbed to 250Gb+ before crashing.
Rerunning now single-threaded to reduce memory load.
While that's running, I'm going to launch the Human Cell Landscape round of analyses on HPC!
OK, so I was able to run the remaining phenotypes, but only by reducing the number of iterations from 100k to 10k. While I don't recommend using an inconsistent number of iterations, this gives me a clue about what might be happening.
One of the upgrades I made in EWCE 1.7.3 was storing the bootstrapped data for every iteration. It's possible that this becomes too much data to keep in memory with 100k iterations and a very large hit gene list.
- Return gene-level scores based on adaptation of code from generate_bootstrap_plots. Now stored as a list element named gene_data in data.table format.

https://github.com/NathanSkene/EWCE/blob/master/NEWS.md#ewce-173
I'll test this by removing the stored bootstrapped data, but I will need to make some mods in EWCE to enable this feature.
Side note: the high-mem jobs on HPC are still queued... after waiting for over 4 days... 😑
> One of the upgrades I made in EWCE 1.7.3 was storing the bootstrapped data for every iteration. It's possible that this becomes too much data to keep in memory with 100k iterations and a very large hit gene list.
This was exactly the issue! I've added a new arg to EWCE::bootstrap_enrichment_test called store_gene_data= (and exposed it to MultiEWCE via ... notation). This lets users turn off computing the gene-level scores.
So why did this explode the memory? I believe it's because the data size blows up when you have a large number of reps × hits × celltypes. For example, running EWCE with 100k reps, a hit gene list of length 2000, and 77 celltypes would result in a dataframe with >15 billion rows!:
> xfun::numbers_to_words(100000*2000*77)
[1] "fifteen billion, four hundred million"
Setting store_gene_data=FALSE avoids this and thus drastically reduces memory usage. Using this arg, I'm now running 10 subjobs at once on the Threadripper and the max memory usage is stable at around 80-90Gb. That means only ~8-9Gb per subjob; a huge difference from when each one of these subjobs was crashing a machine with 128Gb!
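For example, something along these lines (a sketch based on the description above; the pass-through of extra arguments from MultiEWCE::gen_results to EWCE::bootstrap_enrichment_test via ... is assumed):

```r
all_results <- MultiEWCE::gen_results(
  ctd = ctd,
  gene_data = gene_data,
  list_name_column = "hpo_id",
  annotLevel = annotLevel,
  reps = 100000,
  cores = 1,
  save_dir = save_dir,
  store_gene_data = FALSE  # passed via `...` to EWCE::bootstrap_enrichment_test
)
```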
Great! Good to have that sorted!
Was just about to generate a report of the new results, but then our Private Cloud crashed right as I was about to do so.
I also need the Private Cloud to regenerate the Human Cell Landscape CTD after making some modifications to the levels.
@eduff is working on getting it back up and running.
Private Cloud is back up, and some additional issues with disk storage filling up completely are now resolved. This allowed me to finish up the last of the DescartesHuman analyses, and launch the HumanCellLandscape analyses!
Here's a high-level summary comparing the old results (Bobby's CTD + old HPO data) vs. the new results (using the standardised CTD + updated HPO data):
So to summarise, we have 2x the number of significant enrichment tests, but ~12% fewer significant phenotypes.
Most importantly, I think the results look a lot more consistently believable (fewer suspect associations, e.g. enteric neurons being associated with intellectual disability).
Here are the significant cell type associations with the top fold-change per phenotype:
As you can see, the cell types make a lot of sense!
HPC is undergoing some maintenance apparently (don't recall it being scheduled for today), but as soon as they're done with this my HumanCellLandscape analyses will automatically start running and should be done <24h after that.
In the meantime, I'm working on mapping the 124 cell types in HumanCellLandscape to the 77 cell types in DescartesHuman. I'm looking for a computational/systematic way to do this, but if that proves too involved I'll do the matching manually.
HumanCellLandscape results finished last week. Will report comparisons between the HumanCellLandscape vs. DescartesHuman results here:
Rerun the phenotype-level gene set enrichment analyses with EWCE using: