NathanSkene / EWCE

Expression Weighted Celltype Enrichment. See the package website for up-to-date instructions on usage.
https://nathanskene.github.io/EWCE/index.html

DelayedArray, Orthologues, Ingest data, Plots, Normalize, Utilities #23

Closed bschilder closed 3 years ago

bschilder commented 3 years ago

DelayedArray enhancements

Previously, generate.celltype.data() and drop.uninformative.genes() ran into memory problems, described in this issue. These enhancements permit the analysis of arbitrarily large datasets by only pulling chunks of the full data into memory at a time. Operations on those chunks can then be parallelized across cores.
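The chunking idea can be sketched with the DelayedArray package directly. This is a minimal illustration rather than EWCE's internal code: the toy count matrix, the block width of 25 columns, and the use of colSums as the per-chunk operation are all assumptions chosen for the example.

```r
library(DelayedArray)

set.seed(1)
# Toy count matrix standing in for a large expression dataset (assumption).
counts <- matrix(rpois(2000, lambda = 2), nrow = 20, ncol = 100)
da <- DelayedArray(counts)

# Split the columns into blocks of 25, so only one block is realized at a time.
grid <- colAutoGrid(da, ncol = 25)

# Apply a function to each block; each block arrives as an ordinary matrix.
block_sums <- blockApply(da, function(block) colSums(block), grid = grid)

# Stitch the per-block results back together in column order.
col_totals <- unlist(block_sums)
```

Passing a BiocParallel backend through the BPPARAM argument of blockApply() (for example BiocParallel::MulticoreParam(2)) should distribute the per-block work across cores.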

filter_nonorthologues

This new function (and its support functions) is located within filter.genes.without.1to1.homology.r.

ingest_data

Added new function ingest_data() to import scRNAseq datasets across a wide range of formats. Once ingested, these datasets can be passed along to other EWCE functions that can recognize and use the SingleCellExperiment format.

  1. Takes obj, which can be either a path to a dataset file, or the object itself.
  2. Infers the file format or object class, and imports it
  3. Converts the object to a SingleCellExperiment with a sparse DelayedArray.
  4. Optionally allows you to save the SingleCellExperiment as an HDF5SummarizedExperiment, which lets you interact with the dataset without loading all of it into memory at once, and/or take advantage of DelayedArray chunking functions.
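Steps 3 and 4 can be approximated with standard Bioconductor calls. The sketch below does not call ingest_data() and is not its implementation; the simulated sparse matrix, the gene and cell names, and the temporary output directory are assumptions made for illustration.

```r
library(Matrix)
library(SingleCellExperiment)
library(DelayedArray)
library(HDF5Array)

set.seed(1)
# Simulated sparse count matrix: 50 genes x 20 cells (assumption).
mat <- Matrix::rsparsematrix(50, 20, density = 0.1,
                             rand.x = function(n) rpois(n, 3) + 1)
rownames(mat) <- paste0("gene", seq_len(nrow(mat)))
colnames(mat) <- paste0("cell", seq_len(ncol(mat)))

# Step 3 (sketch): wrap the sparse matrix as a DelayedArray inside an SCE.
sce <- SingleCellExperiment(assays = list(counts = DelayedArray(mat)))

# Step 4 (sketch): write to disk, then reload as an HDF5-backed object.
out_dir <- file.path(tempdir(), "sce_hdf5_example")
saveHDF5SummarizedExperiment(sce, dir = out_dir, replace = TRUE)
sce_disk <- loadHDF5SummarizedExperiment(out_dir)
```

The reloaded object keeps its counts on disk, so downstream block-wise operations only read the chunks they need.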

Current formats accepted:

plot_gene_metrics

New function in plot_gene_metrics.R.

find_celltype_markers

New function in find_celltype_markers.R.

scT_normalize

Located in scT_normalize.R.

Utility functions

New general support functions:

Package documentation

Added package.R, which gives users a general description of EWCE when they type ?EWCE.

DESCRIPTION

Updated Suggests/Imports/Remotes to support new functions. When possible, kept in Suggests to minimize EWCE installation time.
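Packages kept in Suggests are not guaranteed to be installed, so functions that rely on them usually check at run time. A minimal sketch of that pattern follows; use_optional is a hypothetical helper name for illustration, not a function in EWCE.

```r
# Hypothetical helper: stop with a clear message if an optional
# (Suggests) package is not installed, otherwise return TRUE invisibly.
use_optional <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this feature. ",
         "Please install it first.", call. = FALSE)
  }
  invisible(TRUE)
}

# Example: 'stats' ships with R, so this check passes silently.
use_optional("stats")
```

The trade-off is a faster default installation in exchange for a run-time error the first time a user reaches a feature whose optional dependency is missing.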

NathanSkene commented 3 years ago

Hi Brian, just looked into merging this and saw that the Travis checks have failed. Can you look into getting them to pass?

Also, are there any small example files for the new loading functions? It would be good if there were, and if loading them was tested as part of the vignette (in a manner that lets people see where the example files are).

bschilder commented 3 years ago

Made several fixes:

Made several upgrades as well:

NathanSkene commented 3 years ago

New changes sound great, very welcome improvements to the package!

Looks like the latest commits are still failing in Travis. @Al-Murphy is working on various improvements to the package at the moment, so I suggest we merge this as follows:

Al-Murphy commented 3 years ago

Hey @bschilder, @NathanSkene,

Just to note, my tests and package script changes have been pushed to the dev branch and passed the travis tests.

bschilder commented 3 years ago

Hmm, looks like biomaRt installation is failing. Not sure why though, I don't think I changed anything about how biomaRt is installed in the DESCRIPTION file (it's still listed under Imports:).

Error: package or namespace load failed for ‘biomaRt’:
 .onLoad failed in loadNamespace() for 'biomaRt', details:
  call: httr::set_config(new_config, override = FALSE)
  error: object 'new_config' not found
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/home/travis/R/Library/biomaRt’
Error in i.p(...) : 
  (converted from warning) installation of package ‘biomaRt’ had non-zero exit status
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
The command "Rscript -e 'deps <- remotes::dev_package_deps(dependencies = NA);remotes::install_deps(dependencies = TRUE);if (!all(deps$package %in% installed.packages())) { message("missing: ", paste(setdiff(deps$package, installed.packages()), collapse=", ")); q(status = 1, save = "no")}'" failed and exited with 1 during .

However it does look like some of the other checks failed because I forgot to include glmGamPoi in Remotes. I've added that and pushed again.

NathanSkene commented 3 years ago

Looks like someone else reported a similar issue recently: https://github.com/grimbough/biomaRt/issues/40


bschilder commented 3 years ago

I see! So I suppose I could switch biomaRt installation to this remote branch in the DESCRIPTION (github::grimbough/biomaRt@3_12_testing), though the better long-term solution might be to just wait for them to push the fixed version to Bioconductor. Do you have a preference @NathanSkene?

NathanSkene commented 3 years ago

I've set them building again on Travis. Sounds like it should resolve itself shortly, so let's leave it for now.


NathanSkene commented 3 years ago

Think there is a new error:

Warning: replacing previous import ‘biomaRt::select’ by ‘dplyr::select’ when loading ‘EWCE’
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  there is no package called ‘glmGamPoi’
Calls: <Anonymous> ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
ERROR: lazy loading failed for package ‘EWCE’
* removing ‘/private/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/RtmpvN8MPj/Rinst1765912603482/EWCE’*

Btw, rather than committing again you can just go to https://travis-ci.org/github/NathanSkene/EWCE and click the 'restart' button.

bschilder commented 3 years ago

@NathanSkene good to know, thanks!

That's weird, I just encountered this same error when trying to install the DelayedArray branch on the RStudio server. But it shouldn't be coming up, since I specify it as a remote in the DESCRIPTION:

Remotes:
    github::ChristophH/sctransform@*release,
    github::mojaveazure/seurat-disk,
    github::cellgeni/sceasy,
    github::const-ae/glmGamPoi

I checked and glmGamPoi does indeed install fine when you run this in R: devtools::install_github("const-ae/glmGamPoi"). After that, installing EWCE@DelayedArray works. Not sure why it's not working via the EWCE installation directly.
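One way to confirm which dependencies are actually missing before retrying the EWCE install, loosely mirroring the check in the Travis command, is sketched below. missing_packages is a hypothetical helper name, not part of EWCE, remotes, or devtools, and the package names passed to it are only examples taken from the Remotes list.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed
# in any library on the current library path.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Example: check two of the packages listed under Remotes.
to_install <- missing_packages(c("glmGamPoi", "sctransform"))
print(to_install)

# Anything reported here could then be installed manually, e.g. with
# devtools::install_github("const-ae/glmGamPoi"), before installing EWCE.
```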

There also seems to be an R version conflict with Seurat. The current version on CRAN requires R>=4.0, but there are definitely older versions that work with R 3.6. Not sure why it can't figure out to just use those. I could try to be more explicit about the Seurat version, but it's also a large package to install for just a couple of examples in data_ingestion(). I'll remove it from Suggests and let the user decide if they need to install it.

* DONE (sctransform)
Installing 111 packages: Rcpp, rappdirs, jsonlite, xfun, rlang, base64enc, digest, mime, stringi, magrittr, glue, markdown, highr, stringr, tinytex, evaluate, htmltools, yaml, knitr, rmarkdown, gtable, plyr, parallelly, listenv, globals, future, colorspace, utf8, assertthat, vctrs, pkgconfig, pillar, lifecycle, fansi, ellipsis, crayon, cli, tibble, rematch, prettyunits, rematch2, diffobj, rstudioapi, pkgbuild, rprojroot, waldo, ps, processx, praise, pkgload, desc, callr, brio, viridisLite, RColorBrewer, R6, munsell, labeling, farver, testthat, withr, scales, isoband, fastmap, sys, askpass, cachem, bit, plogr, BH, memoise, bit64, openssl, generics, tidyselect, purrr, blob, httr, curl, DBI, RSQLite, dplyr, dbplyr, hms, S4Vectors, IRanges, Biobase, BiocGenerics, reticulate, BiocManager, bookdown, RcppArmadillo, matrixStats, gridExtra, reshape2, ggplot2, future.apply, progress, cellranger, BiocFileCache, AnnotationDbi, anndata, Seurat, BiocStyle, RNOmni, ggdendro, HGNChelper, readxl, cowplot, limma, biomaRt
Installing packages into ‘/Users/travis/R/Library’
(as ‘lib’ is unspecified)
Error: (converted from warning) package ‘Seurat’ is not available (for R version 3.6.3)
Execution halted
The command "Rscript -e 'deps <- remotes::dev_package_deps(dependencies = NA);remotes::install_deps(dependencies = TRUE);if (!all(deps$package %in% installed.packages())) { message("missing: ", paste(setdiff(deps$package, installed.packages()), collapse=", ")); q(status = 1, save = "no")}'" failed and exited with 1 during .
Your build has been stopped.