Once the single-cell multi-omics data are decomposed into multiple biologically relevant factors, the package provides functionality for further data exploration, analysis, and visualization.
Check out our paper (Suoqin Jin#, Lihua Zhang# & Qing Nie*, Genome Biology, 2020) for the detailed methods and applications.
scAI has been implemented as both an R package and a MATLAB package under the GPL-3 license. Each package includes example workflows that outline the key steps and unique features of scAI. The MATLAB package and examples are available here.
devtools::install_github("sqjin/scAI")
Alternatively, download the source code here and run the following in R:
install.packages(path_to_file, type = 'source', repos = NULL) # path_to_file is the full path and file name of the downloaded source package
This website shows other ways for building and installing an R package.
All the R Markdown files used to generate the walkthroughs can be found in the /examples directory.
object <- run_scAI(object, K, do.fast = TRUE)
Feature selection can reduce the running time of both the scAI model and downstream analyses such as dimension reduction.
The most informative genes can be selected based on their average expression and Fano factor (see our paper for details).
object <- selectFeatures(object, assay = "RNA")
object <- run_scAI(object, K, do.fast = TRUE, hvg.use1 = TRUE)
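The idea of ranking genes by average expression and Fano factor (variance divided by mean) can be illustrated with a minimal base-R sketch. This is not the implementation of selectFeatures; the function name select_by_fano, the thresholds min_mean and n_top, and the toy count matrix are all assumptions made for illustration.

```r
# Illustrative only: rank genes by Fano factor (variance / mean) on a toy matrix.
# This is NOT scAI's selectFeatures(); all names and thresholds are hypothetical.
set.seed(1)
n_genes <- 200
n_cells <- 50

# Toy count matrix: genes in rows, cells in columns.
counts <- matrix(rpois(n_genes * n_cells, lambda = 2), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
# Replace the first 10 genes with overdispersed counts so they tend to rank highly.
counts[1:10, ] <- matrix(rnbinom(10 * n_cells, size = 0.5, mu = 2), nrow = 10)

select_by_fano <- function(x, min_mean = 0.1, n_top = 20) {
  gene_mean <- rowMeans(x)
  gene_var  <- apply(x, 1, var)
  fano      <- gene_var / gene_mean            # NaN when the mean is 0
  # Keep genes that are expressed above the mean threshold and have a usable Fano factor.
  keep   <- which(gene_mean >= min_mean & is.finite(fano))
  ranked <- keep[order(fano[keep], decreasing = TRUE)]
  rownames(x)[head(ranked, n_top)]
}

top_genes <- select_by_fano(counts)
print(top_genes)
```

In a real analysis the thresholds would need tuning for the data set at hand; see the paper for the criteria scAI actually applies.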
Unlike scRNA-seq data, the largely binary nature of scATAC-seq data makes it challenging to perform ‘variable’ feature selection. One option is to select the chromosome regions near the informative genes.
object <- selectFeatures(object, assay = "RNA")
loci.use <- searchGeneRegions(genes = object@var.features[[1]], species = "mouse")
object@var.features[[2]] <- loci.use
object <- run_scAI(object, K, do.fast = TRUE, hvg.use1 = TRUE, hvg.use2 = TRUE)
Another option is to use only the top n% of features or to remove features present in fewer than n cells. This method is used in Signac.
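Both filters can be sketched in base R on a toy binary accessibility matrix. This is an illustration only and calls neither scAI nor Signac; the function name filter_peaks, its arguments min_cells and top_percent, and the simulated data are assumptions.

```r
# Illustrative only: filter peaks of a toy binary accessibility matrix.
# Function and argument names are hypothetical, not part of scAI or Signac.
set.seed(1)
n_peaks <- 100
n_cells <- 40

# Each peak gets its own opening probability, so peaks differ in how many cells they appear in.
open_prob <- runif(n_peaks, min = 0, max = 0.5)
access <- matrix(rbinom(n_peaks * n_cells, size = 1, prob = open_prob),
                 nrow = n_peaks,
                 dimnames = list(paste0("peak", seq_len(n_peaks)), NULL))

filter_peaks <- function(x, min_cells = 5, top_percent = 25) {
  cells_per_peak <- rowSums(x > 0)
  # Option A: drop peaks detected in fewer than min_cells cells.
  by_min_cells <- rownames(x)[cells_per_peak >= min_cells]
  # Option B: keep roughly the top_percent% of peaks by number of cells
  # (ties at the cutoff can push the count slightly above that share).
  cutoff <- quantile(cells_per_peak, probs = 1 - top_percent / 100)
  by_top_percent <- rownames(x)[cells_per_peak >= cutoff]
  list(by_min_cells = by_min_cells, by_top_percent = by_top_percent)
}

selected <- filter_peaks(access)
str(selected)
```

Under these assumptions, either vector of peak names could then be stored as the selected loci in the same way as loci.use above.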
Please install RcppEigen and rfunctions manually if they are not installed automatically.
if(!require(devtools)){ install.packages("devtools")}
install.packages("RcppEigen")
devtools::install_github("jaredhuling/rfunctions")
Troubleshooting: Installing RcppEigen and rfunctions on R >= 3.5 requires Clang >= 6 and gfortran-6.1. On macOS, we recommend following the guidance on the official R page here or in the post. On Windows, please ensure that Rtools is installed.
Install other dependencies
scAI provides functionality for further data exploration, analysis, and visualization. Several additional packages need to be installed for these features.
library(devtools)
install_github('linxihui/NNLM')
install_github("yanwu2014/swne")
install_github("jokergoo/ComplexHeatmap")
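To confirm that the R dependencies named in this README are in place, a small check like the following may help. The helper name missing_packages is hypothetical and not part of scAI; it relies only on base R's requireNamespace, which loads a package's namespace as a side effect when the package is found.

```r
# Illustrative helper (hypothetical name): report which packages are not yet installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Package names taken from the installation steps in this README.
deps <- c("RcppEigen", "rfunctions", "NNLM", "swne", "ComplexHeatmap")
to_install <- missing_packages(deps)

if (length(to_install) > 0) {
  message("Still missing: ", paste(to_install, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```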
Install the Leiden algorithm for identifying cell clusters: pip install leidenalg. Please check here if you run into any trouble.
Install UMAP and FIt-SNE for faster dimension reduction in reducedDims
Using UMAP and FIt-SNE is recommended for computational efficiency when using reducedDims on very large datasets.
-- install UMAP Python package: pip install umap-learn. Please check here if there is any trouble.
-- install FIt-SNE R package: Installing and compiling the necessary software requires FIt-SNE and FFTW. For detailed installation instructions, please visit this page.
If you get the error "clang: error: unsupported option '-fopenmp'" when installing an R package, please check the configuration in ~/.R/Makevars and see this post for detailed configuration. Alternatively, you can reinstall R, because the -fopenmp option is usually added by R automatically if OpenMP is available.
If you are using macOS Mojave (10.14), you might get the error "/usr/local/clang6/bin/../include/c++/v1/math.h:301:15: fatal error: 'math.h' file not found"; please check the post. This error can be resolved by running the following in the terminal:
sudo installer -pkg \
/Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg \
-target /
If you have any problems, comments, or suggestions, please contact Suoqin Jin (suoqin.jin@uci.edu) or Lihua Zhang (lihuaz1@uci.edu).
Jin, S., Zhang, L. & Nie, Q. scAI: an unsupervised approach for the integrative analysis of parallel single-cell transcriptomic and epigenomic profiles. Genome Biol 21, 25 (2020). https://doi.org/10.1186/s13059-020-1932-8