satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.26k stars 908 forks source link

Is it possible to run integrated analysis on scATAC-seq data? #1860

Closed jk86754 closed 5 years ago

jk86754 commented 5 years ago

Hi,

Is it possible to run integrated analysis similar to https://satijalab.org/seurat/v3.0/immune_alignment.html on scATAC seq data? Please see my attempt below with the error produced. I followed https://satijalab.org/seurat/v3.0/atacseq_integration_vignette.html for the ATAC part.

peaks.d10 <- Read10X_h5("filtered_peak_bc_matrix.h5") activity.matrix.d10 <- CreateGeneActivityMatrix(peak.matrix = peaks.d10, annotation.file = "../../Mus_musculus.GRCm38.97.gtf", seq.levels = c(1:19, "X", "Y"), upstream = 2000, verbose = TRUE) pbmc.atac.d10 <- CreateSeuratObject(counts = peaks.d10, assay = "ATAC", project = "d10_ATAC") pbmc.atac.d10[["ACTIVITY"]] <- CreateAssayObject(counts = activity.matrix.d10) meta.d10 <- read.table("singlecell.csv", sep = ",", header = TRUE, row.names = 1, stringsAsFactors = FALSE) meta.d10 <- meta.d10[colnames(pbmc.atac.d10), ] pbmc.atac.d10 <- AddMetaData(pbmc.atac.d10, metadata = meta.d10) pbmc.atac.d10.filtered <- subset(pbmc.atac.d10, subset = nCount_ATAC > 2000) pbmc.atac.d10.filtered$tech <- "atac" DefaultAssay(pbmc.atac.d10.filtered) <- "ACTIVITY" pbmc.atac.d10.filtered <- FindVariableFeatures(pbmc.atac.d10.filtered) pbmc.atac.d10.filtered <- NormalizeData(pbmc.atac.d10.filtered) pbmc.atac.d10.filtered <- ScaleData(pbmc.atac.d10.filtered) DefaultAssay(pbmc.atac.d10.filtered) <- "ATAC" VariableFeatures(pbmc.atac.d10.filtered) <- >names(which(Matrix::rowSums(pbmc.atac.d10.filtered) > 100)) pbmc.atac.d10.filtered <- RunLSI(pbmc.atac.d10.filtered, n = 50, scale.max = NULL) pbmc.atac.d10.filtered <- RunTSNE(pbmc.atac.d10.filtered, reduction = "lsi", dims = 1:50) DimPlot(pbmc.atac.d10.filtered, reduction = "tsne") + NoLegend() + ggtitle("scATAC-seq")

repeat for additional time points

############################################

integration

immune.anchors <- FindIntegrationAnchors(object.list = list(pbmc.atac.filtered, pbmc.atac.d3.filtered, pbmc.atac.d7.filtered, pbmc.atac.d10.filtered), dims = 1:20)

Computing 2000 integration features Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic

sessionInfo() R version 3.5.1 (2018-07-02) Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)

Matrix products: default BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] ggplot2_3.2.0 doParallel_1.0.14 iterators_1.0.10 foreach_1.4.4 Seurat_3.0.2

loaded via a namespace (and not attached): [1] Rtsne_0.15 colorspace_1.4-1 ggridges_0.5.1
[4] XVector_0.22.0 GenomicRanges_1.34.0 rstudioapi_0.10
[7] listenv_0.7.0 npsurv_0.4-0 ggrepel_0.8.1
[10] bit64_0.9-7 codetools_0.2-16 splines_3.5.1
[13] R.methodsS3_1.7.1 lsei_1.2-0 jsonlite_1.6
[16] Rsamtools_1.32.3 ica_1.0-2 cluster_2.1.0
[19] png_0.1-7 R.oo_1.22.0 sctransform_0.2.0
[22] compiler_3.5.1 httr_1.4.0 assertthat_0.2.1
[25] Matrix_1.2-17 lazyeval_0.2.2 htmltools_0.3.6
[28] tools_3.5.1 rsvd_1.0.1 igraph_1.2.2
[31] gtable_0.3.0 glue_1.3.1 GenomeInfoDbData_1.2.0
[34] RANN_2.6 reshape2_1.4.3 dplyr_0.8.1
[37] Rcpp_1.0.1 Biobase_2.42.0 Biostrings_2.48.0
[40] gdata_2.18.0 ape_5.2 nlme_3.1-140
[43] rtracklayer_1.40.6 gbRd_0.4-11 lmtest_0.9-37
[46] stringr_1.4.0 globals_0.12.4 irlba_2.3.3
[49] gtools_3.8.1 XML_3.98-1.20 future_1.14.0
[52] MASS_7.3-51.4 zlibbioc_1.28.0 zoo_1.8-6
[55] scales_1.0.0 SummarizedExperiment_1.10.1 RColorBrewer_1.1-2
[58] yaml_2.2.0 reticulate_1.10 pbapply_1.4-1
[61] gridExtra_2.3 stringi_1.4.3 S4Vectors_0.20.1
[64] caTools_1.17.1.1 BiocGenerics_0.28.0 BiocParallel_1.14.2
[67] bibtex_0.4.2 GenomeInfoDb_1.18.2 Rdpack_0.11-0
[70] SDMTools_1.1-221.1 rlang_0.4.0 pkgconfig_2.0.2
[73] bitops_1.0-6 matrixStats_0.54.0 lattice_0.20-38
[76] ROCR_1.0-7 purrr_0.3.2 labeling_0.3
[79] GenomicAlignments_1.18.1 htmlwidgets_1.3 cowplot_1.0.0
[82] bit_1.1-14 tidyselect_0.2.5 plyr_1.8.4
[85] magrittr_1.5 R6_2.4.0 IRanges_2.16.0
[88] gplots_3.0.1.1 DelayedArray_0.8.0 withr_2.1.2
[91] pillar_1.4.2 fitdistrplus_1.0-14 survival_2.44-1.1
[94] RCurl_1.95-4.12 tibble_2.1.3 future.apply_1.3.0
[97] tsne_0.1-3 crayon_1.3.4 hdf5r_1.0.1
[100] KernSmooth_2.23-15 plotly_4.9.0 grid_3.5.1
[103] data.table_1.12.2 metap_1.1 digest_0.6.19
[106] tidyr_0.8.2 R.utils_2.9.0 stats4_3.5.1
[109] munsell_0.5.0

Thanks,

Joe

timoast commented 5 years ago

Hi Joe, this should work if you use the RNA assay, so you should just make sure the DefaultAssay is RNA for all of the objects. You could also try running harmony. Harmony currently only seems to allow running on the PCA, but if you just renamed the LSI dimension reduction to "PCA" I expect it would work, although I have not tested this.

timoast commented 5 years ago

This is now possible through the Signac package

liuxuesunny commented 4 years ago

Hi There, I am trying to integrate three 10x scATAC samples by harmony following the Vignettes, but after integration and running FindNeighbors and FindClusters, the clusters were not well separated and cells from different clusters (color labelled) joined each like the image below. I also attached the code below. A newbie needs help. Thanks.

umap_E17_5_lutr

load("E17_5_7_lutr.rds") load("E17_5_8_lutr.rds") load("E17_5_9_lutr.rds")

E17_5_7_lutr$tech <- 'E17_5_7_lutr' E17_5_7_lutr$celltype <- Idents(E17_5_7_lutr)

E17_5_8_lutr$tech <- 'E17_5_8_lutr' E17_5_8_lutr$celltype <- Idents(E17_5_8_lutr)

E17_5_9_lutr$tech <- 'E17_5_9_lutr' E17_5_9_lutr$celltype <- Idents(E17_5_9_lutr)

intersecting.regions <- GetIntersectingFeatures( object.1 = E17_5_7_lutr, object.2 = E17_5_8_lutr, sep.1 = c(":", "-"), sep.2 = c(":", "-") ) peaks.use <- sample(intersecting.regions[[1]], size = 10000, replace = FALSE)

intersecting.regions <- GetIntersectingFeatures( object.1 = unintegrated1_2, object.2 = E17_5_9_lutr, sep.1 = c(":", "-"), sep.2 = c(":", "-") ) peaks.use <- sample(intersecting.regions[[1]], size = 10000, replace = FALSE)

unintegrated12_3 <- MergeWithRegions( object.1 = unintegrated1_2, object.2 = E17_5_9_lutr, assay.1 = 'peaks', assay.2 = 'peaks', sep.1 = c(":", "-"), sep.2 = c(":", "-"), )

unintegrated12_3 <- RunTFIDF(unintegrated12_3) unintegrated12_3 <- FindTopFeatures(unintegrated12_3, min.cutoff = 50) unintegrated12_3 <- RunSVD(unintegrated123, n = 30, reduction.name = 'lsi', reduction.key = 'LSI') unintegrated12_3 <- RunUMAP(unintegrated12_3, reduction = 'lsi', dims = 2:30, min.dist = 0.8) DimPlot(unintegrated12_3, pt.size = 1.0) + ggplot2::ggtitle("Unintegrated") DimPlot(unintegrated12_3, group.by = 'tech', pt.size = 1.0) + ggplot2::ggtitle("Unintegrated") table(Idents(unintegrated12_3))

library(harmony)

E17_5_lutr <- RunHarmony( object = unintegrated12_3, group.by.vars = 'tech', reduction = 'lsi', assay.use = 'peaks', project.dim = FALSE )

E17_5_lutr <- RunUMAP(E17_5_lutr, dims = 2:30, reduction = 'harmony', min.dist = 0.8) E17_5_lutr <- FindNeighbors( E17_5_lutr, reduction = 'harmony', dims = 2:30 ) E17_5_lutr <- FindClusters( E17_5_lutr, algorithm = 3, resolution = 1.3, verbose = T ) DimPlot(E17_5_lutr, pt.size = 1.0, label = T)