Closed LunaP-831 closed 1 year ago
ind.atac@meta.data <- meta
Never assign anything to the object using the @
operator, as this bypasses all the checks that we do when setting information in the object, and you can end up with an invalid object. Same goes for accessing information, best to avoid accessing slots directly using @
. Here you should use the AddMetaData()
function instead.
I suspect the metadata overwrite that you're doing is part of the problem, as you changed the rownames of that dataframe and they may not match the cell names or order used in the rest of the object.
If the merge is taking days, that's a sign that something is off. I'd suggest skipping the whole part where you're trying to change the metadata and see if that works as expected, to narrow down what the issue is. Also make sure you're using the same assay name for the ATAC object and the ATAC assay in the multiome object.
You should also update to the latest release of Signac (1.9.0)
Hello,
I have been trying to integrate my multiome object and my scATAC-seq data from 10X genomics. The multiome has already been analyzed so I was just trying to merge the multiome with scATAC data and then integrated these datasets. I am following this vignette :https://stuartlab.org/signac/articles/integrate_atac.html
However, after I run the merging command I got this error :
Then I have seen this post basically modified my scATAC object as such
After this, the merging took a few days without giving me any errors and was completed but then I wanted to group the merged by "dataset" and plot the information on a UMAP. However, when I tried this it gave the following error.
I am wondering if there is a problem with the steps I am taking and if so how can I improve these. Thank you!
Please find my code and session info attached.
`###### load libraries library(Signac) library(Seurat) library(pbapply) library(future) library(ggplot2)
plan("multicore", workers = 86) options(future.globals.maxSize = 100 * 1024^3)
set seed
set.seed(1234)
hg38 annotation
annotation <- readRDS("/mnt/labcode/src/0.1.hg38_annotation.rds")
read the scATAC-seq aggregate counts
counts_atac <-Read10X_h5("/mnt/aggregated/outs/filtered_peak_bc_matrix.h5")
path to fragments
fragpath <- "/mnt/aggregated/outs/fragments.tsv.gz"
count fragments per cell
fragcounts <- CountFragments(fragments = fragpath)
read the file where the cell information is stored
human.atac <- read.delim("/mnt/aggregated/tables/cells_atac.txt") ##28926 cells
atac cells
cells.atac <-human.atac$barcode
create the fragment object
atac.frags <- CreateFragmentObject(path = fragpath, cells = cells.atac)
read in the multiome object
multiome <- readRDS("/mnt/multiome/Analyzed_multiome.rds")
quantify multiome peaks in the scATAC-seq dataset
counts <- FeatureMatrix( fragments = atac.frags, features = granges(multiome), cells = cells.atac )
chromatin assay
atac.assay <- CreateChromatinAssay( counts = counts, fragments = atac.frags, min.cells = 10, min.features = 200 )
Seurat object
ind.atac <- CreateSeuratObject(counts = atac.assay, assay = "peaks") #351352 28926
compute LSI
ind.atac <- FindTopFeatures(ind.atac, min.cutoff = 'q0') ind.atac <- RunTFIDF(ind.atac) ind.atac <- RunSVD(ind.atac)
add dataset info
ind.atac$dataset <- "ATAC" multiome$dataset <- "Multiome"
add sample info to the scAtac object
ind.atac@meta.data$barcode <- rownames(ind.atac@meta.data) samples_atac <- read.csv("/mnt/aggregated/outs/aggregation_csv.csv") atac.cells <- as.data.frame(row.names(ind.atac@meta.data)) colnames(atac.cells) <- "barcodes" rownames(atac.cells) <- atac.cells$barcodes atac.cells$libcodes <- as.factor(gsub(pattern=".+-", replacement="", atac.cells$barcodes)) atac.cells$samples <- as.vector(samples_atac$library_id[atac.cells$libcodes]) sampleidentity <- atac.cells["samples"] ind.atac@meta.data$sample <- sampleidentity[row.names(ind.atac@meta.data),]
meta <- ind.atac@meta.data meta$libcodes <- meta$barcode
change the numbers at the end of barcodes with gsub #to match the ordering
meta$libcodes <- ifelse(endsWith( meta$barcode,"1"),gsub("1","9", meta$barcode) , meta$libcodes) meta$libcodes <- ifelse(endsWith( meta$barcode,"2"),gsub("2","10", meta$barcode) , meta$libcodes)
fix the dataframe according to right barcodes
row.names(meta) <- meta$libcodes meta$barcode <- meta$libcodes meta$libcodes <- as.factor(gsub(pattern=".+-", replacement="", meta$barcode))
assign it back to the individual object
ind.atac@meta.data <- meta
ind.atac@active.ident <- factor(rep(ind.atac@meta.data$sample, length(rownames(ind.atac@meta.data)))) names(ind.atac@active.ident) <- rownames(ind.atac@meta.data)
merge
combined <- merge(ind.atac, multiome)
process the combined dataset
combined <- FindTopFeatures(combined, min.cutoff = 10) combined <- RunTFIDF(combined) combined <- RunSVD(combined) combined <- RunUMAP(combined, reduction = "lsi", dims = 2:50)
the step I am getting the error
figure_file<-paste0(getwd(),"scATAC_merged_dataset.pdf") pdf(figure_file, width = 15, height = 10, useDingbats=FALSE) p1 <- DimPlot(combined, group.by = "dataset") p1 dev.off() `
`
sessionInfo()
R version 4.0.5 (2021-03-31) Platform: x86_64-conda-linux-gnu (64-bit) Running under: Ubuntu 22.04.2 LTS
Matrix products: default BLAS/LAPACK: /mnt/RAID/anaconda3/envs/cloned_env/lib/libopenblasp-r0.3.20.so
locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] future_1.27.0 pbapply_1.5-0 sp_1.5-0 SeuratObject_4.1.0 [5] Seurat_4.1.1 Signac_1.2.1
loaded via a namespace (and not attached): [1] fastmatch_1.1-3 plyr_1.8.7 igraph_1.3.0
[4] lazyeval_0.2.2 splines_4.0.5 BiocParallel_1.24.1
[7] listenv_0.8.0 scattermore_0.8 SnowballC_0.7.0
[10] GenomeInfoDb_1.26.4 ggplot2_3.3.6 digest_0.6.29
[13] htmltools_0.5.3 fansi_1.0.3 magrittr_2.0.3
[16] tensor_1.5 cluster_2.1.3 ROCR_1.0-11
[19] globals_0.15.1 Biostrings_2.58.0 matrixStats_0.62.0
[22] docopt_0.7.1 spatstat.sparse_2.1-1 colorspace_2.0-3
[25] ggrepel_0.9.1 dplyr_1.0.9 sparsesvd_0.2
[28] crayon_1.5.1 RCurl_1.98-1.7 jsonlite_1.8.0
[31] progressr_0.10.1 spatstat.data_2.2-0 survival_3.3-1
[34] zoo_1.8-10 glue_1.6.2 polyclip_1.10-0
[37] gtable_0.3.0 zlibbioc_1.36.0 XVector_0.30.0
[40] leiden_0.4.2 future.apply_1.9.0 BiocGenerics_0.36.0
[43] abind_1.4-5 scales_1.2.0 DBI_1.1.3
[46] spatstat.random_2.2-0 miniUI_0.1.1.1 Rcpp_1.0.9
[49] viridisLite_0.4.0 xtable_1.8-4 reticulate_1.25
[52] spatstat.core_2.4-4 bit_4.0.4 stats4_4.0.5
[55] htmlwidgets_1.5.4 httr_1.4.3 RColorBrewer_1.1-3
[58] ellipsis_0.3.2 ica_1.0-3 pkgconfig_2.0.3
[61] farver_2.1.1 ggseqlogo_0.1 uwot_0.1.11
[64] deldir_1.0-6 utf8_1.2.2 tidyselect_1.1.2
[67] rlang_1.0.4 reshape2_1.4.4 later_1.2.0
[70] munsell_0.5.0 tools_4.0.5 cli_3.3.0
[73] generics_0.1.3 ggridges_0.5.3 stringr_1.4.0
[76] fastmap_1.1.0 goftest_1.2-3 bit64_4.0.5
[79] fitdistrplus_1.1-8 purrr_0.3.4 RANN_2.6.1
[82] nlme_3.1-158 mime_0.12 slam_0.1-50
[85] RcppRoll_0.3.0 hdf5r_1.3.5 compiler_4.0.5
[88] plotly_4.10.0 png_0.1-7 spatstat.utils_2.3-1
[91] tibble_3.1.8 tweenr_1.0.2 stringi_1.7.6
[94] RSpectra_0.16-1 rgeos_0.5-9 lattice_0.20-45
[97] Matrix_1.4-1 vctrs_0.4.1 pillar_1.8.0
[100] lifecycle_1.0.1 spatstat.geom_2.4-0 lmtest_0.9-40
[103] RcppAnnoy_0.0.19 data.table_1.14.2 cowplot_1.1.1
[106] bitops_1.0-7 irlba_2.3.5 httpuv_1.6.5
[109] patchwork_1.1.1 GenomicRanges_1.42.0 R6_2.5.1
[112] promises_1.2.0.1 KernSmooth_2.23-20 gridExtra_2.3
[115] lsa_0.73.3 IRanges_2.24.1 parallelly_1.32.1
[118] codetools_0.2-18 MASS_7.3-58 assertthat_0.2.1
[121] qlcMatrix_0.9.7 sctransform_0.3.3 Rsamtools_2.6.0
[124] S4Vectors_0.28.1 GenomeInfoDbData_1.2.4 mgcv_1.8-40
[127] parallel_4.0.5 grid_4.0.5 rpart_4.1.16
[130] tidyr_1.2.0 Rtsne_0.16 ggforce_0.3.3
[133] shiny_1.7.2
`