stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/
Other
331 stars 88 forks source link

Issues with Integrating scATAC and scRNA seq Datasets #5809 #1063

Closed carversh closed 2 years ago

carversh commented 2 years ago

Hello! I am trying to integrate two sc datasets and am having issues with the integration. The datasets obtained are both pre-processed. Here is my code:

# making chromatin assay to ATAC seurat object
# get gene annotations for hg38
annotation <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)
seqlevelsStyle(annotation) <- "UCSC"
# reading in processed atac matrix
m <- readMM(file='/n/scratch3/users/s/shc989/matrix.mtx')
m <- m*1

# creating barcode and feature labels
barcodes <- fread(file='barcodes_atac.csv', header=F)[[1]]
features <- fread(file='peaks.csv', header=F)[[1]]

# adding these feature and barcode names to the matrix
rownames(m) <- features
colnames(m) <- barcodes

# Define GRanges object using the features
granges <- StringToGRanges(features, sep = c(":", "-"))
granges <- granges[as.vector(seqnames(granges) %in% standardChromosomes(granges)),]

# Create Seurat Chromatin Assay
chrom_assay <- CreateChromatinAssay(
  counts = m,
  sep = c(":", "-"),
  ranges = granges,
  genome = 'hg38',
  fragments = 'fragments.tsv.gz',
  min.cells = 0,
  min.features = 0,
  annotation = annotation
)

# read in metadata
metadata <- read.csv(
  file = "snATAC_metadta.csv",
  header = TRUE,
  row.names = 1
)

morab.atac <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks",
  meta.data = metadata
)

# process atac data

# We exclude the first dimension as this is typically correlated with sequencing depth
morab.atac <- RunTFIDF(morab.atac)
morab.atac <- FindTopFeatures(morab.atac, min.cutoff = "q0")
morab.atac <- RunSVD(morab.atac)
morab.atac <- RunUMAP(morab.atac, reduction = "lsi", dims = 2:30, reduction.name = "umap.atac", reduction.key = "atacUMAP_")
# making rna seq object
##loading in 10x processed data, scaling and saving as RDS file
data_dir <- '/home/shc989/Gusev_Lab/Real_Data/data.dir'
list.files(data_dir) # Should show barcodes.tsv, genes.tsv, and matrix.mtx
expression_matrix <- Read10X(data.dir = data_dir, gene.column=1, strip.suffix = TRUE)

# read in metadata
meta = read.csv(file = '/n/scratch3/users/s/shc989/snRNA_metadta.csv')

# create seurat object
morab.rna = CreateSeuratObject(counts = expression_matrix)

morab.rna <- NormalizeData(morab.rna)
morab.rna <- FindVariableFeatures(morab.rna)
morab.rna <- ScaleData(morab.rna)
morab.rna <- RunPCA(morab.rna)
morab.rna <- RunUMAP(morab.rna, dims = 1:30)

# add metadata to seurat object
meta = read.csv(file = 'snRNA_metadta.csv', header = TRUE, row.names = 1)
morab.rna <- AddMetaData(object =  morab.rna, metadata = meta)

Now, when I go to integrate the data I get the following error and am not sure what to do: gene.activities <- GeneActivity(morab.atac, features = VariableFeatures(morab.rna))

`Error in names(cell.convert) <- cells: attempt to set an attribute on NULL Traceback:

  1. GeneActivity(morab.atac, features = VariableFeatures(morab.rna))
  2. FeatureMatrix(fragments = frags, features = transcripts, cells = cells, . verbose = verbose, ...)
  3. sapply(X = obj.use, FUN = function(x) { . SingleFeatureMatrix(fragment = fragments[[x]], features = features, . cells = cells, sep = sep, verbose = verbose, process_n = process_n) . })
  4. lapply(X = X, FUN = FUN, ...)
  5. FUN(X[[i]], ...)
  6. SingleFeatureMatrix(fragment = fragments[[x]], features = features, . cells = cells, sep = sep, verbose = verbose, process_n = process_n)`

I would really appreciate help on this! Thank you in advance.

carversh commented 2 years ago

Session info:

R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg38_1.4.4 BSgenome_1.62.0                  
 [3] rtracklayer_1.54.0                Biostrings_2.62.0                
 [5] XVector_0.34.0                    EnsDb.Hsapiens.v86_2.99.0        
 [7] ensembldb_2.18.3                  AnnotationFilter_1.18.0          
 [9] GenomicFeatures_1.46.5            AnnotationDbi_1.56.2             
[11] bcbioRNASeq_0.4.0                 SummarizedExperiment_1.24.0      
[13] Biobase_2.54.0                    GenomicRanges_1.46.1             
[15] GenomeInfoDb_1.30.1               IRanges_2.28.0                   
[17] S4Vectors_0.32.3                  BiocGenerics_0.40.0              
[19] MatrixGenerics_1.6.0              matrixStats_0.61.0               
[21] ggplot2_3.3.5                     SeuratDisk_0.0.0.9019            
[23] data.table_1.14.2                 Matrix_1.4-0                     
[25] SeuratObject_4.0.4                Seurat_4.1.0                     
[27] Signac_1.5.0                     

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                  reticulate_1.24            
  [3] tidyselect_1.1.2            RSQLite_2.2.10             
  [5] htmlwidgets_1.5.4           grid_4.1.1                 
  [7] docopt_0.7.1                BiocParallel_1.28.3        
  [9] Rtsne_0.15                  AcidPlots_0.4.0            
 [11] munsell_0.5.0               codetools_0.2-18           
 [13] ica_1.0-2                   pbdZMQ_0.3-7               
 [15] future_1.24.0               miniUI_0.1.1.1             
 [17] AcidGenerics_0.6.0          withr_2.5.0                
 [19] spatstat.random_2.1-0       colorspace_2.0-3           
 [21] pipette_0.8.0               filelock_1.0.2             
 [23] knitr_1.37                  uuid_1.0-3                 
 [25] SingleCellExperiment_1.16.0 ROCR_1.0-11                
 [27] tensor_1.5                  listenv_0.8.0              
 [29] slam_0.1-50                 tximport_1.22.0            
 [31] repr_1.1.4                  GenomeInfoDbData_1.2.7     
 [33] polyclip_1.10-0             bit64_4.0.5                
 [35] farver_2.1.0                parallelly_1.30.0          
 [37] vctrs_0.3.8                 generics_0.1.2             
 [39] xfun_0.30                   BiocFileCache_2.2.1        
 [41] lsa_0.73.2                  ggseqlogo_0.1              
 [43] R6_2.5.1                    locfit_1.5-9.5             
 [45] hdf5r_1.3.5                 cachem_1.0.6               
 [47] bitops_1.0-7                spatstat.utils_2.3-0       
 [49] DelayedArray_0.20.0         assertthat_0.2.1           
 [51] promises_1.2.0.1            BiocIO_1.4.0               
 [53] scales_1.1.1                gtable_0.3.0               
 [55] globals_0.14.0              processx_3.5.2             
 [57] goftest_1.2-3               AcidExperiment_0.3.0       
 [59] rlang_1.0.2                 genefilter_1.76.0          
 [61] AcidMarkdown_0.1.6          RcppRoll_0.3.0             
 [63] splines_4.1.1               lazyeval_0.2.2             
 [65] spatstat.geom_2.4-0         yaml_2.3.5                 
 [67] reshape2_1.4.4              abind_1.4-5                
 [69] httpuv_1.6.5                syntactic_0.5.2            
 [71] tools_4.1.1                 ellipsis_0.3.2             
 [73] spatstat.core_2.4-0         RColorBrewer_1.1-2         
 [75] goalie_0.6.0                sessioninfo_1.2.2          
 [77] ggridges_0.5.3              Rcpp_1.0.8.3               
 [79] plyr_1.8.7                  progress_1.2.2             
 [81] base64enc_0.1-3             zlibbioc_1.40.0            
 [83] purrr_0.3.4                 RCurl_1.98-1.6             
 [85] prettyunits_1.1.1           ps_1.6.0                   
 [87] rpart_4.1-15                deldir_1.0-6               
 [89] pbapply_1.5-0               cowplot_1.1.1              
 [91] zoo_1.8-9                   ggrepel_0.9.1              
 [93] cluster_2.1.2               magrittr_2.0.2             
 [95] scattermore_0.8             lmtest_0.9-40              
 [97] RANN_2.6.1                  SnowballC_0.7.0            
 [99] ProtGenerics_1.26.0         fitdistrplus_1.1-8         
[101] AcidCLI_0.2.0               hms_1.1.1                  
[103] patchwork_1.1.1             mime_0.12                  
[105] AcidBase_0.5.0              evaluate_0.15              
[107] xtable_1.8-4                XML_3.99-0.9               
[109] sparsesvd_0.2               gridExtra_2.3              
[111] biomaRt_2.50.3              compiler_4.1.1             
[113] tibble_3.1.6                KernSmooth_2.23-20         
[115] crayon_1.5.1                htmltools_0.5.2            
[117] mgcv_1.8-36                 later_1.3.0                
[119] geneplotter_1.72.0          tidyr_1.2.0                
[121] DBI_1.1.2                   tweenr_1.0.2               
[123] dbplyr_2.1.1                rappdirs_0.3.3             
[125] MASS_7.3-54                 cli_3.2.0                  
[127] parallel_4.1.1              AcidSingleCell_0.2.0       
[129] igraph_1.2.11               pkgconfig_2.0.3            
[131] GenomicAlignments_1.30.0    IRdisplay_1.1              
[133] plotly_4.10.0               spatstat.sparse_2.1-0      
[135] xml2_1.3.3                  annotate_1.72.0            
[137] AcidGenomes_0.3.0           stringr_1.4.0              
[139] digest_0.6.29               sctransform_0.3.3          
[141] RcppAnnoy_0.0.19            spatstat.data_2.1-4        
[143] AcidPlyr_0.2.0              leiden_0.3.9               
[145] fastmatch_1.1-3             edgeR_3.36.0               
[147] uwot_0.1.11                 restfulr_0.0.13            
[149] curl_4.3.2                  shiny_1.7.1                
[151] Rsamtools_2.10.0            rjson_0.2.21               
[153] lifecycle_1.0.1             nlme_3.1-152               
[155] jsonlite_1.8.0              limma_3.50.1               
[157] viridisLite_0.4.0           fansi_1.0.3                
[159] pillar_1.7.0                lattice_0.20-44            
[161] KEGGREST_1.34.0             fastmap_1.1.0              
[163] httr_1.4.2                  survival_3.2-11            
[165] glue_1.6.2                  qlcMatrix_0.9.7            
[167] png_0.1-7                   bit_4.0.4                  
[169] bcbioBase_0.7.0             ggforce_0.3.3              
[171] stringi_1.7.6               blob_1.2.2                 
[173] DESeq2_1.34.0               memoise_2.0.1              
[175] IRkernel_1.3.0.9000         dplyr_1.0.8                
[177] irlba_2.3.5                 future.apply_1.8.1   
timoast commented 2 years ago

I wasn't able to reproduce this issue, can you try updating to the latest Signac version on the develop branch and see if you still have the error?

timoast commented 2 years ago

Closing this now, please reopen if you're still having issues

carversh commented 2 years ago

Hi, I've updated Signac and am still running into the same issue. I'm using Signac version 1.6.0

carversh commented 2 years ago

Can you please reopen this issue?