stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/

Issues with Integrating scATAC and scRNA seq Datasets #5809 #1063 Reopening Issue #1082

Closed carversh closed 2 years ago

carversh commented 2 years ago

Hi, I am reopening the issue located here: https://github.com/timoast/signac/issues/1063

Even with the newest version of Signac, I still get the same issue. I am not sure whether it is because the gene names in the RNA-seq data I am importing do not match the reference. If that is a possibility, is there any way you could help me check and fix this?
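One way to check the gene-name question is to compare the two sets of names directly. The sketch below is a hypothetical base-R helper (`check_gene_overlap` is not part of Signac or Seurat) run on toy vectors that stand in for the RNA feature names and the annotation's gene names.

```r
# Hypothetical helper (not a Signac/Seurat function): summarise how many
# RNA gene names are also present in the annotation's gene names.
check_gene_overlap <- function(rna_genes, annotation_genes) {
  rna_genes <- unique(rna_genes)
  shared <- intersect(rna_genes, annotation_genes)
  list(
    n_rna = length(rna_genes),
    n_shared = length(shared),
    fraction_shared = length(shared) / length(rna_genes),
    missing = setdiff(rna_genes, annotation_genes)
  )
}

# Toy vectors, assumed for illustration only
rna_genes <- c("CD3E", "MS4A1", "ENSG00000141510", "GAPDH")
annotation_genes <- c("CD3E", "MS4A1", "GAPDH", "TP53")

overlap <- check_gene_overlap(rna_genes, annotation_genes)
print(overlap$fraction_shared)
print(overlap$missing)
```

With real objects, the call would presumably look like `check_gene_overlap(rownames(rna), annotation$gene_name)`, where `rna` is an assumed name for the RNA Seurat object. A low shared fraction, or Ensembl IDs showing up in `missing`, would suggest a naming mismatch.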

carversh commented 2 years ago

I tried this command and am still getting the same error:

gene.activities <- GeneActivity(morab.atac)

Error in names(cell.convert) <- cells: attempt to set an attribute on NULL
Traceback:

  1. GeneActivity(morab.atac)
  2. FeatureMatrix(fragments = frags, features = transcripts, cells = cells, verbose = verbose, ...)
  3. sapply(X = obj.use, FUN = function(x) { SingleFeatureMatrix(fragment = fragments[[x]], features = features, cells = cells, sep = sep, verbose = verbose, process_n = process_n) })
  4. lapply(X = X, FUN = FUN, ...)
  5. FUN(X[[i]], ...)
  6. SingleFeatureMatrix(fragment = fragments[[x]], features = features, cells = cells, sep = sep, verbose = verbose, process_n = process_n)

timoast commented 2 years ago

Did you save the object using SeuratDisk? There is a known issue with SeuratDisk that causes the same error message: https://github.com/timoast/signac/issues/1071

carversh commented 2 years ago

I am now running into a new issue -- perhaps because I updated Signac.

When I run this:

library('Signac')
library('Seurat')
library('Matrix')
library("data.table")
library('SeuratDisk')
library('ggplot2')
library('bcbioRNASeq')
library('GenomicRanges')
library('EnsDb.Hsapiens.v86')
library('BSgenome.Hsapiens.UCSC.hg38')
#seqlevelsStyle(annotation) <- "UCSC"
# get gene annotations for hg38
annotation <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)
seqlevelsStyle(annotation) <- "UCSC"

# reading in processed atac matrix
m <- readMM(file='matrix.mtx')

#m <- as(object = m, Class = "dgCMatrix")
m <- m*1

# creating barcode and feature labels
barcodes <- fread(file='barcodes_atac.csv', header=F)[[1]]
features <- fread(file='peaks.csv', header=F)[[1]]

# adding these feature and barcode names to the matrix
rownames(m) <- features
colnames(m) <- barcodes

# Define GRanges object using the features
granges <- StringToGRanges(features, sep = c(":", "-"))
granges <- granges[as.vector(seqnames(granges) %in% standardChromosomes(granges)),]

# Create Seurat Chromatin Assay
chrom_assay <- CreateChromatinAssay(
  counts = m,
  sep = c(":", "-"),
  ranges = granges,
  genome = 'hg38',
  fragments = 'fragments.tsv.gz',
  min.cells = 0,
  min.features = 0,
  annotation = annotation
)

# read in metadata
metadata <- read.csv(
  file = "snATAC_metadta.csv",
  header = TRUE,
  row.names = 1
)

morab.atac <- CreateSeuratObject(
  counts = chrom_assay,
  assay = "peaks",
  meta.data = metadata
)

I am getting this error:

Error in SetAssayData.ChromatinAssay(object = new.assay, slot = "annotation", : Annotation genome does not match genome of the object
Traceback:

  1. CreateChromatinAssay(counts = m, sep = c(":", "-"), ranges = granges, genome = "hg38", fragments = "/n/scratch3/users/s/shc989/fragments.tsv.gz", min.cells = 0, min.features = 0, annotation = annotation)
  2. as.ChromatinAssay(x = seurat.assay, ranges = ranges, seqinfo = genome, motifs = motifs, fragments = frags, annotation = annotation, bias = bias, positionEnrichment = positionEnrichment)
  3. as.ChromatinAssay.Assay(x = seurat.assay, ranges = ranges, seqinfo = genome, motifs = motifs, fragments = frags, annotation = annotation, bias = bias, positionEnrichment = positionEnrichment)
  4. SetAssayData(object = new.assay, slot = "annotation", new.data = annotation)
  5. SetAssayData.ChromatinAssay(object = new.assay, slot = "annotation", new.data = annotation)
  6. stop("Annotation genome does not match genome of the object")

timoast commented 2 years ago

Did you run the line seqlevelsStyle(annotation) <- "UCSC"?

Can you show the output of genome(annotation)?

carversh commented 2 years ago

I did run the UCSC line -- I can try to comment that out.

genome(annotation)

chrX 'hg38', chr20 'hg38', chr1 'hg38', chr6 'hg38', chr3 'hg38',
chr7 'hg38', chr12 'hg38', chr11 'hg38', chr4 'hg38', chr17 'hg38',
chr2 'hg38', chr16 'hg38', chr8 'hg38', chr19 'hg38', chr9 'hg38',
chr13 'hg38', chr14 'hg38', chr5 'hg38', chr22 'hg38', chr10 'hg38',
chrY 'hg38', chr18 'hg38', chr15 'hg38', chr21 'hg38', chrM 'hg38'

carversh commented 2 years ago

How do I check the genome version of my fragments.tsv.gz file?
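A fragment file carries no genome label, so the assembly can only be inferred. One heuristic, sketched below in base R, compares the largest fragment end coordinate per chromosome against known chromosome lengths. A fragment that lies beyond the end of a chromosome rules that assembly out. The helper `compatible_assemblies`, the toy `frags` table and the three-chromosome length table are assumptions for illustration. The lengths are the published hg19 and hg38 sizes for chr1, chr2 and chrX.

```r
# Chromosome lengths for a few chromosomes in each assembly
chrom_lengths <- list(
  hg19 = c(chr1 = 249250621, chr2 = 243199373, chrX = 155270560),
  hg38 = c(chr1 = 248956422, chr2 = 242193529, chrX = 156040895)
)

# Hypothetical helper: TRUE for an assembly if no fragment ends beyond
# the end of any chromosome that the fragments and the table share.
compatible_assemblies <- function(frags, lengths = chrom_lengths) {
  max_end <- tapply(frags$end, frags$chrom, max)
  vapply(lengths, function(len) {
    shared <- intersect(names(max_end), names(len))
    length(shared) > 0 && all(max_end[shared] <= len[shared])
  }, logical(1))
}

# Toy fragments (chrom, start, end), assumed for illustration only
frags <- data.frame(
  chrom = c("chr1", "chr1", "chrX"),
  start = c(10000, 248900000, 156000000),
  end   = c(10150, 248900200, 156000180)
)

result <- compatible_assemblies(frags)
print(result)
```

This can only exclude an assembly, never confirm one. A file with no fragments near the chromosome ends will look compatible with both. With a real file, the first three columns (chromosome, start, end) would be read into `frags` first.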

carversh commented 2 years ago

However, I'm pretty sure it's hg38

carversh commented 2 years ago

annotation <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)

# reading in processed atac matrix
m <- readMM(file='/n/scratch3/users/s/shc989/matrix.mtx')

# The coercion below (commented out) was suggested in https://github.com/timoast/signac/issues/309, but it fails with:
# Error in as(object = m, Class = "dgCMatrix"): no method or default for coercing “ngTMatrix” to “dgCMatrix”
# so I am using m*1 as an alternative
#m <- as(object = m, Class = "dgCMatrix")
m <- m*1

# creating barcode and feature labels
barcodes <- fread(file='/n/scratch3/users/s/shc989/barcodes_atac.csv', header=F)[[1]]
features <- fread(file='/n/scratch3/users/s/shc989/peaks.csv', header=F)[[1]]

# adding these feature and barcode names to the matrix
rownames(m) <- features
colnames(m) <- barcodes

# Define GRanges object using the features
granges <- StringToGRanges(features, sep = c(":", "-"))
granges <- granges[as.vector(seqnames(granges) %in% standardChromosomes(granges)),]

# Create Seurat Chromatin Assay
chrom_assay <- CreateChromatinAssay(
  counts = m,
  sep = c(":", "-"),
  ranges = granges,
  genome = 'hg38',
  fragments = '/n/scratch3/users/s/shc989/fragments.tsv.gz',
  min.cells = 0,
  min.features = 0,
  annotation = annotation
)

Running this I still get this error:

Error in SetAssayData.ChromatinAssay(object = new.assay, slot = "annotation", : Annotation genome does not match genome of the object
Traceback:

  1. CreateChromatinAssay(counts = m, sep = c(":", "-"), ranges = granges, genome = "hg38", fragments = "/n/scratch3/users/s/shc989/fragments.tsv.gz", min.cells = 0, min.features = 0, annotation = annotation)
  2. as.ChromatinAssay(x = seurat.assay, ranges = ranges, seqinfo = genome, motifs = motifs, fragments = frags, annotation = annotation, bias = bias, positionEnrichment = positionEnrichment)
  3. as.ChromatinAssay.Assay(x = seurat.assay, ranges = ranges, seqinfo = genome, motifs = motifs, fragments = frags, annotation = annotation, bias = bias, positionEnrichment = positionEnrichment)
  4. SetAssayData(object = new.assay, slot = "annotation", new.data = annotation)
  5. SetAssayData.ChromatinAssay(object = new.assay, slot = "annotation", new.data = annotation)
  6. stop("Annotation genome does not match genome of the object")

carversh commented 2 years ago

It seems to run when I comment out the genome line... is there a reason for that?

timoast commented 2 years ago

The error occurs because, when you set the genome parameter, we check that the genome of the supplied annotation matches the genome name that you set. If you don't set the genome name, you won't see the error. I don't know why you'd see this error when you're setting genome = "hg38" and genome(annotation) returns hg38, but I would just proceed without setting the genome parameter.

I'm assuming then that the reason for the original error was saving objects using SeuratDisk, so I'll close this again.
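For anyone who wants to look at the genome labels more closely before dropping the genome parameter, the sketch below compares a named vector of per-chromosome labels (the shape `genome(annotation)` returned above) against an expected name. It does not reproduce Signac's internal check. `summarise_genome_labels` is a hypothetical base-R helper, and the toy vector with an `NA` entry is an assumption, not something observed in this thread.

```r
# Hypothetical helper: report which sequence names carry a genome label
# that is missing or different from the expected one.
summarise_genome_labels <- function(labels, expected) {
  observed <- unique(unname(labels))
  list(
    observed = observed,
    all_match = length(observed) == 1 && !is.na(observed) && observed == expected,
    mismatched = names(labels)[is.na(labels) | labels != expected]
  )
}

# Toy label vectors, assumed for illustration only
clean_labels <- c(chr1 = "hg38", chr2 = "hg38", chrM = "hg38")
mixed_labels <- c(chr1 = "hg38", chr2 = "hg38", chrM = NA)

clean <- summarise_genome_labels(clean_labels, "hg38")
mixed <- summarise_genome_labels(mixed_labels, "hg38")
print(clean$all_match)
print(mixed$mismatched)
```

With real objects, the input would be `genome(annotation)` and the corresponding labels of the ranges passed to `CreateChromatinAssay`. Any name listed under `mismatched` is a candidate for the discrepancy.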