stuart-lab / signac

R toolkit for the analysis of single-cell chromatin data
https://stuartlab.org/signac/

regionmatrix very slow #1675

Closed yamihn closed 6 months ago

yamihn commented 6 months ago

Hi, I would like to count the number of reads for cells in all the peaks of my dataset, so that I can filter out peaks that are poorly enriched in my subset of clusters. However, when I give all the peaks of the matrix as input, it takes a very long time, and I'm not sure whether it is stuck at some point. The code is:

DefaultAssay(obj) <- "ATAC"
peaks <- granges(obj)  # more than 170,000 peaks
obj_try <- RegionMatrix(obj, regions = peaks, assay = "ATAC", group.by = "Subset_SHF", key = "P1", upstream = 0, downstream = 0, verbose = TRUE)

timoast commented 6 months ago

This is not the intended use of the RegionMatrix function. It is meant to look at a small number of sites, similar to what the TF footprinting functions do.

I think what you want is to create a count matrix for your peaks and then filter out peaks with fewer than a certain number of counts in a cluster of cells. To create a count matrix you can use the FeatureMatrix function, although you probably already have this count matrix in your object.
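As a rough sketch of that filtering step (not code from the Signac authors), the per-cluster totals can be obtained by multiplying the peak-by-cell count matrix by a cell-by-cluster indicator matrix. The example below uses a small toy matrix so that it runs on its own; the peak names, cluster labels, and the `min_counts` threshold are all made up for illustration. With a real object, the counts would presumably come from something like `GetAssayData(obj, assay = "ATAC", slot = "counts")` (the argument may be `layer` instead of `slot` depending on the Seurat version) and the labels from `obj$Subset_SHF`.

```r
library(Matrix)

# Toy stand-in for the peak-by-cell count matrix already stored in the assay.
# Assumption: with a real object this would be retrieved with GetAssayData().
counts <- Matrix(
  c(5, 0, 1, 0,
    0, 0, 0, 1,
    3, 4, 0, 0),
  nrow = 3, ncol = 4, byrow = TRUE, sparse = TRUE,
  dimnames = list(
    c("chr1-100-200", "chr1-500-600", "chr2-100-200"),
    c("cell1", "cell2", "cell3", "cell4")
  )
)

# Toy stand-in for obj$Subset_SHF: one cluster label per cell, in column order.
clusters <- factor(c("A", "A", "B", "B"))

# Cell-by-cluster indicator matrix: entry (i, j) is 1 if cell i is in cluster j.
indicator <- sparseMatrix(
  i = seq_along(clusters),
  j = as.integer(clusters),
  x = 1,
  dims = c(length(clusters), nlevels(clusters)),
  dimnames = list(NULL, levels(clusters))
)

# Peak-by-cluster matrix of summed counts.
cluster_counts <- counts %*% indicator

# Hypothetical threshold: keep peaks reaching it in at least one cluster.
min_counts <- 3
peak_max <- apply(as.matrix(cluster_counts), 1, max)
keep <- rownames(cluster_counts)[peak_max >= min_counts]
print(keep)
```

With the toy data, the retained peaks are `chr1-100-200` and `chr2-100-200`. The matrix product never touches the fragment file, which is why it should be far cheaper than calling RegionMatrix on every peak.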