ATAC data of Paired-seq looks strange using Signac

Telogen commented 2 years ago

Hi Dr. Zhu, I'm processing the ATAC data from Paired-seq using Signac:

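library(Seurat)
library(Signac)

# Load the filtered DNA count matrix and build a chromatin assay
counts <- Read10X("./data/07.Paired-seq_DNA_filtered_matrix")
chrom_assay <- CreateChromatinAssay(counts, sep = c(":", "-"))
pbmc.ATAC <- CreateSeuratObject(chrom_assay, assay = "ATAC")
# Normalize with TF-IDF, keep all features, and reduce dimensions with LSI
pbmc.ATAC <- RunTFIDF(pbmc.ATAC)
pbmc.ATAC <- FindTopFeatures(pbmc.ATAC, min.cutoff = "q0")
pbmc.ATAC <- RunSVD(pbmc.ATAC)
# Check how strongly each LSI component correlates with sequencing depth
DepthCor(pbmc.ATAC, n = 50)
pbmc.ATAC <- RunUMAP(pbmc.ATAC, dims = 2:50, reduction = "lsi", verbose = FALSE)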

# Attach the RNA-based annotations, matched to the ATAC cells by Cell_ID
metadata <- read.csv("./data/01.Paired-Tag_seq_RNA_filtered_matrix/Nuclei_metaData.csv")
rownames(metadata) <- metadata$Cell_ID
metadata_ATAC <- metadata[colnames(pbmc.ATAC), ]
pbmc.ATAC$true <- metadata_ATAC$Annotation

DimPlot(object = pbmc.ATAC, label = TRUE, group.by = "true") + NoLegend()

I got this UMAP as the output. It looks strange; could you please give me some suggestions on how to process the data properly? Many thanks!
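For readers following along: the `DepthCor` call and `dims = 2:50` in the code above reflect a common check, namely that an LSI component strongly correlated with sequencing depth is usually left out of downstream steps. Below is a minimal base-R sketch of that idea on simulated data. The toy embedding, the simulated `depth` vector, and the `cutoff` of 0.5 are all assumptions made for illustration; none of them come from the Paired-seq data or from Signac's internals.

```r
set.seed(1)

# Simulated data only: 200 cells with hypothetical per-cell fragment counts
n_cells <- 200
depth <- rpois(n_cells, lambda = 5000)

# Toy embedding with 5 components; component 1 is built to track depth
embedding <- matrix(rnorm(n_cells * 5), nrow = n_cells)
embedding[, 1] <- scale(log10(depth))[, 1] + rnorm(n_cells, sd = 0.1)
colnames(embedding) <- paste0("LSI_", 1:5)

# Correlation between each component and sequencing depth
depth_cor <- apply(embedding, 2, function(component) cor(component, depth))

# Keep components whose absolute correlation is below a hypothetical cutoff
cutoff <- 0.5
keep_dims <- which(abs(depth_cor) < cutoff)

print(round(depth_cor, 2))
print(keep_dims)
```

On real data, the kept indices would be passed as `dims` to `RunUMAP` in place of a fixed `2:50`.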

cxzhu commented 2 years ago

Hi @Telogen,

With Signac, I can recover most of the major cell types, consistent with the RNA-based clustering, by fine-tuning some parameters (please see the attached file: Untitled.html.zip).

By the way, I strongly recommend trying SnapATAC2 for processing chromatin data; in our hands, the SnapATAC package showed better cell-type separation and faster processing of larger datasets.

Best, Chenxu

Telogen commented 2 years ago

Thanks, Dr. Zhu! I didn't expect the TF-IDF method to have such a big impact on the results, and I'd be glad to give SnapATAC2 a try!
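For context on why the TF-IDF choice can matter: Signac's `RunTFIDF` documents four weighting variants, selected through its `method` argument. Below is a minimal base-R sketch of those four weightings on a toy matrix. The helper `tfidf()`, the toy counts, and the exact placement of the scale factor are assumptions for illustration; this is not Signac's code, and it does not reproduce the settings in the attached notebook, which are not stated in this thread.

```r
# Toy peak-by-cell count matrix (rows = peaks, columns = cells); simulated values
counts <- matrix(
  c(2, 0, 1,
    0, 3, 1,
    1, 1, 0,
    4, 0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("cell", 1:3))
)

# Hypothetical helper sketching the four documented weighting variants
tfidf <- function(counts, method = 1, scale_factor = 1e4) {
  tf <- sweep(counts, 2, colSums(counts), "/")  # counts / total counts per cell
  idf <- ncol(counts) / rowSums(counts)         # number of cells / total counts per peak
  switch(
    as.character(method),
    "1" = log1p(tf * idf * scale_factor),          # log(TF x IDF)
    "2" = tf * log1p(idf),                         # TF x log(IDF)
    "3" = log1p(tf * scale_factor) * log1p(idf),   # log(TF) x log(IDF)
    "4" = counts * idf,                            # IDF only, no TF
    stop("method must be 1, 2, 3, or 4")
  )
}

m1 <- tfidf(counts, method = 1)
m3 <- tfidf(counts, method = 3)
print(round(m1, 2))
print(round(m3, 2))
```

In Signac itself, the variant would be chosen with a call such as `RunTFIDF(pbmc.ATAC, method = 3)`; the default is `method = 1`, which is what the original code in this issue uses.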

cxzhu / Paired-Tag, issue #12