Open ttgump opened 4 months ago
I think that after preprocessing and peak calling, ATAC data can be mapped to small genomic windows (regions). Here is a related paper that surveys many tools and may help: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1929-3#Sec6 . To clarify, all the ATAC-seq data we used in the experiments are public datasets that had already been processed through peak calling. @ChloeXWang, would you like to comment further on this question?
Yes, I understand that ATAC-seq data is a cell-by-peak matrix. My question is about the entries of that matrix, which are binary: there are only 0 and 1 values. RNA-seq counts are different, since they can be assigned to many bins (I think the default setting is 51 bins). How can we assign binary ATAC-seq counts to bins?
I see. The ATAC datasets we have been working with are not binary. Would you be able to access non-binary peak count matrices? Or would you mind elaborating a bit more on your usage scenario (e.g., clustering, integration)? We can see if there are any binning recommendations for your problem at hand.
Yes, we can access the raw count matrix of the ATAC-seq data. Most values are 0, 1, or 2. Do you have any suggestions for binning?
Hi, I have a question about using scGPT on ATAC data. Typical scATAC-seq data has binarized values, 0 and 1, so how should the ATAC counts be binned? Should we use only 2 bins, a 0-bin and a 1-bin? Thanks.
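To make the binning question concrete, here is a minimal sketch of per-cell quantile binning in NumPy. It is only an illustration under stated assumptions, not scGPT's actual preprocessing code: the helper name `bin_counts_per_cell` is hypothetical, the rule "zeros stay in bin 0, non-zero values get quantile bins" is an assumption, and `n_bins=51` is taken from the default mentioned above. The point it shows is that a cell can never occupy more distinct bins than it has distinct count values, so binary data ends up in two bins and 0/1/2 data in three.

```python
import numpy as np


def bin_counts_per_cell(row, n_bins=51):
    """Hypothetical helper: map one cell's peak counts to integer bins.

    Assumption: zeros stay in bin 0, and non-zero values are placed in
    bins 1..n_bins-1 using quantile edges computed from that cell's own
    non-zero values.
    """
    row = np.asarray(row)
    binned = np.zeros(row.shape, dtype=np.int64)
    nonzero = row > 0
    if not nonzero.any():
        return binned  # an all-zero cell stays entirely in bin 0
    values = row[nonzero]
    # n_bins - 1 quantile edges spanning the cell's non-zero values
    edges = np.quantile(values, np.linspace(0, 1, n_bins - 1))
    # right=True gives the index i with edges[i-1] < x <= edges[i];
    # the +1 shift reserves bin 0 for zeros
    binned[nonzero] = np.digitize(values, edges, right=True) + 1
    return binned


# A binarized cell and a raw-count cell with values 0, 1, 2
binary_cell = np.array([0, 1, 0, 1, 1, 0, 0, 1])
raw_cell = np.array([0, 0, 1, 2, 1, 0, 2, 1])

binary_binned = bin_counts_per_cell(binary_cell)
raw_binned = bin_counts_per_cell(raw_cell)

print("distinct bins (binary):", np.unique(binary_binned))
print("distinct bins (0/1/2): ", np.unique(raw_binned))
```

Under these assumptions, asking for 51 bins does not add resolution to sparse ATAC counts; the ordering of values is preserved, but the number of occupied bins is capped by the number of distinct values in each cell.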