kaizhang / SnapATAC2

Single-cell epigenomics analysis tools
https://kzhang.org/SnapATAC2/

Extracting fragment count matrix #316

Open lf96abc opened 3 months ago

lf96abc commented 3 months ago

Hello,

Thanks for the great package!

I am trying to extract the fragment count matrix to use with the following notebook:

https://github.com/aertslab/pycisTopic/blob/old/notebooks/Toy_melanoma-RTD.ipynb

However, I cannot see where to extract the fragment count matrix. I believe this is stored in adata.obsm['fragment_paired'], but I cannot see how to extract the fragment names.

Thank you for your help

kaizhang commented 3 months ago

adata.obsm['fragment_paired'] stores a single-base-resolution count matrix in a compact format, as described here: https://kzhang.org/SnapATAC2/api/_autosummary/snapatac2.pp.import_data.html#snapatac2.pp.import_data

This matrix is not easy to extract because it is huge. Instead, you can extract a lower-resolution version of it after calling pp.add_tile_matrix. If you do want a single-base-resolution count matrix, you can call pp.add_tile_matrix with bin_size=1, but this will likely run out of memory.
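To make the resolution trade-off concrete, here is a minimal pure-Python sketch. It does not use the SnapATAC2 API and is not how the package implements tiling. The chromosome names and lengths, the toy fragments, and the helpers `n_tiles` and `tile_counts` are all hypothetical, and counting both fragment ends as insertion sites is an assumption made for illustration.

```python
import math
from collections import Counter

def n_tiles(chrom_sizes, bin_size):
    """Number of fixed-width bins (matrix columns) needed to cover all chromosomes."""
    return sum(math.ceil(length / bin_size) for length in chrom_sizes.values())

def tile_counts(fragments, bin_size):
    """Count fragment ends per (barcode, chromosome, bin index).

    Assumption: fragments are BED-style half-open intervals, so the two
    cut sites of a fragment are `start` and `end - 1`.
    """
    counts = Counter()
    for chrom, start, end, barcode in fragments:
        for pos in (start, end - 1):
            counts[(barcode, chrom, pos // bin_size)] += 1
    return counts

# Hypothetical chromosome lengths, chosen only to show the scale of the problem.
chrom_sizes = {"chrA": 250_000_000, "chrB": 150_000_000, "chrC": 50_000_000}

# Hypothetical fragments: (chromosome, start, end, cell barcode).
fragments = [
    ("chrA", 100, 250, "cell1"),
    ("chrA", 480, 620, "cell1"),
    ("chrA", 510, 700, "cell2"),
]

for bin_size in (5000, 500, 1):
    print(f"bin_size={bin_size}: {n_tiles(chrom_sizes, bin_size):,} columns")

counts = tile_counts(fragments, bin_size=500)
```

With a bin size of 1 the number of columns equals the number of bases, which is why a single-base matrix is impractical to materialize even in sparse form, while a bin size in the hundreds keeps the column count manageable.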

yojetsharma commented 1 month ago

So is there any other way around this, so that the fragments generated and the annotation done using SnapATAC2 can be used with pycisTopic?

kaizhang commented 1 month ago

What information exactly do you need?

yojetsharma commented 1 month ago

The starting inputs required in the pycisTopic tutorial are the fragment files and the annotations for each cell. Since both are already stored in the SnapATAC2 AnnData object, I was wondering whether the fragments and cell_data could be extracted and used as input for pycisTopic.
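For reference, the two inputs being asked about could look like the sketch below on disk. This is only an illustration of the target files, assuming a tab-separated fragment layout of chromosome, start, end, barcode, and count, plus a per-cell metadata table (in AnnData, per-cell annotations normally live in adata.obs). It does not show how to pull fragment records out of adata.obsm['fragment_paired'], and the records, barcodes, column names, and helper functions are hypothetical.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical fragment records: (chromosome, start, end, barcode, count).
fragments = [
    ("chr1", 20_000, 20_180, "AAAC-1", 1),
    ("chr1", 10_000, 10_150, "AAAG-1", 2),
    ("chr1", 10_000, 10_120, "AAAC-1", 1),
]

# Hypothetical per-cell annotations keyed by barcode.
cell_data = {
    "AAAC-1": {"cell_type": "T cell", "sample": "s1"},
    "AAAG-1": {"cell_type": "B cell", "sample": "s1"},
}

def write_fragments(path, records):
    """Write fragments as a gzipped, tab-separated file sorted by coordinate."""
    with gzip.open(path, "wt", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        for rec in sorted(records, key=lambda r: (r[0], r[1], r[2])):
            writer.writerow(rec)

def write_cell_data(path, table):
    """Write one row per barcode with a header line naming the annotation columns."""
    columns = sorted({key for row in table.values() for key in row})
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["barcode", *columns])
        for barcode, row in table.items():
            writer.writerow([barcode, *(row.get(col, "") for col in columns)])

out_dir = tempfile.mkdtemp()
frag_path = os.path.join(out_dir, "fragments.tsv.gz")
meta_path = os.path.join(out_dir, "cell_data.tsv")
write_fragments(frag_path, fragments)
write_cell_data(meta_path, cell_data)
```

If the installed SnapATAC2 version ships its own fragment export helper, that would likely be a more direct route than hand-writing files; check the API documentation for your version.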