vanheeringen-lab / ANANSE

Prediction of key transcription factors in cell fate determination using enhancer networks. See full ANANSE documentation for detailed installation instructions and usage examples.
http://anansepy.readthedocs.io
MIT License

Usage of ANANSE in single-cell multiomic data #199

Open PauBadiaM opened 1 year ago

PauBadiaM commented 1 year ago

Hi developers,

Nice method and code/documentation! I was wondering how feasible it is to apply ANANSE to single-cell multiomics data (RNA+ATAC):

Thank you for your time!

simonvh commented 1 year ago

It is possible, but not yet completely out-of-the-box. @JGASmits, @Arts-of-coding and/or @siebrenf may be able to help out?

Arts-of-coding commented 1 year ago

Hi @PauBadiaM, we are currently adding support for single-cell (multiomics) data to ANANSE. For Python this is already available: https://github.com/Arts-of-coding/AnanseScanpy. I have a vignette that shows how to go from two separate scanpy objects, one containing expression data from scRNA-seq and one containing a cell-by-peak matrix from scATAC-seq, to input files for ANANSNAKE (https://github.com/vanheeringen-lab/anansnake). ANANSNAKE is an automated pipeline that runs ANANSE, for instance on the output files of AnanseScanpy or AnanseSeurat (https://github.com/JGASmits/AnanseSeurat).

If you want to use only Python to generate the cell-by-peak matrix from scATAC-seq data (required for AnanseScanpy), I recommend the "pp.make_peak_matrix" function from snapatac2 (https://pypi.org/project/snapatac2/).
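For readers unfamiliar with the term, a cell-by-peak matrix counts, for each cell barcode, how many ATAC fragments overlap each peak. The snippet below is a minimal pure-Python sketch of that idea only. It is not how snapatac2 implements "pp.make_peak_matrix", and the function name `make_cell_by_peak`, the tuple layouts, and the toy coordinates are all hypothetical. It assumes half-open integer coordinates and non-overlapping peaks.

```python
from bisect import bisect_right
from collections import defaultdict


def make_cell_by_peak(fragments, peaks):
    """Count fragments overlapping each peak, per cell barcode.

    fragments: iterable of (chrom, start, end, barcode), half-open coordinates
    peaks:     list of (chrom, start, end), assumed non-overlapping
    Returns (barcodes, matrix) with matrix[i][j] = count for barcode i, peak j.
    """
    # Index peaks per chromosome, sorted by start, so we can binary-search.
    by_chrom = defaultdict(list)
    for j, (chrom, start, end) in enumerate(peaks):
        by_chrom[chrom].append((start, end, j))
    for chrom in by_chrom:
        by_chrom[chrom].sort()
    starts = {c: [p[0] for p in ps] for c, ps in by_chrom.items()}

    counts = defaultdict(lambda: [0] * len(peaks))
    for chrom, f_start, f_end, barcode in fragments:
        row = counts[barcode]  # keep barcodes even if they hit no peak
        chrom_peaks = by_chrom.get(chrom, [])
        # Candidate peaks are those starting before the fragment ends.
        k = bisect_right(starts.get(chrom, []), f_end - 1) - 1
        # Walk left while the peak still ends after the fragment starts.
        while k >= 0 and chrom_peaks[k][1] > f_start:
            row[chrom_peaks[k][2]] += 1
            k -= 1

    barcodes = sorted(counts)
    return barcodes, [counts[b] for b in barcodes]


# Toy data (hypothetical): three peaks, two cells.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
fragments = [
    ("chr1", 150, 250, "AAAC"),  # overlaps peak 0
    ("chr1", 190, 510, "AAAC"),  # spans peaks 0 and 1
    ("chr2", 0, 50, "AAAC"),     # ends where peak 2 starts: no overlap
    ("chr2", 100, 120, "TTTG"),  # overlaps peak 2
    ("chr3", 1, 10, "TTTG"),     # chromosome without peaks
]
barcodes, matrix = make_cell_by_peak(fragments, peaks)
print(barcodes)  # ['AAAC', 'TTTG']
print(matrix)    # [[2, 1, 0], [0, 0, 1]]
```

In practice the result would be stored as a sparse matrix in an AnnData object rather than as nested lists. The lists are only there to keep the sketch dependency-free.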

If you are in no rush, an extended manual about this will be available soon. It will be announced on the AnanseScanpy and AnanseSeurat pages once it is out.

If "MuData" replaces the currently widely used "anndata" for single-cell objects in Python, support for it is likely to be implemented at a later stage.

I hope this answers your question!

PauBadiaM commented 1 year ago

Hi @simonvh and @Arts-of-coding ,

Thanks for the replies! It is really nice that it can be used with AnnData. I proposed MuData because it's the extension of AnnData to multiple omics in the same object (with nice behaviors like propagating filtering changes across omics), but in the end it is a dictionary of AnnData objects, which could be passed to AnanseScanpy without a problem. Just a minor comment: why is this being implemented as a separate package? Shouldn't these be utility functions of ANANSE? In any case, it would be good if, once this is finished, you pointed users to the AnanseScanpy or AnanseSeurat vignettes in the main ANANSE documentation.
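To illustrate the "dictionary of AnnData objects" point: MuData exposes its per-modality objects through the `.mod` mapping, so pulling out the two objects a two-AnnData workflow expects is a small step. The sketch below is only an illustration. The helper name `split_modalities` and the key names "rna" and "atac" are assumptions, and it uses simple stand-in objects so that it runs without mudata or anndata installed.

```python
from types import SimpleNamespace


def split_modalities(container, rna_key="rna", atac_key="atac"):
    """Return (rna, atac) from a MuData-like object or a plain dict.

    A MuData-like object is anything with a `.mod` mapping of modality
    name -> AnnData. The key names are assumptions and depend on how the
    object was built.
    """
    # Fall back to the container itself when it is already a plain mapping.
    mods = getattr(container, "mod", container)
    missing = [k for k in (rna_key, atac_key) if k not in mods]
    if missing:
        raise KeyError(f"missing modalities: {missing}")
    return mods[rna_key], mods[atac_key]


# Stand-ins for AnnData/MuData objects (hypothetical, for illustration only).
rna = SimpleNamespace(name="expression")
atac = SimpleNamespace(name="cell-by-peak")
mdata_like = SimpleNamespace(mod={"rna": rna, "atac": atac})

rna_out, atac_out = split_modalities(mdata_like)
print(rna_out.name, atac_out.name)
```

With a real MuData object the same lookup would go through `mdata.mod["rna"]` and `mdata.mod["atac"]`, provided those are the keys the object was created with.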

I'm in no rush for now, I'll wait ;)