cistrome / MIRA

Python package for analysis of multiomic single cell RNA-seq and ATAC-seq.

suggested upstream processing before mira #20

Open KailiBio opened 1 year ago

KailiBio commented 1 year ago

Hi,

I want to try out MIRA on some new datasets. While expression and accessibility matrices are mentioned as the input, do you have any suggested tools, pipelines, or workflows for generating them? I am trying to convert a Seurat object to an h5ad file to run MIRA, but apparently the format does not quite match. Could you give some suggestions on that?

Thanks, Kaili

AllenWLynch commented 1 year ago

Hi Kaili,

I have usually worked with the output from CellRanger, which produces 10X mtx files that can then be converted to h5ad. From there, the only preprocessing that is needed is cell QC and filtering, which is pretty easy to do with interactive tools like Seurat and scanpy.

I have not previously tried converting between Seurat and h5ad. Is there an option to write Seurat objects in the 10X file format?

AL

KailiBio commented 1 year ago

Thanks Allen!

I have figured that out. When converting the processed Seurat object to h5ad, the conversion automatically chooses the normalized matrix instead of the raw count matrix, which causes the error.
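In case it helps anyone hitting the same error, a quick sanity check on the converted matrix could look like the sketch below. The helper name and the sampling limit are made up for illustration. It only tests whether the stored values are non-negative integers, which is a heuristic for "raw counts" and not a guarantee.

```python
import numpy as np


def looks_like_raw_counts(X, sample=100_000):
    """Heuristic: True if the inspected values are non-negative integers.

    X may be a dense array or a scipy sparse matrix (e.g. adata.X).
    Only the first `sample` stored values are inspected to keep this cheap.
    """
    # scipy sparse matrices expose .nnz and keep stored values in .data
    values = X.data if hasattr(X, "nnz") else np.asarray(X).ravel()
    values = np.asarray(values[:sample], dtype=float)
    return bool(np.all(values >= 0) and np.all(values == np.round(values)))
```

With an AnnData object loaded from the h5ad file, calling `looks_like_raw_counts(adata.X)` before running MIRA would flag a matrix that was normalized during conversion.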

BTW, do you know how much n_work and GPU time it took for you to tune and train the topic model? It took me more than 2 days just to tune the parameters. I wonder what the typical run time is for using MIRA to process a 10X multi-omic/SHARE-seq dataset from beginning to end.

Thanks, Kaili