labomics / midas

MIT License
41 stars 5 forks

How to apply MIDAS to 10x format data? #1

Closed yinleHu closed 2 months ago

yinleHu commented 8 months ago

Dear Prof. Ying,

Recently, I had the pleasure of reading your paper 'Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS'. The novelty and effectiveness of this work deeply impressed me, and I have started trying to use MIDAS to process my own mosaic data. However, after reading your tutorial, I still don't understand how to apply MIDAS to my data. My data are in common file formats: the RNA and ATAC data are each stored as the three standard 10x files (barcodes.tsv, features.tsv, and matrix.mtx), and the protein abundance data are in a CSV file. Could you provide a tutorial that matches these file formats, or perhaps one that is compatible with Seurat or Scanpy?

wangjing-bio commented 7 months ago

The workflow is the same regardless of the format of your ATAC, RNA, or ADT data. First, define the task you want to train in the data.toml file. Then perform quality control on the data involved in that task and save the result in .h5seurat format. Next, run the 'preprocess/combine_subsets.R' and 'preprocess/split_mat.py' scripts to convert the data into the format required for training. Once the data are in that format, use 'run.py' for training. For more details, please refer to our documentation (https://sc-midas-docs.readthedocs.io/en/latest/).
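
Before the quality-control step, it can help to confirm that each 10x directory is internally consistent. The following is a minimal sketch and is not part of MIDAS: the helper names (`check_10x_dir` and the functions it calls) are hypothetical, and it assumes the usual 10x layout in which the rows of matrix.mtx are features and the columns are cell barcodes. It uses only the Python standard library and also accepts gzip-compressed files.

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def _find(directory, name):
    """Return directory/name or directory/name.gz, whichever exists."""
    for candidate in (Path(directory) / name, Path(directory) / (name + ".gz")):
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"{name}(.gz) not found in {directory}")


def count_lines(path):
    """Count the non-empty lines of a text file."""
    with _open_text(path) as handle:
        return sum(1 for line in handle if line.strip())


def read_mtx_shape(path):
    """Return (n_rows, n_cols, n_entries) from a MatrixMarket size line."""
    with _open_text(path) as handle:
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip the banner and comment lines
            rows, cols, entries = (int(x) for x in line.split()[:3])
            return rows, cols, entries
    raise ValueError(f"no size line found in {path}")


def check_10x_dir(directory):
    """Check that the three 10x files agree on the matrix dimensions."""
    n_features = count_lines(_find(directory, "features.tsv"))
    n_barcodes = count_lines(_find(directory, "barcodes.tsv"))
    rows, cols, entries = read_mtx_shape(_find(directory, "matrix.mtx"))
    # Assumed 10x layout: rows are features, columns are cell barcodes.
    if (rows, cols) != (n_features, n_barcodes):
        raise ValueError(
            f"matrix.mtx is {rows} x {cols}, but features.tsv has "
            f"{n_features} lines and barcodes.tsv has {n_barcodes} lines"
        )
    return {"features": rows, "cells": cols, "nonzero": entries}


# Example call (the directory name is a placeholder):
# summary = check_10x_dir("my_sample/rna")
```

A directory that passes this check can then be loaded in Seurat for quality control and saved as .h5seurat, as described above. The check only compares dimensions; it does not validate the matrix entries themselves.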