YangLabHKUST / STitch3D

Construction of a 3D whole organism spatial atlas by joint modeling of multiple slices
https://stitch3d-tutorial.readthedocs.io/en/latest/index.html#
MIT License
52 stars, 2 forks

Human heart dataset comparison with other methods #19

Open cristalliao opened 1 year ago

cristalliao commented 1 year ago

Dear Professor, I am very impressed with STitch3D. I would like to know how to evaluate the model against other methods. I think the data processing differs between the methods compared in figure panels j and k. For example, to compare the cell-type proportion results with the CARD method, I need to load the data in R and run the analysis with the CARD package, but I am not sure how to do this. I am not familiar with the model evaluation part. Could you share some code examples? Thanks a lot!

[Image: Screen Shot 2023-08-09 at 10 29 17 pm]

gefeiwang commented 1 year ago

Hi Cristal,

To evaluate methods in R, we need to first create file types that can be read in R. There are several ways to do this.

One way is to first convert the h5ad files created in Python to h5seurat files, and then read the resulting object in R using the SeuratDisk package:

library(Seurat)
library(SeuratDisk)

h5_data_path <- "./data"

Convert(paste0(h5_data_path,"/adata_ref.h5ad"), dest = "h5seurat", overwrite = FALSE)
obj_ref <- LoadH5Seurat(paste0(h5_data_path,"/adata_ref.h5seurat"))

Alternatively, you can use the anndata package in R to read h5ad files directly, for example adata_ref <- anndata::read_h5ad('adata_ref.h5ad'). You can also create Seurat objects directly by reading the raw count matrices and the metadata file.
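
For the last option, the count matrix and metadata first need to be written out from Python in a format R can read. The following is a minimal sketch, assuming plain CSV files are acceptable. The function name export_for_r, the file names counts.csv and meta.csv, and the toy data are hypothetical and not part of STitch3D. For an AnnData object, adata.to_df() and adata.obs would supply the two tables.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_for_r(counts: pd.DataFrame, meta: pd.DataFrame, out_dir):
    """Write a barcodes-by-genes count table and its metadata as CSV files.

    Hypothetical helper for illustration; file names are arbitrary.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Reorder the metadata so its rows match the barcodes of the count table.
    meta = meta.loc[counts.index]

    counts_path = out_dir / "counts.csv"
    meta_path = out_dir / "meta.csv"
    counts.to_csv(counts_path)  # first column holds the barcodes
    meta.to_csv(meta_path)
    return counts_path, meta_path


# Toy data standing in for adata.to_df() and adata.obs.
counts = pd.DataFrame(
    [[3, 0, 1], [0, 2, 5]],
    index=["AAAC-1", "AAAG-1"],
    columns=["GeneA", "GeneB", "GeneC"],
)
meta = pd.DataFrame(
    {"cell_type": ["Fibroblast", "Cardiomyocyte"]},
    index=["AAAG-1", "AAAC-1"],  # deliberately in a different order
)

with tempfile.TemporaryDirectory() as tmp_dir:
    counts_path, meta_path = export_for_r(counts, meta, tmp_dir)
    # Read the files back to confirm the round trip.
    counts_back = pd.read_csv(counts_path, index_col=0)
    meta_back = pd.read_csv(meta_path, index_col=0)
```

In R, the files could then be read with read.csv(..., row.names = 1). Seurat expects genes as rows, so the count table would likely need to be transposed before creating the object.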

After loading the data, you can follow tutorials of other packages to perform the analysis.

Best, Gefei

cristalliao commented 1 year ago

Dear Professor Gefei,

Thank you so much, Gefei. I also have a problem with generating panels j and k. Do you know how to plot the cell-type proportion graphs in R or Python? I would really like to know how to make this plot.

Also, I have run into problems reading the reference dataset, since I found that it contains many NA values. How should I handle these NAs in the reference?

[Image: Screen Shot 2023-08-10 at 1 34 46 pm]

Moreover, I am still confused about the "adata_st_list_raw" object. Does it hold the raw dataset for each of the slices 0-8?

[Image: Screen Shot 2023-08-10 at 1 49 44 pm]

Thanks in advance!

Best regards, Cristal

gefeiwang commented 1 year ago

Hi Cristal,

For the cell-type proportion graphs you mentioned, we used pie plots for visualization. You can use matplotlib.axes.Axes.pie in Python, or an equivalent function in R, to plot them.
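
As an illustration of the Python route, here is a minimal sketch that draws one small pie per spot at its spatial coordinates. This is an assumption about the layout, not the exact code behind the paper's figures. The function name plot_proportion_pies, the toy coordinates, and the toy proportions are all hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch


def plot_proportion_pies(coords, props, labels, radius=0.4):
    """Draw one pie per spot, centered at the spot's (x, y) position.

    Hypothetical helper: coords is (n_spots, 2), props is
    (n_spots, n_types). Uses the 10-color "tab10" palette, so it
    assumes at most 10 cell types.
    """
    coords = np.asarray(coords, dtype=float)
    props = np.asarray(props, dtype=float)
    # Make every row sum to 1 so each pie is a full circle.
    props = props / props.sum(axis=1, keepdims=True)
    colors = plt.get_cmap("tab10").colors[: props.shape[1]]

    fig, ax = plt.subplots(figsize=(5, 5))
    for (x, y), p in zip(coords, props):
        # frame=True stops each pie call from resetting the axes limits.
        ax.pie(p, colors=colors, center=(x, y), radius=radius, frame=True)

    # Fit the view to all spots and keep the pies circular.
    ax.set_xlim(coords[:, 0].min() - 1, coords[:, 0].max() + 1)
    ax.set_ylim(coords[:, 1].min() - 1, coords[:, 1].max() + 1)
    ax.set_aspect("equal")

    handles = [Patch(color=c, label=l) for c, l in zip(colors, labels)]
    ax.legend(handles=handles, loc="upper left", bbox_to_anchor=(1.0, 1.0))
    return fig, ax


# Toy example: three spots, two cell types.
coords = [[0, 0], [1, 0], [0, 1]]
props = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
fig, ax = plot_proportion_pies(coords, props, ["Cardiomyocyte", "Fibroblast"])
# fig.savefig("proportions.png", bbox_inches="tight")
```

With real data, coords would come from the spot coordinates of one slice and props from the estimated cell-type proportions of whichever method is being compared.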

The reference metadata does indeed contain some NAs. We only used the rows whose indices are barcodes that can be found in the count matrix. This step is included in the tutorial code:

for col in meta_ref.columns[:-1]:
    adata_ref.obs[col] = meta_ref.loc[count_ref.index][col].values
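
To see why this selection removes the NAs, consider the following minimal sketch on toy tables. The barcodes, column names, and values are made up, and the intersection step is a defensive variant rather than what the tutorial does.

```python
import numpy as np
import pandas as pd

# Toy count matrix: rows are cell barcodes, columns are genes.
count_ref = pd.DataFrame(
    [[5, 0], [1, 3]],
    index=["BC1", "BC2"],
    columns=["GeneA", "GeneB"],
)

# Toy metadata with an extra row that is not a real barcode and holds NAs.
meta_ref = pd.DataFrame(
    {
        "cell_type": ["Fibroblast", np.nan, "Cardiomyocyte"],
        "donor": ["D1", np.nan, "D2"],
    },
    index=["BC1", "not_a_barcode", "BC2"],
)

# Keep only barcodes present in both tables, in count-matrix order.
# (The tutorial indexes with count_ref.index directly, which assumes
# every barcode of the count matrix appears in the metadata.)
shared = count_ref.index.intersection(meta_ref.index)
meta_aligned = meta_ref.loc[shared]

n_na_before = int(meta_ref.isna().sum().sum())
n_na_after = int(meta_aligned.isna().sum().sum())
print(f"NA values before: {n_na_before}, after: {n_na_after}")
```

If NAs remain after this alignment in real data, they belong to actual cells, and how to handle them (for example, dropping those cells) is a separate decision.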

For the last question, yes, it is a list of raw anndata objects for each slice.

Best, Gefei

cristalliao commented 1 year ago

Dear Professor Gefei,

Got it! Thanks for your explanation and assistance! Much appreciated!

Best regards, Cristal