YMa-lab / CARD

interpret this result of "CT_*" label using CARDfree method #9

Closed hzongyao closed 2 years ago

hzongyao commented 2 years ago

I am using the CARDfree method for my analysis. Is there any correspondence between the "CT_*" labels in the analysis results and the cell types initially provided as input? Any suggestions on how to interpret these labels?

YingMa0107 commented 2 years ago

Hi @hzongyao,

Thank you for your interest in our package!

To answer your question: there is no direct correspondence between the "CT_*" labels in the analysis results and the cell types of the marker genes you provided, because all the provided marker genes are analyzed jointly and there is no appropriate way to differentiate them. For interpretation, one exploratory analysis we have done to link the cell types of the marker genes to the CT labels is the following (see also Supplementary Figure 83 in the paper):

For each "CT" label, e.g. CT14 in Supplementary Figure 83, we first divide the spots into CT14-enriched and CT14-non-enriched locations. The CT14-enriched locations are defined as the spatial locations whose estimated CT14 proportion is at least 50% (you could also use the median proportion, or a strict > 50%, as the cutoff). Then, for each set of cell-type-specific marker genes, we calculate the mean expression of that marker set in the CT14-enriched locations. The cell type whose marker set has the highest mean expression can be taken as the annotation for the CT14 label. We then repeat this procedure for each "CT" label.
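As a rough illustration of the procedure described above, here is a minimal sketch in plain Python. It is not code from the CARD package: the function name `annotate_ct_labels`, the dictionary-based input layout, and the toy spots, genes, and cell types are all assumptions made for this example. In practice the proportions and expression values would come from your own CARDfree output and expression matrix.

```python
from statistics import mean


def annotate_ct_labels(proportions, expression, marker_sets, threshold=0.5):
    """Assign each CT label the cell type whose markers are most expressed.

    proportions: {spot: {ct_label: estimated proportion}}   (hypothetical layout)
    expression:  {spot: {gene: expression value}}           (hypothetical layout)
    marker_sets: {cell_type: [marker genes]}
    threshold:   minimum proportion for a spot to count as enriched
    """
    ct_labels = sorted({ct for props in proportions.values() for ct in props})
    annotation = {}
    for ct in ct_labels:
        # Spots where this CT label makes up at least `threshold` of the mix.
        enriched = [s for s, props in proportions.items()
                    if props.get(ct, 0.0) >= threshold]
        scores = {}
        for cell_type, genes in marker_sets.items():
            # Pool the marker genes of this cell type over the enriched spots.
            values = [expression[s][g] for s in enriched
                      for g in genes if g in expression[s]]
            if values:
                scores[cell_type] = mean(values)
        # No enriched spots (or no measured markers) means no annotation.
        annotation[ct] = max(scores, key=scores.get) if scores else None
    return annotation


# Toy data, invented for illustration only.
proportions = {
    "s1": {"CT1": 0.8, "CT2": 0.2},
    "s2": {"CT1": 0.6, "CT2": 0.4},
    "s3": {"CT1": 0.1, "CT2": 0.9},
}
expression = {
    "s1": {"GeneA": 5.0, "GeneB": 1.0},
    "s2": {"GeneA": 4.0, "GeneB": 0.5},
    "s3": {"GeneA": 0.5, "GeneB": 6.0},
}
marker_sets = {"neuron": ["GeneA"], "astrocyte": ["GeneB"]}

print(annotate_ct_labels(proportions, expression, marker_sets))
```

Changing `threshold` (for example to the median proportion of a CT label across spots) changes which spots count as enriched, so it is worth checking how stable the resulting annotation is.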

Another option is to perform a DE analysis between the CT14-enriched and CT14-non-enriched spatial locations, identify the DE genes, and then look for biological insights in those genes to annotate the CT labels.
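To make the enriched versus non-enriched comparison concrete, here is a small sketch that ranks genes by log2 fold change between the two groups of spots. This is only a simple stand-in for a proper DE analysis, which would normally use a statistical test from a dedicated tool. The function name `rank_genes_by_fold_change`, the input layout, the pseudocount, and the toy data are assumptions for this example and are not part of CARD.

```python
from math import log2
from statistics import mean


def rank_genes_by_fold_change(proportions, expression, ct_label,
                              threshold=0.5, pseudocount=1.0):
    """Rank genes by log2 fold change, enriched vs. non-enriched spots.

    proportions: {spot: {ct_label: estimated proportion}}   (hypothetical layout)
    expression:  {spot: {gene: expression value}}           (hypothetical layout)
    """
    enriched = [s for s in expression
                if proportions.get(s, {}).get(ct_label, 0.0) >= threshold]
    enriched_set = set(enriched)
    others = [s for s in expression if s not in enriched_set]
    if not enriched or not others:
        raise ValueError("need at least one spot in each group")

    genes = sorted({g for values in expression.values() for g in values})
    ranked = []
    for gene in genes:
        # Genes missing from a spot are treated as zero expression.
        mean_in = mean(expression[s].get(gene, 0.0) for s in enriched)
        mean_out = mean(expression[s].get(gene, 0.0) for s in others)
        # The pseudocount avoids division by zero and log of zero.
        lfc = log2((mean_in + pseudocount) / (mean_out + pseudocount))
        ranked.append((gene, lfc))
    # Most up-regulated in the enriched spots first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy data, invented for illustration only.
proportions = {
    "s1": {"CT1": 0.8, "CT2": 0.2},
    "s2": {"CT1": 0.6, "CT2": 0.4},
    "s3": {"CT1": 0.1, "CT2": 0.9},
}
expression = {
    "s1": {"GeneA": 5.0, "GeneB": 1.0},
    "s2": {"GeneA": 4.0, "GeneB": 0.5},
    "s3": {"GeneA": 0.5, "GeneB": 6.0},
}

for gene, lfc in rank_genes_by_fold_change(proportions, expression, "CT1"):
    print(gene, round(lfc, 2))
```

Genes at the top of the ranking are candidates for characterizing the CT label, but a fold change alone says nothing about statistical significance.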

However, as we also mention in the discussion, it is not easy to tell accurately what a "CT_*" label represents through such a post-processing step, especially when markers for different cell types are all enriched in locations with a high proportion of the same "CT"-labeled cells. It may also depend on the tissue and on the underlying distribution of cell type proportions in the dataset. Hope this helps!