feiyoung / PRECAST

An efficient data integration method for multiple spatial transcriptomics datasets with non-cluster-relevant effects such as complex batch effects.
GNU General Public License v3.0

A question about Domains #6

spatialbiology1 closed 1 year ago

spatialbiology1 commented 1 year ago

Many thanks for providing such a great method for spatial transcriptomics. As I read the manuscript, and more specifically the section on HCC, I felt that the domains across these four samples provided valuable insights into tumor biology. My question is about the deconvolution and domain results. What I infer from the results is that the aggregate cellular composition of the Visium spots represents the cellular composition of each domain. Specifically, will each and every Visium spot in a given domain have an identical cellular composition (the domain being the larger unit and the Visium spots its building blocks)? For instance, if the deconvolution results show that Domain X is composed of 50% tumor, 20% CAF, 10% CAFs, 10% B cells, and 10% T cells, does this mean that every Visium spot in Domain X has the same cellular composition, i.e., 50% tumor, 20% CAF, 10% CAFs, 10% B cells, and 10% T cells?

Thanks a lot, Zinn

feiyoung commented 1 year ago

Thank you for your attention to our work. The cellular composition of all Visium spots in each domain provides an aggregate representation of the cellular composition at the population level. However, at the individual sample level, each Visium spot within a given domain may have a different cellular composition due to the random sampling process. For example, Figure 5d (middle panel) shows the proportions of immune cells in each spot on a spatial heatmap, where spots within the same domain (e.g., domains 6 and 7) exhibit different proportions of immune cells. Even if we assume that the cellular compositions of spots within the same domain follow the same distribution, we can only infer that Visium spots within the same domain have the same cellular composition at the population level, not at the individual sample level.

I don't understand what you mean by "particular domain is larger unit and these visium spots are the building block of that domain". Could you explain this in more detail?

spatialbiology1 commented 1 year ago

Thanks for the detailed explanation. I am starting to understand the logic better. I am sorry if I was not clear enough; I think we may be on the same page here. Let's take domain 4 as an example: this domain is shared across all 4 HCC samples, so does every Visium spot across the population (in this case the 4 HCC samples) have the same cellular composition and proportions? I understand that each domain may be over- or underrepresented in a given sample (for example, domain 1 is mainly present in HCC sample 1, Figure 5E), but at the spot level will the cellular composition be the same? More specifically, if we look at each and every Visium spot of domain 4 across all samples (in this case the 4 HCC samples), will they all have the same composition of tumor, immune, and stromal cells? Thanks again for your explanation.

feiyoung commented 1 year ago

In my opinion, for domain 4, every Visium spot does not necessarily have the same cellular composition and proportions. Because the spots in domain 4 are heterogeneous, the cellular composition of an individual Visium spot in this domain is not equal to the aggregated cellular composition of the domain.
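To make the distinction between the domain-level aggregate and individual spots concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the cell types, the domain-level mean proportions, and the Dirichlet sampling of per-spot compositions are all hypothetical choices for this example. They are not taken from the paper and are not how PRECAST or any deconvolution tool models the data.

```python
import random

random.seed(0)

# Hypothetical cell types and domain-level mean composition (illustrative only).
CELL_TYPES = ["tumor", "CAF", "B cell", "T cell"]
DOMAIN_MEAN = [0.5, 0.3, 0.1, 0.1]


def sample_spot(mean, concentration=20.0):
    """Draw one spot's composition from a Dirichlet distribution centred on `mean`.

    Assumption: spots in a domain share one distribution. A Dirichlet draw is
    obtained by normalising independent Gamma draws.
    """
    draws = [random.gammavariate(m * concentration, 1.0) for m in mean]
    total = sum(draws)
    return [d / total for d in draws]


# Simulate the spots belonging to one domain.
spots = [sample_spot(DOMAIN_MEAN) for _ in range(2000)]

# Aggregate (population-level) composition: the average over all spots.
aggregate = [
    sum(spot[i] for spot in spots) / len(spots) for i in range(len(CELL_TYPES))
]

# Spread of the tumor proportion across individual spots.
tumor_fractions = [spot[0] for spot in spots]
tumor_min, tumor_max = min(tumor_fractions), max(tumor_fractions)

for name, value in zip(CELL_TYPES, aggregate):
    print(f"aggregate {name}: {value:.3f}")
print(f"tumor fraction per spot ranges from {tumor_min:.3f} to {tumor_max:.3f}")
```

In this simulation the aggregate stays close to the domain-level mean, while the tumor fraction of individual spots varies widely around it. That is the sense in which spots in one domain share a composition at the population level but not at the individual spot level.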

spatialbiology1 commented 1 year ago

Thank you so much for the clarification! I plan to carry out an analysis of primary colon and liver metastasis samples (N=25). Have you worked with primary and metastatic samples using PRECAST? Any guidance in this regard would be very much appreciated. Thanks, Zinn

feiyoung commented 1 year ago

I'm sorry, but I have not applied PRECAST to primary and metastatic samples. You could try it. There are a few points worth considering. First, there are many data batches, so it might be beneficial to extract more information in the embedding space. This can be achieved by increasing the number of factors (q), which is set to 15 by default. Second, you can experiment with different initialization settings via int.model. In some cases, setting int.model=NULL may produce better results. For DLPFC Visium data, significant improvements have been observed by setting int.model='EEE'.

Hope this is helpful for you!