Question about the calculation of the prior distribution of z

azizilab / starfysh

Spatial Transcriptomic Analysis using Reference-Free auxiliarY deep generative modeling and Shared Histology

BSD 3-Clause "New" or "Revised" License


Question about the calculation of the prior distribution of z #56

Open Enderlogic opened 1 week ago

Enderlogic commented 1 week ago

Hi,

Thank you for maintaining this great package. I noticed that in your code, the mean and variance of the prior distribution of z, i.e., p(z|c,u), are computed from the posterior of c and u rather than from their prior. Is this a bug or a feature of your model? If it is the latter, could you please clarify the motivation? Thank you.

YinuoJin commented 1 week ago

Hi @Enderlogic

Thanks for reaching out. To answer your question at a high level: yes, we need Monte Carlo samples from q(u) and q(c). The main reason is that our model is a hierarchical VAE, so the prior p(z|c,u) has to be evaluated at posterior samples of u and c to maintain the dependency structure between {u, c} and {z}. Please refer to the ELBO equation on page 15 of our paper (you'll notice the difference between the KL divergence terms).

Similar examples can be found in the ladder VAE literature: e.g. equation (1) here and equation (6) here.
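For readers following along, here is a minimal NumPy sketch of the idea, not the Starfysh implementation. It estimates E_{q(u)}[KL(q(z|x) || p(z|c,u))] by drawing u from its approximate posterior and plugging each draw into the prior of z. The function names, the shapes, and the choice of a proportion-weighted prior mean `c @ u` are assumptions made only for illustration.

```python
import numpy as np

def kl_diag_gaussians(mu_q, sd_q, mu_p, sd_p):
    """KL( N(mu_q, sd_q^2) || N(mu_p, sd_p^2) ), summed over the last axis."""
    var_ratio = (sd_q / sd_p) ** 2
    return 0.5 * np.sum(
        var_ratio + ((mu_q - mu_p) / sd_p) ** 2 - 1.0 - np.log(var_ratio),
        axis=-1,
    )

def mc_kl_z(mu_qz, sd_qz, mu_qu, sd_qu, c_sample, sd_pz, n_samples=64, rng=None):
    """Monte Carlo estimate of E_{q(u)}[ KL(q(z|x) || p(z|c,u)) ].

    mu_qu, sd_qu : (K, D) posterior mean / std of the K cell-state anchors u
    c_sample     : (K,)   one posterior sample of the proportions c
    sd_pz        : (D,)   std of the prior of z (a learned parameter here)
    """
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(n_samples):
        # Reparameterized draw u ~ q(u|x), NOT a draw from the prior p(u).
        u = mu_qu + sd_qu * rng.standard_normal(mu_qu.shape)
        # Assumed (hypothetical) prior mean of z: proportion-weighted mix of u.
        mu_pz = c_sample @ u
        total += kl_diag_gaussians(mu_qz, sd_qz, mu_pz, sd_pz)
    return total / n_samples
```

Because the KL term for z sits inside an expectation over q(u) and q(c), sampling u and c from their priors instead would optimize a different objective from the hierarchical ELBO.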

Enderlogic commented 1 week ago

Thank you for your explanation and the relevant papers. If I understand correctly, both u and σ in the prior of z are some sort of unknown "hyperparameters" that need to be learned and updated during training. If so, why do you only assign u a prior but not σ?

YinuoJin commented 4 days ago

Hi, thanks for the follow-up. Following the plate model of Starfysh, u and $\sigma$ both represent cell-state information: u represents the low-dimensional cell-state mechanics, and $\sigma$ allows for cell-state-specific heterogeneity.

Typically, u is more important than $\sigma$ for differentiating one cell type/state from another. We therefore designed u as a random variable and $\sigma$ as a global parameter that is learned during optimization. Neither is a hyperparameter, which would be a fixed quantity in the model. Of course, you can always design additional priors and "hyper-priors" for random variables, but in Starfysh we opted for a balance between setting priors and learning from the data. Hope it helps!
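The distinction above can be illustrated with a small NumPy sketch, again not taken from the Starfysh code base. A random variable such as u has a variational posterior and contributes a KL term against its prior to the objective, while a global parameter such as $\sigma$ is a point estimate with no KL term. The standard-normal prior on u, the softplus parameterization, and all names (`params`, `raw_sd_u`, `raw_sigma`) are assumptions for illustration.

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x)); maps raw values to positive ones."""
    return np.logaddexp(0.0, x)

def kl_to_standard_normal(mu, sd):
    """KL( N(mu, sd^2) || N(0, 1) ), summed over all entries."""
    return 0.5 * np.sum(sd ** 2 + mu ** 2 - 1.0 - 2.0 * np.log(sd))

def regularizers(params):
    """Return (KL term for u, point estimate of sigma).

    u     : random variable -> posterior q(u) = N(mu_u, sd_u^2) with an
            assumed N(0, 1) prior, so it adds a KL penalty to the loss.
    sigma : global learnable parameter -> optimized directly, no prior,
            hence no KL penalty.
    """
    sd_u = softplus(params["raw_sd_u"])
    kl_u = kl_to_standard_normal(params["mu_u"], sd_u)
    sigma = softplus(params["raw_sigma"])
    return kl_u, sigma
```

Giving $\sigma$ its own prior would amount to adding a second KL term of the same form, which is the "additional priors" option mentioned in the reply.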