theislab / chemCPA

Code for "Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution", NeurIPS 2022.
https://arxiv.org/abs/2204.13545
MIT License

the uncertainty equation #154

Closed bhomass closed 5 months ago

bhomass commented 7 months ago

I agree with the entropy component, which says uncertainty is high if the nearest neighbors belong to different pathways.

I disagree with the distance component. A low distance should increase uncertainty if the neighbors belong to a different pathway, but should lower it if they belong to the same pathway. The inverse log distance term, however, increases the overall uncertainty score regardless of which kind of neighbor is nearby.
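
To make the objection concrete, here is a minimal sketch. It is not the repository's code: it assumes the score is the sum of a Shannon entropy over the neighbors' pathway labels and a term of the form 1 / log(1 + mean distance). The function names and the exact form of the distance term are assumptions for illustration only; the paper's equation may differ.

```python
import math
from collections import Counter


def neighbor_entropy(pathways):
    """Shannon entropy (in nats) of the pathway labels among the nearest neighbors."""
    counts = Counter(pathways)
    n = len(pathways)
    return sum(-(c / n) * math.log(c / n) for c in counts.values())


def inverse_log_distance(distances):
    """Hypothetical distance term: 1 / log(1 + mean distance).

    Assumes all distances are positive. The term grows as neighbors get closer.
    """
    mean_d = sum(distances) / len(distances)
    return 1.0 / math.log1p(mean_d)


def uncertainty(pathways, distances):
    """Assumed combined score: entropy term plus inverse log distance term."""
    return neighbor_entropy(pathways) + inverse_log_distance(distances)


# Neighbors all from the same pathway: the entropy term is zero,
# yet moving them closer still raises the score.
same_close = uncertainty(["A", "A", "A"], [0.1, 0.1, 0.1])
same_far = uncertainty(["A", "A", "A"], [2.0, 2.0, 2.0])

# Neighbors from three different pathways at the same close distance.
mixed_close = uncertainty(["A", "B", "C"], [0.1, 0.1, 0.1])

print(same_close > same_far)     # distance term acts regardless of labels
print(mixed_close > same_close)  # entropy adds on top of the distance term
```

Under these assumptions, the distance term contributes the same amount whether the close neighbors share a pathway or not, which is the behavior questioned above.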

MxMstrmn commented 5 months ago

Hi @bhomass,

What you describe would simply reinforce the entropy effect. What we wanted to express through the distance is that the perturbation has a sufficiently "unique" embedding in space. Combined with low entropy, this can be a signal of good embedding quality. On the other hand, if the embedding is unique and the entropy is high, the model would still be unsure. We formulated it this way because the most likely case is that the model predicts "no perturbation", which results in compounds from different pathways being embedded closely together.

There are definitely other ways to formulate this :) This is just how we did it.