labomics / midas

MIT License

question about Information flow #12

Closed Leafaeolian closed 1 month ago

Leafaeolian commented 1 month ago

Hi author, thanks for your excellent work. I was confused when reading the code: s_enc also contributes information flow to the biological state c. Shouldn't the biological state only accept information flow from the observation counts (while the technical noise receives it from both sides)?

zhen-he commented 1 month ago

Hello,

Yes. $s$ also contributes the information flow, as it helps disentangle the technical noise $u$ from the biological state $c$. Note that although both $x$ and $s$ are used to generate $c$, we can still make $s$ and $c$ independent by minimizing $I(s; c)$. In other words, $s$ can provide additional information for generating $c$, even when $I(s; c) = 0$, which is known as synergy: $I(x, s; c) > I(x; c) + I(s; c)$.

Intuitively, if you tell the encoder which batch $x$ is from, it can more effectively learn to remove the batch effect based on the batch ID.
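The synergy inequality above can be made concrete with a classic toy case (my own illustration, not from the MIDAS paper): if $x$ and $s$ are independent fair bits and $c = x \oplus s$, then $I(x; c) = I(s; c) = 0$, yet the pair $(x, s)$ determines $c$ completely:

```python
import numpy as np

def mutual_info(joint):
    """Mutual information I(A;B) in bits from a joint probability table."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa * pb)[nz])).sum())

# Toy variables: x, s are i.i.d. uniform bits; c = x XOR s.
# Joint over (x, c): c is uniform regardless of x, so I(x;c) = 0.
joint_xc = np.full((2, 2), 0.25)

# Joint over ((x,s), c): the pair (x,s) determines c, so I(x,s;c) = 1 bit.
joint_xsc = np.zeros((4, 2))
for x in (0, 1):
    for s in (0, 1):
        joint_xsc[2 * x + s, x ^ s] = 0.25

print(mutual_info(joint_xc))   # 0.0  (and I(s;c) = 0 by symmetry)
print(mutual_info(joint_xsc))  # 1.0  > I(x;c) + I(s;c): pure synergy
```

So feeding $s$ to the encoder can help it compute $c$ even while the constraint $I(s; c) = 0$ holds.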

Leafaeolian commented 1 month ago


Based on my superficial understanding, it seems like the expert for $s$ tells the other experts, "this is the batch ID, so discard signals like this." However, according to what you say, the driving force is always $I(s; c)$, no matter how the message is transmitted. So is this flow a purely intuitive design, or is there an ablation study demonstrating the synergy strategy? Many thanks for your reply!

zhen-he commented 1 month ago

In fact, we previously attempted to obtain $c$ without using $s$ as input because we hoped that the neural network could directly remove batch effects from $x$. However, this approach did not perform well.

Theoretically, in some cases, it is impossible to remove batch effects using only $x$ without $s$. As illustrated, the cell observations $x_n$ from Batch 1 and Batch 2 may be identical, and only with the provision of $s_n$ can the true position in the distribution be inferred.
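A minimal numeric version of this identifiability argument (my own toy example, assuming an additive per-batch shift, which is not the MIDAS generative model):

```python
# Toy setup: each batch adds a fixed shift to the true biological
# signal, and we observe x = c + shift[batch].
shift = {1: 0.0, 2: 3.0}

c_a, s_a = 5.0, 1   # cell A: true state 5.0, from Batch 1
c_b, s_b = 2.0, 2   # cell B: true state 2.0, from Batch 2

x_a = c_a + shift[s_a]  # 5.0
x_b = c_b + shift[s_b]  # 5.0 -> identical observations!

# From x alone, cells A and B are indistinguishable. Given s, the true
# state is recovered exactly by undoing the batch shift.
c_hat_a = x_a - shift[s_a]
c_hat_b = x_b - shift[s_b]
print(c_hat_a, c_hat_b)  # 5.0 2.0
```

No function of $x$ alone can map the two cells to different biological states, since their observations coincide; conditioning on $s$ resolves the ambiguity.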

Moreover, to allow the model to better remove batch effects from $x$ and improve its generalization performance, we also adopted the idea of self-supervised learning. During training, we randomly mask $s$ with a probability of 0.1 (https://github.com/labomics/midas/blob/main/src/scmidas/models.py#L77).
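The masking idea can be sketched as follows (a simplified NumPy illustration of the concept, not the actual code at the linked line, which operates on the model's inputs in PyTorch):

```python
import numpy as np

rng = np.random.default_rng(42)

def mask_batch_id(s_onehot, p=0.1, rng=rng):
    """Zero out the batch-ID one-hot for a random fraction p of cells,
    so the encoder must sometimes infer c without knowing the batch."""
    keep = rng.random((s_onehot.shape[0], 1)) >= p
    return s_onehot * keep

s = np.eye(3)[rng.integers(0, 3, size=1000)]  # 1000 cells, 3 batches
s_masked = mask_batch_id(s, p=0.1)
frac_masked = 1.0 - s_masked.sum() / s.sum()
print(frac_masked)  # should be close to p = 0.1
```

Training on both masked and unmasked views encourages the model to remove batch effects even when the batch ID is unavailable, improving generalization.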


Leafaeolian commented 1 month ago


Sounds like the batch effect varies across batches, so $s$ is used to help determine to what extent the batch effect should be removed from each sample.

Good strategy and illustration! This work really inspired me!

Btw, I have another question. It seems the paper only focuses on multi-omics integration analysis; have you tested the performance of cross-modal inference (one modality as input, then evaluating the quality of the imputed other modality using metrics like MSE)?

zhen-he commented 1 month ago

Yes, please refer to the modality alignment metrics in Fig. 4c, where the ATAC AUROC, RNA Pearson’s r, and ADT Pearson's r are related to cross-modal inference (feature space), and are detailed in the "Modality alignment metrics" section.
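A Pearson's-r-style evaluation of an imputed modality can be sketched like this (my own illustration with synthetic data; the variable names and noise model are hypothetical, not the paper's evaluation code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: a "ground-truth" ADT matrix (cells x proteins)
# and a noisy cross-modal imputation of it.
true_adt = rng.normal(size=(100, 50))
imputed_adt = true_adt + 0.3 * rng.normal(size=(100, 50))

def mean_pearson_r(a, b):
    """Average per-cell Pearson correlation between two matrices."""
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    num = (a_c * b_c).sum(axis=1)
    den = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1))
    return float((num / den).mean())

print(mean_pearson_r(true_adt, imputed_adt))  # high, but below 1.0
print(mean_pearson_r(true_adt, true_adt))     # 1.0 (perfect imputation)
```

For a binary modality such as ATAC peaks, AUROC plays the analogous role, scoring how well the imputed values rank the truly open peaks above the closed ones.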

Leafaeolian commented 1 month ago

Great! Thank you so much!