YannDubs / Mini_Decodable_Information_Bottleneck

Minimum viable code for the Decodable Information Bottleneck paper. PyTorch implementation.
MIT License

the difference between DIB and contrastive learning #1

ShellingFord221 closed this issue 2 years ago

ShellingFord221 commented 2 years ago

Hi, I was wondering what the difference is between the proposed DIB and conventional contrastive learning. It seems that they both make representations indistinguishable within a class and distinguishable between different classes. Thanks!

YannDubs commented 2 years ago

Hi, I think they are very different.

  1. DIB is supervised, while conventional contrastive learning (e.g., SimCLR) is self-supervised (although supervised variants exist: https://arxiv.org/abs/2004.11362).
  2. Contrastive learning actually does not make representations indistinguishable within classes; it only ensures distinguishability between different classes. For example, supervised contrastive learning is actually equivalent to standard cross-entropy (if the batch size is large enough), so it only maximizes I_V[Z->Y] and does not minimize I_V[Z->X] (see the sketch just below this list).
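
For concreteness, here is a minimal PyTorch sketch of the supervised contrastive loss from the Khosla et al. paper linked in point 1. This is not code from this repo, and `supcon_loss` is just an illustrative name. Note that the labels enter only through which pairs count as positives, so nothing in the objective corresponds to minimizing I_V[Z->X]:

```python
import torch
import torch.nn.functional as F

def supcon_loss(z, y, temperature=0.1):
    """Simplified supervised contrastive loss (Khosla et al., 2020)."""
    z = F.normalize(z, dim=1)                        # embeddings on the unit sphere
    sim = z @ z.t() / temperature                    # pairwise cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # never contrast a sample with itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # positives are the *other* samples sharing the anchor's label
    pos_mask = (y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask
    # average log-probability of the positives for each anchor
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0)
    loss = -pos_log_prob.sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

# toy usage
z = torch.randn(8, 16)           # batch of 8 embeddings
y = torch.randint(0, 3, (8,))    # their class labels
print(supcon_loss(z, y))
```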

The analog of IB in contrastive land would be something like https://arxiv.org/pdf/2109.12909.pdf. The difference is that DIB:

  1. is supervised (this is the difference between IB and the cited paper)
  2. takes into account the functional family, which is provably what we need (this is the difference between DIB and IB); a schematic sketch follows this list.
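
To make point 2 concrete, below is a schematic PyTorch sketch of what a DIB-style objective can look like. All names (`DIBStyleLoss`, `head_y`, `head_n`, `y_rand`) are hypothetical and this is not the repo's actual implementation: one head in the family V (here a linear probe) is trained so that Y stays decodable from Z (sufficiency), while a second head in V tries to decode an arbitrary labeling of the examples and a gradient-reversal layer (one common adversarial trick, not necessarily the repo's choice) trains the encoder to defeat it (minimality):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, flips the gradient sign on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -grad

class DIBStyleLoss(nn.Module):
    """Schematic DIB-style objective (hypothetical, simplified).

    The functional family V is whatever heads we allow ourselves to train
    (here: linear probes), so "information" means decodability by such a head.
    """
    def __init__(self, z_dim, n_classes, n_rand_classes, beta=1.0):
        super().__init__()
        self.head_y = nn.Linear(z_dim, n_classes)       # sufficiency head in V
        self.head_n = nn.Linear(z_dim, n_rand_classes)  # minimality head in V
        self.beta = beta

    def forward(self, z, y, y_rand):
        # Sufficiency: maximize I_V[Z->Y], i.e. keep Y decodable from Z.
        loss_suff = F.cross_entropy(self.head_y(z), y)
        # Minimality: head_n learns to decode an arbitrary labeling y_rand of
        # the examples; the reversed gradient trains the encoder to make that
        # labeling undecodable, discarding V-decodable information beyond Y.
        loss_min = F.cross_entropy(self.head_n(GradReverse.apply(z)), y_rand)
        return loss_suff + self.beta * loss_min
```

Here `y_rand` would be some labeling that carries no information beyond Y (e.g., a random assignment of the examples); the paper constructs this term more carefully than this sketch does.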

hope that helps

ShellingFord221 commented 2 years ago

I suddenly came up with an idea: could DIB motivate augmentation? Since DIB finds representations that are sufficient and minimal, could augmentations based on these representations be more effective and informative? Thanks!