RElbers / info-nce-pytorch

PyTorch implementation of the InfoNCE loss for self-supervised learning.
MIT License

Will minimizing this loss function minimize or maximize mutual information? #8

Closed geolvr closed 1 year ago

geolvr commented 2 years ago

I'm confused about this. The InfoNCE loss gives a lower bound on mutual information. In this implementation, should the loss be minimized in order to increase the mutual information?

RElbers commented 2 years ago

A lower NCE loss means a higher lower bound of the mutual information. So minimizing the NCE loss will maximize the (lower bound of the) mutual information.
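This relationship can be sketched with a minimal, self-contained InfoNCE computation. This is a plain-Python sketch over hypothetical scalar similarities, not the library's tensor API; `info_nce` here is an illustrative helper, and the bound used is I(X;Y) ≥ log(N) − L_InfoNCE, where N is the number of candidates (one positive plus the negatives):

```python
import math

def info_nce(pos_sim, neg_sims, temperature=0.1):
    """Minimal InfoNCE sketch: cross-entropy over one positive
    and several negative similarities (hypothetical inputs)."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    log_denom = math.log(sum(math.exp(l) for l in logits))
    # Negative log-probability of the positive candidate.
    return -(logits[0] - log_denom)

# As the positive pair becomes more similar, the loss drops, and the
# lower bound on mutual information, log(N) - loss, rises.
negatives = [0.1, 0.0, -0.2]
loss_weak = info_nce(0.2, negatives)
loss_strong = info_nce(0.9, negatives)
n = 1 + len(negatives)
bound_weak = math.log(n) - loss_weak
bound_strong = math.log(n) - loss_strong
```

So a training curve where the InfoNCE loss falls should correspond to a rising (lower bound on the) mutual information.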

geolvr commented 1 year ago

In my actual PyTorch training, I use the InfoNCE loss as part of my loss function and monitor the mutual information computed by sklearn's API (sklearn.metrics.mutual_info_score) during training. However, I found that as the InfoNCE loss decreases, the mutual information also decreases, and I can't think of a reason that explains this.

geolvr commented 1 year ago

I think I figured it out. sklearn.metrics.mutual_info_score expects discrete labels and cannot be applied directly to continuous variables, which led to the wrong results.
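The pitfall above can be illustrated with a stdlib-only sketch: for continuous samples, a plug-in mutual information estimate requires discretizing the values first (sklearn's mutual_info_score assumes the inputs are already discrete labels). `binned_mi` is a hypothetical helper using equal-width histogram bins, not sklearn's implementation:

```python
import math
from collections import Counter

def binned_mi(xs, ys, bins=4):
    """Histogram (plug-in) estimate of mutual information between two
    continuous samples: discretize each variable, then sum
    p(x,y) * log(p(x,y) / (p(x) * p(y))) over the joint bins."""
    def to_bins(vals):
        lo, hi = min(vals), max(vals)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in vals]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), c in pxy.items():
        p = c / n
        mi += p * math.log(p * n * n / (px[i] * py[j]))
    return mi

# A perfectly dependent pair gives high MI; a constant (uninformative)
# second variable gives zero.
xs = [i / 10 for i in range(40)]
mi_dependent = binned_mi(xs, xs)
mi_constant = binned_mi(xs, [0.0] * 40)
```

For continuous variables, sklearn's mutual_info_regression (which uses a nearest-neighbor estimator) is the more appropriate tool than mutual_info_score on raw values.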