elijahcole / single-positive-multi-label

Multi-Label Learning from Single Positive Labels - CVPR 2021
https://arxiv.org/abs/2106.09708
MIT License

Question about the label estimator initialization #5

Closed Correr-Zhou closed 3 years ago

Correr-Zhou commented 3 years ago

Hi, Elijah!

Great work! After reading the source code, I have a question about the label estimator initialization.

In your paper, you say that you initialize $\theta_{ni}$ from the uniform distribution on $[\sigma^{-1}(0.4), \sigma^{-1}(0.6)]$ when $z_{ni}$ is unobserved. However, the corresponding operation in models.py is as follows:

# initialize unobserved labels:
w = 0.1
q = inverse_sigmoid(0.5 + w)
param_mtx = q * (2 * torch.rand(num_examples, P['num_classes']) - 1) 

I am not sure whether I have misunderstood this code, or whether it is a mistake.

Looking forward to your reply!

Correr Zhou 2021.8.3

elijahcole commented 3 years ago

Thanks for the question!

I think the code is correct, but there is a non-obvious step in how the lower endpoint arises.

The code initializes the entries of param_mtx to be drawn from Uniform(-q, q).

The upper endpoint is straightforward: q = inverse_sigmoid(0.5 + w).

The lower endpoint comes from applying log properties to the definition of the inverse sigmoid function:

$$-q = -\sigma^{-1}(0.5 + w) = -\log\left(\frac{0.5 + w}{1 - (0.5 + w)}\right) = -\log\left(\frac{0.5 + w}{0.5 - w}\right) = \log\left(\frac{0.5 - w}{0.5 + w}\right) = \sigma^{-1}(0.5 - w).$$

With $w = 0.1$, the two endpoints are exactly $\sigma^{-1}(0.4)$ and $\sigma^{-1}(0.6)$, which matches the paper.
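If it helps, here is a quick numeric sanity check. This is a minimal sketch: the inverse_sigmoid below is defined locally as the logit, log(p / (1 - p)), rather than imported from this repo.

import math
import torch

def inverse_sigmoid(p):
    # logit function: log(p / (1 - p)), the inverse of the sigmoid
    return math.log(p / (1.0 - p))

w = 0.1
q = inverse_sigmoid(0.5 + w)         # sigma^{-1}(0.6), about 0.4055
print(-q, inverse_sigmoid(0.5 - w))  # both about -0.4055, i.e. sigma^{-1}(0.4)

# Draws from Uniform(-q, q), as in models.py, map under the sigmoid into [0.4, 0.6]:
samples = q * (2 * torch.rand(1000, 20) - 1)
probs = torch.sigmoid(samples)
assert 0.4 - 1e-6 <= probs.min() and probs.max() <= 0.6 + 1e-6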

Thanks for flagging this; I've made a note to clarify the code. Please re-open this issue if I didn't answer your question or if I got anything wrong!