JordanAsh / badge

An implementation of the BADGE batch active learning algorithm.

No pseudo/hypothetical labels found in the code. #8

Closed SupeRuier closed 3 years ago

SupeRuier commented 3 years ago

Hi Jordan Ash,

In your paper, the gradient embedding is from the loss between the network output and the hypothetical labels (inferred from the network output).
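For reference, a hedged restatement of that definition (notation introduced here, not quoted from the thread): the hypothetical label is the model's own argmax prediction, and the gradient is taken with respect to the output-layer parameters only.

```latex
% BADGE gradient embedding, restated (notation introduced here):
% the hypothetical label \hat{y}(x) is the model's own prediction,
% so no ground-truth label enters the computation.
\hat{y}(x) = \arg\max_{c} f_c(x;\theta), \qquad
g_x = \frac{\partial}{\partial \theta_{\mathrm{out}}}
      \,\ell_{\mathrm{CE}}\bigl(f(x;\theta),\, \hat{y}(x)\bigr)
```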

However, in your code, I didn't find anything about pseudo/hypothetical labels.

In the file badge_sampling.py, it seems that you directly use the true labels to guide your selection. If so, this would be an unfair comparison.

gradEmbedding = self.get_grad_embedding(self.X[idxs_unlabeled], self.Y.numpy()[idxs_unlabeled]).numpy()

I'm not sure if I missed something. Could you show how you use the hypothetical labels in your code?

Thanks, Rui

SupeRuier commented 3 years ago

Sorry, I found it here in strategy.py:

batchProbs = F.softmax(cout, dim=1).data.cpu().numpy()
maxInds = np.argmax(batchProbs, 1)
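In get_grad_embedding, those maxInds then stand in for the labels: each example's embedding is built from its penultimate-layer features scaled by the softmax probabilities, with the predicted (hypothetical) class getting the (1 - p) factor. Below is a minimal sketch of that pattern, not a verbatim quote of strategy.py; variable names mirror the excerpt above, and the actual loop may differ in details.

```python
import numpy as np

def build_grad_embedding(out, batchProbs, maxInds):
    """Sketch: assemble per-class gradient blocks from the model's own
    predictions (maxInds), so no ground-truth labels drive the selection.

    out:        [N, embDim] penultimate-layer features
    batchProbs: [N, nLab]   softmax probabilities
    maxInds:    [N]         argmax (hypothetical) labels
    """
    n, embDim = out.shape
    nLab = batchProbs.shape[1]
    embedding = np.zeros((n, embDim * nLab))
    for j in range(n):
        for c in range(nLab):
            block = slice(embDim * c, embDim * (c + 1))
            if c == maxInds[j]:
                # gradient block for the predicted (hypothetical) class
                embedding[j, block] = out[j] * (1 - batchProbs[j, c])
            else:
                embedding[j, block] = out[j] * (-1 * batchProbs[j, c])
    return embedding
```

With this construction the gradient embedding, and hence the selection, is driven entirely by the model's own predictions.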

Sorry for interrupting.