Closed: orico closed this issue 4 years ago
Thanks for posting! I believe, however, that the default tie-breaking policy in probs_to_preds
is "random" (https://github.com/snorkel-team/snorkel/blob/4c361335c43305fd3ba2991f40b243f76b863503/snorkel/utils/core.py#L14), so when you have tied probabilities, it will randomly select among the tied indices. For example:
> import numpy as np
> from snorkel.utils import probs_to_preds
> probs = np.ones((5, 4)) * 0.25
> preds = probs_to_preds(probs)
> print(preds)
array([0, 3, 0, 3, 2])
Let us know if you see behavior otherwise!
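To make the difference between tie-breaking policies concrete, here is a minimal NumPy-only sketch. It is not Snorkel's implementation: `break_ties`, its `seed` argument, and the policy names as used here are hypothetical stand-ins for illustration, and the -1 abstain value is assumed to follow Snorkel's usual convention. It contrasts a plain `argmax`, which always returns the first tied index, with choosing randomly among the tied indices or abstaining.

```python
import numpy as np


def break_ties(probs, policy="random", tol=1e-5, seed=None):
    """Hypothetical helper (not part of Snorkel): pick one class per row.

    Rows with a unique maximum get that class. Rows whose top
    probabilities are within `tol` of each other are resolved by `policy`.
    """
    rng = np.random.default_rng(seed)
    preds = np.empty(probs.shape[0], dtype=int)
    for i, row in enumerate(probs):
        # Indices of every class tied for the maximum probability.
        tied = np.flatnonzero(np.abs(row - row.max()) < tol)
        if len(tied) == 1:
            preds[i] = tied[0]
        elif policy == "random":
            preds[i] = rng.choice(tied)
        elif policy == "abstain":
            preds[i] = -1  # assumed abstain label
        else:
            raise ValueError(f"unknown policy: {policy}")
    return preds


probs = np.ones((5, 4)) * 0.25

argmax_preds = probs.argmax(axis=1)  # first tied index in every row
random_preds = break_ties(probs, policy="random", seed=0)
abstain_preds = break_ties(probs, policy="abstain")

print(argmax_preds)
print(random_preds)
print(abstain_preds)
```

If every tied row comes back as 0, that looks like first-index (argmax-style) behavior rather than a random pick among the tied classes, so it is worth checking which policy is actually in effect.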
TL;DR: it assigns 0 when the probabilities are all equal. This behavior is misleading when calculating metrics.
from snorkel.utils import probs_to_preds
probs_dev = label_model.predict_proba(L_dev)
print(probs_dev[:10])
preds_dev = probs_to_preds(probs_dev)
print(preds_dev[:10])
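One way to see how much tied rows affect a metric is to flag them before scoring. The sketch below is only an illustration under stated assumptions: `tied_rows` is a hypothetical helper, and the `probs_dev` and `Y_dev` arrays are toy stand-ins, not the data from this issue. It compares accuracy over all rows with accuracy over only the rows that have a unique maximum.

```python
import numpy as np


def tied_rows(probs, tol=1e-5):
    """Hypothetical helper: True for rows where more than one class
    is within `tol` of the row's maximum probability."""
    top = probs.max(axis=1, keepdims=True)
    return (np.abs(probs - top) < tol).sum(axis=1) > 1


# Toy stand-ins for label_model.predict_proba(L_dev) and the dev labels.
probs_dev = np.array([
    [0.50, 0.50],  # tied
    [0.90, 0.10],
    [0.25, 0.75],
    [0.50, 0.50],  # tied
])
Y_dev = np.array([0, 0, 1, 1])

mask = tied_rows(probs_dev)
preds_dev = probs_dev.argmax(axis=1)  # tied rows fall back to class 0

acc_all = (preds_dev == Y_dev).mean()
acc_untied = (preds_dev[~mask] == Y_dev[~mask]).mean()

print("tied rows:", int(mask.sum()))
print("accuracy (all rows):", acc_all)
print("accuracy (untied rows only):", acc_untied)
```

A large gap between the two numbers suggests the metric is being driven by how ties are resolved rather than by the model's actual predictions.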