Closed brahmaneya closed 5 years ago
Merging #1502 into master will increase coverage by
<.01%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #1502 +/- ##
==========================================
+ Coverage 97.6% 97.61% +<.01%
==========================================
Files 55 55
Lines 2049 2056 +7
Branches 335 339 +4
==========================================
+ Hits 2000 2007 +7
Misses 22 22
Partials 27 27
Impacted Files | Coverage Δ | |
---|---|---|
snorkel/labeling/model/label_model.py | 95.92% <100%> (+0.09%) |
:arrow_up: |
Description of proposed changes
Use the Munkres algorithm instead of brute force search to find the optimal permutation for breaking column symmetry. Our objective for symmetry breaking is to maximize the total probability of LFs being accurate (summed over LFs). We optimize this over combinations of permutations of columns corresponding to labels with the same prior probability. This is equivalent to optimizing the trace of the product of the summed probabilities matrix with the permutation matrix, which can be done in O(n^3) time.
Fixes # (issue)
1486
Test plan
Modified existing symmetry breaking test.
Checklist
Need help on these? Just ask!
tox -e complex
and/ortox -e spark
if appropriate.