Closed HiromuHota closed 4 years ago
Merging #458 into master will increase coverage by
0.00%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #458 +/- ##
=======================================
Coverage 83.20% 83.20%
=======================================
Files 88 88
Lines 4554 4555 +1
Branches 837 837
=======================================
+ Hits 3789 3790 +1
Misses 572 572
Partials 193 193
Flag | Coverage Δ | |
---|---|---|
#unittests | 83.20% <100.00%> (+<0.01%) |
:arrow_up: |
Impacted Files | Coverage Δ | |
---|---|---|
src/fonduer/parser/visual_linker.py | 83.65% <100.00%> (+0.07%) |
:arrow_up: |
Description of the problems or issues
Is your pull request related to a problem? Please describe.
See #412.
Does your pull request fix any issue.
Fix #412.
Description of the proposed changes
A clear and concise description of what you propose.
Based on my findings reported at #412, the non-deterministic behaviour in
Featurization
boils down to a non-deterministic behaviour inVisualLinker
, which is caused by the fact that the order ofdoc.sentences
is not deterministic unlessorder_by=True
in thebackref
argument below. https://github.com/HazyResearch/fonduer/blob/246a92dd05884e3a706ba216e6177456d537de13/src/fonduer/parser/models/sentence.py#L248-L252 One trivial way to fix this issue is to addorder_by=True
to thebackref
, however, I just sortsentences
(by position) at the beginning ofVisualLinker.link
and I think this is good enough to fix #412.Test plan
A clear and concise description of how you test the new changes.
Ideally, I'd like to test if my change prevents #412 from happening. However, this is not feasible as the issue is non-deterministic and does not happen every test run. Instead, I'd test if the result of visual linker is affected by the order of sentences.
Checklist