Closed HiromuHota closed 4 years ago
I think I fixed the issue.
Merging #429 into master will decrease coverage by 0.15%. The diff coverage is 75.60%.
@@ Coverage Diff @@
## master #429 +/- ##
==========================================
- Coverage 82.59% 82.44% -0.16%
==========================================
Files 86 86
Lines 4366 4385 +19
Branches 810 812 +2
==========================================
+ Hits 3606 3615 +9
- Misses 572 582 +10
Partials 188 188
Flag | Coverage Δ | |
---|---|---|
#unittests | 82.44% <75.60%> (-0.16%) | :arrow_down: |
Impacted Files | Coverage Δ | |
---|---|---|
src/fonduer/candidates/models/span_mention.py | 74.76% <60.00%> (-0.73%) | :arrow_down: |
src/fonduer/parser/models/sentence.py | 93.38% <60.00%> (-1.44%) | :arrow_down: |
src/fonduer/utils/utils_visual.py | 58.33% <71.42%> (-6.67%) | :arrow_down: |
src/fonduer/utils/data_model_utils/visual.py | 88.23% <77.77%> (ø) | |
src/fonduer/parser/visual_linker.py | 83.57% <100.00%> (+0.07%) | :arrow_up: |
src/fonduer/utils/visualizer.py | 79.16% <100.00%> (ø) | |
Thanks for this PR!
What do you think about having a fonduer.typing module to store all Fonduer-specific typings? Bbox is a good example here.
Having a fonduer.typing module is a good idea.
IMO, it would hold type aliases that make the code more readable.
For example, some of the current type hints are very lengthy and hard to read.
By defining type aliases like below:
Alias1 = List[Tuple[Tuple[int, int], str]]
Alias2 = OrderedDict[Tuple[str, int], Tuple[int, int]]
This could become
self.pdf_word_list: Optional[Alias1] = None
self.html_word_list: Optional[Alias1] = None
self.links: Optional[Alias2] = None
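A minimal sketch of how such aliases might look in use. The names Alias1 and Alias2 are the placeholders from the comment above; the LinkerState class is hypothetical, and the "without aliases" annotation in the comment is simply Alias1 expanded by hand, not a quote from the code base.

```python
from typing import List, Optional, OrderedDict, Tuple

# Placeholder alias names taken from the discussion; a real fonduer.typing
# module would presumably pick more descriptive ones.
Alias1 = List[Tuple[Tuple[int, int], str]]
Alias2 = OrderedDict[Tuple[str, int], Tuple[int, int]]


class LinkerState:
    """Hypothetical holder class, used only to show the annotations."""

    def __init__(self) -> None:
        # Without the alias, the same annotation would read:
        #   Optional[List[Tuple[Tuple[int, int], str]]]
        self.pdf_word_list: Optional[Alias1] = None
        self.html_word_list: Optional[Alias1] = None
        self.links: Optional[Alias2] = None
```

An alias is only another name for the same type, so type checkers treat Alias1 and its expanded form interchangeably.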
However, Bbox is not an alias, so IMO it does not belong in fonduer.typing.
Moreover, I'll be adding methods to Bbox, such as Bbox.horz_aligned (superseding bbox_horz_aligned) and Bbox.vert_aligned (superseding bbox_vert_aligned), which makes Bbox even less suitable for fonduer.typing.
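To illustrate why a Bbox with methods is a class rather than an alias, here is a rough sketch. The field names and their order, and the plain overlap test used for alignment, are assumptions made for illustration; the actual Fonduer implementation may differ.

```python
from typing import NamedTuple


class Bbox(NamedTuple):
    """Bounding box sketch; field names and order are assumed."""

    page: int
    top: int
    bottom: int
    left: int
    right: int

    def horz_aligned(self, other: "Bbox") -> bool:
        # Assumed rule: the boxes share a row if their vertical ranges overlap.
        return self.top <= other.bottom and other.top <= self.bottom

    def vert_aligned(self, other: "Bbox") -> bool:
        # Assumed rule: the boxes share a column if their horizontal ranges overlap.
        return self.left <= other.right and other.left <= self.right
```

With this shape, a call such as box1.horz_aligned(box2) would take the place of a free function like bbox_horz_aligned(box1, box2).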
A good example would be an alias for Throttler, defined as
Throttler = Callable[[Tuple[Mention, ...]], bool]
which shortens the parameter annotation to
throttlers: Optional[List[Throttler]] = None,
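As a sketch of how the Throttler alias could read at a call site: the Mention class below is a stand-in for Fonduer's own class, and keep_candidates and distinct_text are hypothetical helpers written only for this example.

```python
from typing import Callable, List, Optional, Tuple


class Mention:
    """Stand-in for Fonduer's Mention class (illustration only)."""

    def __init__(self, text: str) -> None:
        self.text = text


# The alias from the discussion: a throttler takes a tuple of mentions
# and decides whether to keep that candidate.
Throttler = Callable[[Tuple[Mention, ...]], bool]


def keep_candidates(
    candidates: List[Tuple[Mention, ...]],
    throttlers: Optional[List[Throttler]] = None,
) -> List[Tuple[Mention, ...]]:
    """Hypothetical helper: keep candidates that every throttler accepts."""
    if not throttlers:
        return candidates
    return [c for c in candidates if all(t(c) for t in throttlers)]


def distinct_text(mentions: Tuple[Mention, ...]) -> bool:
    """Example throttler: reject candidates that repeat the same text."""
    return len({m.text for m in mentions}) == len(mentions)
```

The alias keeps the signature on one readable line while documenting what a throttler is expected to accept and return.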
This PR has two benefits:
- Tuple[int, int, int, int, int] becomes Bbox.
- bbox_from_sentence and bbox_from_span are no longer needed. Just use sentence.get_bbox() and span.get_bbox(). This is actually the benefit of using OOP.

P.S. There are many more spots where OOP (object-oriented programming) is better suited.