YasushiMiyata closed this pull request 4 years ago
Some code may have been updated while I was creating #492. I'm re-checking now.
Merging #492 into master will not change coverage. The diff coverage is 71.42%.
Coverage Diff

Metric | master | #492 | +/- |
---|---|---|---|
Coverage | 85.85% | 85.85% | |
Files | 88 | 88 | |
Lines | 4568 | 4568 | |
Branches | 851 | 853 | +2 |
Hits | 3922 | 3922 | |
Misses | 464 | 464 | |
Partials | 182 | 182 | |

Flag | Coverage Δ | |
---|---|---|
#unittests | 85.85% <71.42%> (ø) |
Flags with carried forward coverage won't be shown.
Impacted Files | Coverage Δ | |
---|---|---|
...fonduer/candidates/models/implicit_span_mention.py | 81.96% <66.66%> (ø) |
src/fonduer/candidates/models/span_mention.py | 82.24% <66.66%> (ø) |
src/fonduer/candidates/matchers.py | 97.31% <100.00%> (ø) |
Something failed during installation on Ubuntu. There is nothing more I can do about it.
Thanks for making this clear!
Description of the problems or issues
Is your pull request related to a problem? Please describe. A clear and concise description of what the problem is.
The sentence "123 456 789" is parsed into three words: "123", "456", and "789". I'd like to match the whole number with a matcher like
RegexMatchSpan(rgx=r"\d{9}", sep=" ")
but sep=" " has no effect.
Does your pull request fix any issue? Fixes #270
Description of the proposed changes
Enable the sep="(separator)" option of RegexMatchSpan. With it, the matcher concatenates the words of a mention span into one string and applies the regex match without considering the separator.
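The intended behavior can be sketched in plain Python, independent of Fonduer. This is only an illustration of the idea, not the code in matchers.py: the helper name matches_without_sep is hypothetical, and requiring a full match (re.fullmatch) is an assumption made here for clarity.

```python
import re


def matches_without_sep(span_text: str, rgx: str, sep: str = " ") -> bool:
    """Hypothetical helper: remove `sep` from the span text, then require a full match."""
    # An empty separator means "leave the text as it is".
    joined = span_text.replace(sep, "") if sep else span_text
    return re.fullmatch(rgx, joined) is not None


span = "123 456 789"

# Matching the raw span text fails because of the spaces between the words.
print(re.fullmatch(r"\d{9}", span) is not None)      # False

# Dropping the separator first lets the nine digits match as one number.
print(matches_without_sep(span, r"\d{9}", sep=" "))  # True
```

The separator is stripped only for matching; the span text itself is left unchanged.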
Test plan
Add test code to 'fonduer/tests/candidates/test_matchers.py'. The sentence "This is apple" is parsed and yields two 2-grams, "This is" and "is apple". The following rgx with sep=" " (a space) matches "is apple":
RegexMatchSpan(rgx=r"isapple", sep=" ")
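The test scenario can also be reproduced without Fonduer as a rough sanity check. The sketch below is an assumption-laden stand-in: ngrams and filter_spans are hypothetical helpers, whitespace splitting stands in for the real parser, and a full match is assumed.

```python
import re
from typing import List


def ngrams(words: List[str], n: int) -> List[List[str]]:
    """Return every run of n consecutive words."""
    return [words[i:i + n] for i in range(len(words) - n + 1)]


def filter_spans(words: List[str], n: int, rgx: str, sep: str = " ") -> List[str]:
    """Hypothetical stand-in for the matcher: keep n-grams whose words,
    concatenated without `sep`, fully match `rgx`."""
    pattern = re.compile(rgx)
    matched = []
    for gram in ngrams(words, n):
        joined = "".join(gram)  # words concatenated, separator ignored
        if pattern.fullmatch(joined):
            matched.append(sep.join(gram))  # report the span with its separator
    return matched


words = "This is apple".split()
print(filter_spans(words, 2, r"isapple"))  # ['is apple']
```

Only "is apple" survives, because "Thisis" does not match the pattern.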
Checklist