Closed lfashby closed 4 years ago
Merging #49 into master will not change coverage. The diff coverage is
100%
.
@@ Coverage Diff @@
## master #49 +/- ##
=====================================
Coverage 100% 100%
=====================================
Files 12 12
Lines 393 394 +1
=====================================
+ Hits 393 394 +1
Impacted Files | Coverage Δ | |
---|---|---|
src/segments/tokenizer.py | 100% <ø> (ø) |
:arrow_up: |
tests/test_tokenizer.py | 100% <100%> (ø) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update b76d562...52a4843. Read the comment docs.
I'm inclined to merge this, since this treatment would be consistent with what we did in #46. But after merging #46 I started thinking whether raising an error isn't a more appropriate default? What do you think @bambooforest ? Or should we add a strict
flag to trigger the old behaviour?
We (as downstream users) can deal with either. That said, I think we'd weakly prefer strings like this and the one in #46 to to get identity-transformed, since otherwise we have to catch the error and then force that interpretation.
Having had to deal with these errors a lot, I'm inclined to agree with @kylebgorman and identity-transform this type of input @xrotwang (also as you mention because it follows #46 ). But for the user, this might implicitly suggest that we do something sensible with ipa=True
, as in check that it's valid IPA input (whatever that is ;) , so I agree a strict
flag would be nice so that it fails when some set of characters are introduced that aren't IPA -- but which ones?
https://github.com/unicode-cookbook/cookbook/tree/master/book/tables
Strict? Valid? Either or? Then we come around to user-defined OPs.
Thinking a bit more, I'd say identity transform is the only reasonable thing to do. After all, we don't do any checking whether the string we add modifiers to is anything sensible - other than not empty ...
Steven Moran notifications@github.com schrieb am Di., 19. Nov. 2019, 18:08:
Having had to deal with these errors a lot, I'm inclined to agree with @kylebgorman https://github.com/kylebgorman and identity-transform this type of input @xrotwang https://github.com/xrotwang (also as you mention because it follows #46 https://github.com/cldf/segments/issues/46 ). But for the user, this might implicitly suggest that we do something sensible with ipa=True, as in check that it's valid IPA input (whatever that is ;) , so I agree a strict flag would be nice so that it fails when some set of characters are introduced that aren't IPA -- but which ones?
https://github.com/unicode-cookbook/cookbook/tree/master/book/tables
Strict? Valid? Either or? Then we come around to user-defined OPs.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cldf/segments/pull/49?email_source=notifications&email_token=AAGUOKCDVUMPVOFNJHR2YJ3QUQMPLA5CNFSM4JO47LRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEO6SKQ#issuecomment-555608362, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAGUOKAHTIGZ7O35MKGPWXDQUQMPLANCNFSM4JO47LRA .
@lfashby @kylebgorman just released v2.1.3, including this (and the previous) PR: https://pypi.org/project/segments/
TY.
On Thu, Nov 21, 2019 at 9:31 AM Robert Forkel notifications@github.com wrote:
@lfashby https://github.com/lfashby @kylebgorman https://github.com/kylebgorman just released v2.1.3, including this (and the previous) PR: https://pypi.org/project/segments/
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cldf/segments/pull/49?email_source=notifications&email_token=AABG4OPD4NTQHZHITWEDETDQU2LVZA5CNFSM4JO47LRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEE2NK4A#issuecomment-557110640, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABG4OLKBNYS63C64XNDDSDQU2LVZANCNFSM4JO47LRA .
Similar to #46:
Will throw an IndexError. This change will skip trying to combine
"ʲ"
with an emptyresult
list by checking ifgraphemes
is longer than one character.