Closed marcriera closed 2 years ago
So the issue here is that I don't think there's any code currently that makes a distinction between + as a normal character and + as a compound separator (same issue in https://github.com/apertium/apertium/issues/171).
The solution should probably be to only treat + as a compound separator if the preceding symbol is a tag.
That doesn't cover the case where the second element of a compound has a lemma beginning with +, in which case ... please don't.
On second thought, a lemma beginning with a + would be ok, because then the compound would have ++.
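To make the proposed rule concrete, here is a minimal string-level sketch of "treat + as a compound separator only if the preceding symbol is a tag". It is only an illustration: `split_compound` is a hypothetical helper, not lttoolbox code (which works on transducer symbols rather than strings), and it assumes tags are written as `<...>` and ignores escaping.

```python
def split_compound(analysis: str) -> list[str]:
    """Split an analysis on '+' only where the '+' directly follows a <tag>.

    Hypothetical illustration of the rule discussed in this thread;
    it does not handle escaped characters.
    """
    parts = []
    current = []
    prev_was_tag = False  # True when the last symbol consumed was a <tag>
    i = 0
    while i < len(analysis):
        ch = analysis[i]
        if ch == "<":
            end = analysis.find(">", i)
            if end == -1:
                # Unterminated tag: keep the rest as literal text.
                current.append(analysis[i:])
                break
            current.append(analysis[i:end + 1])
            prev_was_tag = True
            i = end + 1
        elif ch == "+" and prev_was_tag:
            # '+' right after a tag: compound separator.
            parts.append("".join(current))
            current = []
            prev_was_tag = False
            i += 1
        else:
            # Any other character, including '+' inside a lemma.
            current.append(ch)
            prev_was_tag = False
            i += 1
    parts.append("".join(current))
    return parts


print(split_compound("I+D<n><acr><f><sg>"))
print(split_compound("foo<n><sg>+bar<n><pl>"))
print(split_compound("foo<n>++bar<n>"))
```

Under these assumptions, the + in I+D stays part of the lemma, while in a ++ sequence the first + is read as the separator and the second as the start of the next lemma.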
Or, in this specific case, we can do the even dumber thing: when we see a +, run the code paths for both word boundaries and normal symbols, which appears to work fine (no regressions on oci-fra).
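As a rough picture of what "run both code paths" means, the sketch below enumerates every reading of a string in which each + is taken either as a normal symbol or as a word boundary. This is a loose analogy only: `candidate_readings` is a hypothetical helper on plain strings, whereas the actual change lives in the trimming code and follows both paths in the transducer.

```python
from itertools import product


def candidate_readings(s: str) -> list[list[str]]:
    """Return every reading of s, treating each '+' as literal or as a boundary.

    Hypothetical illustration; not the lttoolbox implementation.
    """
    pieces = s.split("+")
    results = []
    # One boolean per '+': False = normal symbol, True = word boundary.
    for choice in product((False, True), repeat=len(pieces) - 1):
        words = [pieces[0]]
        for piece, is_boundary in zip(pieces[1:], choice):
            if is_boundary:
                words.append(piece)        # start a new word
            else:
                words[-1] += "+" + piece   # keep '+' inside the current word
        results.append(words)
    return results


print(candidate_readings("I+D"))
```

For I+D this yields both the single-word reading and the two-word reading, so an entry that only matches the literal + is no longer lost.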
It works now without issues. Thanks for the quick fix!
lt-trim seems to trim valid analyses containing +. For the text I+D, spa.automorf.bin correctly returns ^I+D/I+D<n><acr><f><sg>$. However, spa-cat.automorf.bin doesn't return any analysis despite the bidix containing a valid entry.
I've tried expanding the pair analysis transducer and the entry seems to be missing. Running the pair pipeline with spa.automorf.bin instead of the trimmed one results in a valid translation.