Closed ClementChouteau closed 3 years ago
Thanks for spotting this.
@guillaumekln There is a test, `small_sentence_matches`, where we check a single-token match. The caller sets `match_length` to 1. `min_exact_match` is always less than or equal to 1 when `p_length == 1`, see:
unsigned compute_min_exact_match(float fuzzy, unsigned p_length)
{
  const auto differences = (unsigned)std::ceil(p_length * (1.f - fuzzy));
  // we split (p_length - differences) in (differences + 1) parts
  // the minimum value of the largest part size is obtained by dividing and taking ceil
  return std::ceil((p_length - differences) / (differences + 1.));
}
`_min_seq_len` is at least 1.
Therefore we never take the "lazy injection feature" early return, and the behavior is the same.
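As a quick sanity check of the bound claimed above, here is a minimal standalone sketch. It copies `compute_min_exact_match` from the snippet quoted in this comment; `check_single_token_bound` is a hypothetical helper written only for this illustration, not something that exists in the repository, and the fuzzy values it samples are arbitrary.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>

// Copied from the snippet quoted above (explicit cast added on the return).
unsigned compute_min_exact_match(float fuzzy, unsigned p_length)
{
  const auto differences = (unsigned)std::ceil(p_length * (1.f - fuzzy));
  // split (p_length - differences) into (differences + 1) parts and
  // take the ceiling of the average part size
  return (unsigned)std::ceil((p_length - differences) / (differences + 1.));
}

// Hypothetical helper, for illustration only: asserts that a single-token
// pattern never requires an exact match longer than 1, for a few sample
// fuzzy thresholds in [0, 1].
void check_single_token_bound()
{
  for (float fuzzy : {0.f, 0.25f, 0.5f, 0.8f, 0.99f, 1.f})
    assert(compute_min_exact_match(fuzzy, 1) <= 1);
}
```

With `p_length == 1`, any `fuzzy` below 1 gives `differences == 1` and a result of 0, while `fuzzy == 1` gives `differences == 0` and a result of 1, so the value never exceeds 1.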
When the pattern size is 1, the previous code bypassed these length checks:
https://github.com/SYSTRAN/fuzzy-match/blob/f5c48febe6969806ba620d15fa14affb6702bb2e/src/fuzzy_match.cc#L465-L477
Do we have a test for this case to make sure the behavior is unchanged?