Open lazzarello opened 2 years ago
Hi @lazzarello, the fuzzy matching logic still sees enough similarity between them to include it in the results. You are right that the underscore character is treated differently: internally we convert all spaces into underscores. Maybe we should switch from the underscore to a rarely used Unicode character for that purpose.
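To illustrate the collision described above, here is a minimal sketch. It is not the library's actual code: `normalize` and `PRIVATE_USE_SEPARATOR` are hypothetical names, and the only behavior taken from this thread is that spaces are replaced with underscores internally.

```python
# Hypothetical stand-in for the internal space-to-separator conversion.
# Assumption: normalization only replaces spaces; the real code may do more.
PRIVATE_USE_SEPARATOR = "\ue000"  # a Unicode private-use code point, rarely seen in real text


def normalize(text: str, separator: str = "_") -> str:
    """Replace every space with the given separator character."""
    return text.replace(" ", separator)


# With "_" as the separator, a typed underscore and a space become indistinguishable.
collides = normalize("iron man") == normalize("iron_man")

# With a rarely used separator, a literal underscore keeps its own identity.
still_collides = (
    normalize("iron man", PRIVATE_USE_SEPARATOR)
    == normalize("iron_man", PRIVATE_USE_SEPARATOR)
)

print(collides, still_collides)
```

If the separator were changed along these lines, a user-supplied underscore in valid_chars_for_string would no longer be conflated with a word boundary, though whether that fully resolves the matching behavior reported here is not confirmed.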
Describe the bug
Adding a special character (an underscore) to valid_chars_for_string does not exclude results that do not contain the character until there are two misses.
To Reproduce
Initialized with
Formatted output with simulated input:
Expected behavior
Just as the input 'ir' excludes 'i_lovecode', I would expect 'iron' to exclude 'ironman', and so forth. From this output, it looks like 'ironman' only starts being excluded once the input reaches 'iron_ma'.
OS, DeepDiff version and Python version (please complete the following information):
Additional context
This seems to have something to do with the max_cost parameter. If I raise it above 2, it matches even more than the unexpected results shown here.
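As a rough illustration of why a larger cost tolerance widens the result set, the sketch below filters candidates by Levenshtein edit distance. This assumes max_cost behaves like an edit-distance budget; the library's real matching (for example, how it handles prefixes) may differ, and `levenshtein` and `within_cost` are hypothetical helper names, not part of its API.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute, or free on a match
            ))
        previous = current
    return previous[-1]


def within_cost(query: str, candidates: list, max_cost: int) -> list:
    """Keep the candidates whose edit distance to the query fits the budget."""
    return [word for word in candidates if levenshtein(query, word) <= max_cost]


words = ["ironman", "iron_man", "i_lovecode"]

# A budget of 0 demands an exact match; a budget of 1 already forgives
# one missing or extra underscore.
print(within_cost("iron_man", words, max_cost=0))
print(within_cost("iron_man", words, max_cost=1))
```

Under this assumption, a single underscore costs only one edit, so any budget of 1 or more lets a word without the underscore through.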