DeNederlandscheBank / name_matching

ValueError: not found in any keyboard layouts #12

Closed by ZorkJ 1 year ago

ZorkJ commented 1 year ago

[image: screenshot of the error]

I am wondering what could cause this ValueError.

mnijhuis-dnb commented 1 year ago

The likely reason is that one of the tokens in the two names you are comparing contains a character that is not found in any keyboard layout that the typo distance uses. The typo distance is based on the distance on a keyboard layout between the characters of the two strings. On a QWERTY layout, for instance, the distance between 'error' and 'errot' is 1, because the t is just one key away from the r. To avoid this error, you can either use a different distance metric instead of the typo metric, or set the option to convert all characters to ASCII.
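To illustrate the idea, here is a minimal, standalone sketch. It is not the library's implementation and does not use the name_matching API. The toy layout, `key_distance`, and `to_ascii` are hypothetical names made up for this example, and the grid distance is a simplification of a real keyboard. It shows how a character outside the layout triggers a ValueError, and how converting to ASCII first avoids it.

```python
import unicodedata

# Toy QWERTY layout (assumption for illustration): letter rows, left to right.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each character to its (row, column) position on the toy layout.
KEY_POSITIONS = {
    char: (row, col)
    for row, keys in enumerate(QWERTY_ROWS)
    for col, char in enumerate(keys)
}


def key_distance(a: str, b: str) -> int:
    """Grid (Chebyshev) distance between two keys on the toy layout.

    Raises ValueError when a character is not on the layout, which mirrors
    the kind of failure described above.
    """
    for char in (a, b):
        if char not in KEY_POSITIONS:
            raise ValueError(f"{char!r} not found in the toy keyboard layout")
    (r1, c1), (r2, c2) = KEY_POSITIONS[a], KEY_POSITIONS[b]
    return max(abs(r1 - r2), abs(c1 - c2))


def to_ascii(text: str) -> str:
    """Decompose accented characters and drop anything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(key_distance("r", "t"))  # neighbouring keys on the top row
    try:
        key_distance("\u00e9", "e")  # 'e' with an acute accent is not on the layout
    except ValueError as exc:
        print("ValueError:", exc)
    print(key_distance(to_ascii("\u00e9"), "e"))  # works after ASCII conversion
```

Note that this kind of ASCII conversion silently drops characters that have no ASCII decomposition (for example 'ß'), so names in non-Latin scripts may end up empty. In that case a different distance metric is probably the better choice.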