yougov / fuzzy

DMetaphone has issues with long words #5

Open jaraco opened 11 years ago

jaraco commented 11 years ago

Originally reported by: Brian (Bitbucket: eode, GitHub: eode)


import fuzzy
fdm = fuzzy.DMetaphone()
fdm10 = fuzzy.DMetaphone(10)

# note that this also trounces the 's' phoneme of 'decent'
>>> fdm('decent')
['TKNT', None]

>>> fdm('decentralization')
['TKNT', None]

>>> fdm10('decentralization')
['TKNT', None]

# ..for comparison:
import metaphone
mdm = metaphone.dm

>>> mdm('decent')
('TSNT', '')

>>> mdm('decentralization')
('TSNTRLSXN', '')
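
The comparison above could be turned into a small repeatable check. This is only a sketch: `primary_code`, `find_mismatches`, and `report_from_issue` are hypothetical helper names, not part of either library. It assumes each encoder is a callable that returns a two-element sequence whose first item is the primary code (`str`, `bytes`, or `None`), as in the outputs shown in this report.

```python
def primary_code(result):
    """Return the primary code of a Double Metaphone result as str ('' if missing)."""
    code = result[0]
    if code is None:
        return ""
    if isinstance(code, bytes):
        # Assumption: some builds may return bytes rather than str.
        return code.decode("ascii")
    return code


def find_mismatches(words, encode_a, encode_b):
    """Return (word, code_a, code_b) for every word whose primary codes differ."""
    mismatches = []
    for word in words:
        a = primary_code(encode_a(word))
        b = primary_code(encode_b(word))
        if a != b:
            mismatches.append((word, a, b))
    return mismatches


def report_from_issue():
    """Print mismatches for the words in this report (needs both packages installed)."""
    import fuzzy
    import metaphone

    words = ["decent", "decentralization", "carbohydrate"]
    for word, got, other in find_mismatches(words, fuzzy.DMetaphone(), metaphone.dm):
        print(f"{word}: fuzzy={got!r} metaphone={other!r}")
```

Calling `report_from_issue()` with both packages installed should list every word where the two implementations disagree, which makes it easy to re-run after a fix.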

Expected behavior:


jaraco commented 11 years ago

Original comment by Sam Ockman (Bitbucket: NewStart, GitHub: NewStart):


Yes, this has bitten me too...

Here's an example I ran across:

For 'carbohydrate', fuzzy gives KRPH, whereas the original library gives KRPHTRT.

It would be great to get this fixed.

Thanks!

jaraco commented 11 years ago

Original comment by Brian (Bitbucket: eode, GitHub: eode):


Edited for formatting, and added expected behavior.