lvphj / epydemiology

Python code for epidemiologists – eventually
MIT License

Checking UK postcodes and calculating Damerau-Levenshtein distance fails with pyxDamerauLevenshtein-1.7.0 installed #44

Closed lvphj closed 3 years ago

lvphj commented 3 years ago

Calling the epy.phjCleanUKPostcodeVariable() function with the parameter phjCheckByOption = 'dictionary' fails when pyxDamerauLevenshtein-1.7.0 is installed:

AttributeError: ("module 'pyxdameraulevenshtein' has no attribute 'damerau_levenshtein_distance_ndarray'", 'occurred at index 15')

Previous versions of pyxDamerauLevenshtein worked correctly.

lvphj commented 3 years ago

This issue arose from the update to the pyxDamerauLevenshtein 1.7.0 library. It is described on the pyxDamerauLevenshtein GitHub Issues page as "ImportError: cannot import name 'damerau_levenshtein_distance_ndarray'" (issue 30).

The workaround and solution are documented in the pyxDamerauLevenshtein changelog entry for 1.7.0 (2021-02-09).

lvphj commented 3 years ago

Code added to call the correct function depending on the installed version of the library:

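import pkg_resources

# Parse versions before comparing; as plain strings, '1.10.0' would sort before '1.7.0'.
installed = pkg_resources.parse_version(pkg_resources.get_distribution("pyxdameraulevenshtein").version)
if installed < pkg_resources.parse_version('1.7.0'):
    df['new_col'] = pyxdl.damerau_levenshtein_distance_ndarray()  # arguments omitted
else:
    df['new_col'] = pyxdl.damerau_levenshtein_distance_seqs()  # arguments omitted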
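
A note on comparing version numbers: when versions are compared as plain strings, the comparison is character by character, so a later release such as '1.10.0' would be treated as older than '1.7.0'. The sketch below shows the difference using a hypothetical helper, version_tuple, which assumes purely numeric dotted versions (no suffixes such as 'rc1').

```python
def version_tuple(version):
    """Convert a dotted numeric version such as '1.7.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# String comparison: '1.10.0' sorts before '1.7.0' because '1' < '7'.
print("1.10.0" < "1.7.0")  # True

# Numeric comparison: (1, 10, 0) is correctly treated as later than (1, 7, 0).
print(version_tuple("1.10.0") < version_tuple("1.7.0"))  # False
```

For real-world version strings, a dedicated parser such as pkg_resources.parse_version is likely a safer choice than this helper.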