jamesturk / jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.
https://jamesturk.github.io/jellyfish/
MIT License
2.07k stars, 160 forks

C vs python versions - which one is running? #95

Closed: citynorman closed this issue 6 years ago

citynorman commented 6 years ago

How do I know whether the C or pure python versions are running? Can I force use of the C version?

timeit.timeit("jellyfish.levenshtein_distance('LP4h6S3FEVffXKK','vYUEQ7hcnau3L8T')",setup="import jellyfish",number=100)
Out[18]: 0.025986038757366714

timeit.timeit("jellyfish._jellyfish.levenshtein_distance('LP4h6S3FEVffXKK','vYUEQ7hcnau3L8T')",setup="import jellyfish._jellyfish",number=100)
Out[19]: 0.025776229690961827

import jellyfish.cjellyfish
ModuleNotFoundError: No module named 'jellyfish.cjellyfish'

The two timings are nearly identical and cjellyfish can't be imported, so it seems I'm using the pure Python version?

Windows: Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:25:24) [MSC v. 1900 64 bit (AMD64)] on win32

Linux: Python 3.6.3 |Anaconda, Inc.| (default, Nov 20 2017, 20:41:42) [GCC 7.2.0] on linux

pip install jellyfish => jellyfish 0.5.6

citynorman commented 6 years ago

It seems to work when I run conda install -c conda-forge jellyfish. How do I force compilation of cjellyfish? pip isn't compiling it on either Windows or Linux.

>>> timeit.timeit("jellyfish.levenshtein_distance('LP4h6S3FEVffXKK','vYUEQ7hcnau3L8T')",setup="import jellyfish",number=100)
0.00011578999999528605
>>> timeit.timeit("jellyfish._jellyfish.levenshtein_distance('LP4h6S3FEVffXKK','vYUEQ7hcnau3L8T')",setup="import jellyfish._jellyfish",number=100)
0.016537937000009606
>>> timeit.timeit("jellyfish.cjellyfish.levenshtein_distance('LP4h6S3FEVffXKK','vYUEQ7hcnau3L8T')",setup="import jellyfish._jellyfish",number=100)
0.0002054469999848152

I've read the docs, but as a CPython n00b I'm not sure what this means: "On a typical CPython install the C implementation will be used. The Python versions are available for PyPy and systems where compiling the CPython extension is not possible."
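For readers wondering what that docs passage implies in practice: a package that ships both a compiled and a pure Python implementation commonly picks one at import time with a try/except. The following is only a minimal sketch of that general pattern, not jellyfish's actual source. The module name `_fast_levenshtein_hypothetical` and the `IMPLEMENTATION` variable are made up for illustration.

```python
# Sketch of the "compiled first, pure Python as fallback" import pattern.
# `_fast_levenshtein_hypothetical` is a made-up module name standing in for
# a compiled extension; it is NOT part of jellyfish.
try:
    from _fast_levenshtein_hypothetical import levenshtein_distance
    IMPLEMENTATION = "C"
except ImportError:
    # The compiled module is missing (e.g. no compiler or headers at install
    # time), so define an equivalent pure Python function instead.
    IMPLEMENTATION = "Python"

    def levenshtein_distance(s1, s2):
        """Edit distance computed row by row in pure Python."""
        if len(s1) < len(s2):
            s1, s2 = s2, s1  # keep the shorter string as the row
        previous = list(range(len(s2) + 1))
        for i, c1 in enumerate(s1, start=1):
            current = [i]
            for j, c2 in enumerate(s2, start=1):
                cost = 0 if c1 == c2 else 1
                current.append(min(previous[j] + 1,          # deletion
                                   current[j - 1] + 1,       # insertion
                                   previous[j - 1] + cost))  # substitution
            previous = current
        return previous[-1]


print(IMPLEMENTATION, levenshtein_distance("kitten", "sitting"))
```

With this pattern the caller always uses the same function name, and only the speed differs depending on which branch was taken.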

jamesturk commented 6 years ago

I think this is a conda issue. If you use a regular Python install (on Linux with python-dev installed), it will compile cjellyfish. If it can't compile for any reason (missing compiler, missing headers, etc.), it will fall back to pyjellyfish.

Hope that helps.
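To check which version is actually in use, one option is to test whether the compiled submodule can be imported, as the sessions above do by hand. This is a rough sketch that assumes the 0.5.x layout seen in this thread (`jellyfish.cjellyfish` for C, `jellyfish._jellyfish` for pure Python). The helper name `detect_backend` is hypothetical, and other releases may be laid out differently, in which case it reports "unknown".

```python
import importlib
import importlib.util


def detect_backend(package="jellyfish"):
    """Guess which implementation a package provides.

    Assumes the submodule names seen in this thread: `cjellyfish`
    (compiled) and `_jellyfish` (pure Python).
    """
    try:
        if importlib.util.find_spec(package) is None:
            return "not installed"
    except (ImportError, ValueError):
        return "not installed"

    try:
        importlib.import_module(package + ".cjellyfish")
        return "C"
    except ImportError:
        pass  # compiled extension missing; look for the fallback

    try:
        importlib.import_module(package + "._jellyfish")
        return "Python"
    except ImportError:
        return "unknown"


print(detect_backend())
```

If this prints "Python" after a pip install, the extension most likely failed to build at install time for one of the reasons mentioned above (missing compiler or headers).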