adriantich / DnoisE

Distance denoise by Entropy
GNU General Public License v3.0

installation with install.sh - ModuleNotFoundError: No module named 'rapidfuzz.distance.metrics_py' #24

Open morien opened 1 year ago

morien commented 1 year ago

I'm having some trouble getting DnoisE running on my macOS machine. I have rapidfuzz installed via pip:

$ python --version
Python 3.10.8
$ pip --version
pip 22.3.1 from /Users/__________/miniconda3/lib/python3.10/site-packages/pip (python 3.10)
$ pip list
Package                 Version
----------------------- ----------
boltons                 23.0.0
Brotli                  1.0.9
brotlipy                0.7.0
certifi                 2023.5.7
cffi                    1.15.1
charset-normalizer      2.1.1
colorama                0.4.6
conda                   23.5.0
conda-content-trust     0.1.3
conda-package-handling  2.0.2
conda_package_streaming 0.7.0
cryptography            39.0.0
cutadapt                4.2
dnaio                   0.10.0
DnoisE                  1.3
idna                    3.4
isal                    1.1.0
jsonpatch               1.32
jsonpointer             2.0
**Levenshtein             0.21.0**
mutagen                 1.46.0
**Nuitka                  1.6**
numpy                   1.23.5
ordered-set             4.1.0
packaging               23.1
pandas                  2.0.2
pip                     22.3.1
pluggy                  1.0.0
pycosat                 0.6.4
pycparser               2.21
pycryptodomex           3.18.0
pyOpenSSL               23.0.0
PySocks                 1.7.1
python-dateutil         2.8.2
pytz                    2023.3
**rapidfuzz               3.0.0**
requests                2.28.2
ruamel.yaml             0.17.21
ruamel.yaml.clib        0.2.7
setuptools              66.0.0
six                     1.16.0
toolz                   0.12.0
tqdm                    4.64.1
tzdata                  2023.3
urllib3                 1.26.14
websockets              11.0.3
wheel                   0.38.4
xopen                   1.7.0
zstandard               0.19.0

And the path for rapidfuzz is in sys.path, so it should be detected. I'm not sure what is wrong here. Any help is appreciated, thanks.

Here's the full error that's displayed when invoking DnoisE after compilation:

$ ./DnoisE.bin -h
Traceback (most recent call last):
  File "/Users/__________/projects/programs/DnoisE/src/DnoisE.py", line 12, in <module>
    from denoise_functions import *
  File "/Users/__________/projects/programs/DnoisE/src/denoise_functions.py", line 15, in <module denoise_functions>
    import Levenshtein as lv
  File "/Users/__________/miniconda3/lib/python3.10/site-packages/Levenshtein-0.21.0-py3.10-macosx-10.9-x86_64.egg/Levenshtein/__init__.py", line 21, in <module Levenshtein>
    import rapidfuzz.distance.Levenshtein as _Levenshtein
  File "/Users/__________/miniconda3/lib/python3.10/site-packages/rapidfuzz-3.0.0-py3.10-macosx-10.9-x86_64.egg/rapidfuzz/__init__.py", line 10, in <module rapidfuzz>
    from rapidfuzz import distance, fuzz, process, utils
  File "/Users/__________/miniconda3/lib/python3.10/site-packages/rapidfuzz-3.0.0-py3.10-macosx-10.9-x86_64.egg/rapidfuzz/distance/__init__.py", line 6, in <module rapidfuzz.distance>
    from . import (
  File "/Users/__________/miniconda3/lib/python3.10/site-packages/rapidfuzz-3.0.0-py3.10-macosx-10.9-x86_64.egg/rapidfuzz/distance/OSA.py", line 8, in <module rapidfuzz.distance.OSA>
    distance = _fallback_import(_mod, "osa_distance")
  File "/Users/__________/miniconda3/lib/python3.10/site-packages/rapidfuzz-3.0.0-py3.10-macosx-10.9-x86_64.egg/rapidfuzz/_utils.py", line 101, in fallback_import
    py_mod = importlib.import_module(module + "_py")
  File "/Users/__________/miniconda3/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'rapidfuzz.distance.metrics_py'
adriantich commented 1 year ago

Can you try installing rapidfuzz again? It seems it can't find the metrics_py.py script, which should be in "/Users/__/miniconda3/lib/python3.10/site-packages/rapidfuzz-3.0.0-py3.10-macosx-10.9-x86_64.egg/rapidfuzz/distance/". I've checked the rapidfuzz GitHub repo and the file is there. Repo -> https://github.com/maxbachmann/RapidFuzz metrics_py.py -> https://github.com/maxbachmann/RapidFuzz/blob/main/src/rapidfuzz/distance/metrics_py.py

morien commented 1 year ago

I checked: the metrics_py.py file is where it should be, and force-reinstalling rapidfuzz with pip didn't change the behaviour.
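
A file being present on disk and a module being importable are two separate checks. The following is a minimal sketch for the second one, assuming it is run with the same `python` that pip installed into; `locate_module` is a hypothetical helper name, not part of DnoisE or rapidfuzz. It only reports what the regular interpreter resolves, so it cannot say what the compiled DnoisE.bin sees.

```python
import importlib.util


def locate_module(name):
    """Return the file a module would be loaded from, or None if it cannot be resolved."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or fails to import.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for name in ("Levenshtein", "rapidfuzz", "rapidfuzz.distance.metrics_py"):
        print(f"{name}: {locate_module(name)}")
```

If all three names resolve here while the binary still fails, that would suggest the problem is on the compiled side rather than in the pip installation.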

adriantich commented 1 year ago

The problem seems to come from the Levenshtein package, which calls this other package, rapidfuzz. However, could you try installing importlib again?

pip install importlib

Let's see if this works.

A

morien commented 1 year ago

Looks like importlib wasn't already installed, but installing it doesn't change the behaviour.

morien commented 1 year ago

I'm leaving a report of how I worked around this issue:

I ended up moving back to a CentOS machine that I had trouble with before trying on macOS. The error was very similar, and still related to Levenshtein:

# ./DnoisE.bin -h
Traceback (most recent call last):
  File "/home/________/programs/DnoisE/src/DnoisE.py", line 12, in <module>
    from denoise_functions import *
  File "/home/________/programs/DnoisE/src/denoise_functions.py", line 15, in <module denoise_functions>
    import Levenshtein as lv
ModuleNotFoundError: No module named 'Levenshtein'

I'd installed Levenshtein already in my userspace using pip, and my $PATH only pointed to the install of python in my userspace as well.

# which python
~/mambaforge/bin/python
# which python3
~/mambaforge/bin/python3

To resolve the issue, I had to install the Levenshtein package globally, using the operating system package manager (i.e. yum install python3-Levenshtein). That seems to have worked, for some reason.
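
Since the user-space pip install was not picked up but the system-wide one was, it may help to compare what each Python installation can see. Below is a minimal sketch, meant to be run once with `~/mambaforge/bin/python` and once with the system `python3`; `describe_environment` is a hypothetical helper name. Whether the compiled binary actually relies on the system Python is an assumption here, not something confirmed in this thread.

```python
import sys
from importlib import metadata


def describe_environment(dist_name):
    """Collect interpreter details and the installed version of one distribution."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # not installed for this interpreter
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "distribution": dist_name,
        "version": version,
    }


if __name__ == "__main__":
    for dist in ("Levenshtein", "rapidfuzz"):
        print(describe_environment(dist))
```

A `version` of `None` under one interpreter and a real version under the other would show that the two installations do not share the package.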

I tried to use homebrew on macOS to accomplish the same, but unfortunately, I didn't have any luck.

bioinfo-arctic commented 1 year ago

Hey guys,

I'm having the same issue here on Ubuntu 22.04.2.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy 

$ python --version
Python 3.10.6

$ pip --version
pip 22.0.2 from /home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/pip (python 3.10)

$ pip list
Package            Version
------------------ ---------
Cython             0.29.35
DnoisE             1.3
importlib          1.0.4
Levenshtein        0.21.1
Nuitka             1.6.3
numpy              1.25.0rc1
OBITools3          3.0.1b24
ordered-set        4.1.0
pandas             2.0.2
pip                22.0.2
python-dateutil    2.8.2
python-Levenshtein 0.21.1
pytz               2023.3
rapidfuzz          3.1.1
setuptools         59.6.0
six                1.16.0
tqdm               4.65.0
tzdata             2023.3
zstandard          0.21.0
Traceback (most recent call last):
  File "/home/dmo/bioinformatic_tools/Dnoise_installation/DnoisE/src/DnoisE.py", line 12, in <module>
    from denoise_functions import *
  File "/home/dmo/bioinformatic_tools/Dnoise_installation/DnoisE/src/denoise_functions.py", line 15, in <module denoise_functions>
    import Levenshtein as lv
  File "/home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/Levenshtein-0.21.1-py3.10-linux-x86_64.egg/Levenshtein/__init__.py", line 21, in <module Levenshtein>
    import rapidfuzz.distance.Levenshtein as _Levenshtein
  File "/home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/rapidfuzz/__init__.py", line 10, in <module rapidfuzz>
    from rapidfuzz import distance, fuzz, process, utils
  File "/home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/rapidfuzz/distance/__init__.py", line 6, in <module rapidfuzz.distance>
    from . import (
  File "/home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/rapidfuzz/distance/OSA.py", line 8, in <module rapidfuzz.distance.OSA>
    distance = _fallback_import(_mod, "osa_distance")
  File "/home/dmo/bioinformatic_tools/obitools3_installation/obitools3/obi3-env/lib/python3.10/site-packages/rapidfuzz/_utils.py", line 101, in fallback_import
    py_mod = importlib.import_module(module + "_py")
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'rapidfuzz.distance.metrics_py'

I have tried to fix this using the conda installation, which also keeps crashing, but then I ran:

python3 ./src/DnoisE.py -h

And the help message came out clean. While if I run:

./src/DnoisE.bin -h

I get the error message about rapidfuzz.distance.metrics_py.

The two questions now are: 1) What is the difference between DnoisE.bin and DnoisE.py? 2) What does it mean that the two behave differently? Is DnoisE.py enough to run the algorithm, even though DnoisE.bin throws an error?

Cheers, Daniel

adriantich commented 1 year ago

Thank you all for the comments and reports. I'm on vacation these weeks and can't work much on this, and it will be difficult to spend much time on it this summer, but for now:

It all seems to be a problem with the Levenshtein package. I'll try to make a version of DnoisE with updated code, and I'll try to build a distributable binary to improve running performance. However, DnoisE.py runs exactly the same algorithm; the only difference would be a small increase in running time, and I don't think it would be much.
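
Until a working binary is available, the fallback to DnoisE.py can be scripted. This is only a sketch under stated assumptions: `choose_command` is a hypothetical helper, the paths are the ones quoted earlier in this issue, and it assumes that a working binary exits with status 0 on `-h`.

```python
import subprocess
import sys


def choose_command(binary, script, args):
    """Return the argv to run: the binary if `binary -h` succeeds, else the script."""
    try:
        probe = subprocess.run([binary, "-h"], capture_output=True, timeout=60)
        binary_ok = probe.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        # Binary is missing, not executable, or hangs.
        binary_ok = False
    if binary_ok:
        return [binary, *args]
    # Fall back to running the plain script with the current interpreter.
    return [sys.executable, script, *args]
```

For example, `subprocess.run(choose_command("./src/DnoisE.bin", "./src/DnoisE.py", sys.argv[1:]), check=True)` would run the binary when it starts cleanly and the script otherwise.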

I'll keep you posted on any updates in this issue. This way, I guess GitHub will send you an email!

Thanks for your patience.

Cheers! Adri

bioinfo-arctic commented 1 year ago

Dear Adrià,

Many thanks for your reply. Please enjoy your holidays and don't think about DnoisE, haha.

I will be happy to discuss it further when you are back.

Best wishes, Daniel


Cedar-Mac commented 1 year ago

I also ran into the same issue on macOS:

ModuleNotFoundError: No module named 'rapidfuzz.distance.metrics_py'

But again only for the binary executable. No rush on my end, just wanted to add info from another machine, and can provide any specific info you might need.