langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

ModuleNotFoundError: No module named 'rapidfuzz' #12237

Open jamesbraza opened 8 months ago

jamesbraza commented 8 months ago

System Info

https://github.com/langchain-ai/langchain/tree/d2cb95c39d5569019ab3c6aa368aa937d8dcc465 (just above v0.0.322)

MacBook Pro, M1 chip, macOS Ventura 13.5.2

Who can help?

No response

Reproduction

With Python 3.10.11:

python -m venv venv
source venv/bin/activate
pip install poetry
poetry install --with test
make test

This gets:

poetry run pytest --disable-socket --allow-unix-socket tests/unit_tests/
Traceback (most recent call last):
  File "/Users/james.braza/code/langchain/venv/bin/poetry", line 5, in <module>
    from poetry.console.application import main
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/poetry/console/application.py", line 11, in <module>
    from cleo.application import Application as BaseApplication
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/application.py", line 12, in <module>
    from cleo.commands.completions_command import CompletionsCommand
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/commands/completions_command.py", line 10, in <module>
    from cleo import helpers
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/helpers.py", line 5, in <module>
    from cleo.io.inputs.argument import Argument
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/io/inputs/argument.py", line 5, in <module>
    from cleo.exceptions import CleoLogicError
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/exceptions/__init__.py", line 3, in <module>
    from cleo._utils import find_similar_names
  File "/Users/james.braza/code/langchain/venv/lib/python3.10/site-packages/cleo/_utils.py", line 8, in <module>
    from rapidfuzz.distance import Levenshtein
ModuleNotFoundError: No module named 'rapidfuzz'

Expected behavior

I expected the installation to install everything necessary to run the tests.
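
For anyone debugging this, a quick check is to ask the interpreter that runs poetry which of the modules named in the traceback it can actually import. This is a minimal sketch; the helper name `missing_modules` is hypothetical and is not part of poetry, cleo, or LangChain.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level modules from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The traceback shows cleo (imported by poetry) importing rapidfuzz,
    # so these are the three modules worth checking.
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(["poetry", "cleo", "rapidfuzz"]))
```

Run it with the same interpreter as the failing command (here, the one inside `venv/bin`). If `rapidfuzz` is reported missing while `poetry` and `cleo` are present, the environment that hosts poetry is the one that lost the package.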

dosubot[bot] commented 8 months ago

🤖

Hello,

The error you're encountering is due to the missing 'rapidfuzz' module, which is a dependency for LangChain. This library is used for string distance evaluations in LangChain.

To resolve this issue, you need to install the RapidFuzz library. You can do this by running the following command in your terminal:

pip install rapidfuzz

After installing RapidFuzz, you should be able to run your tests without encountering the "ModuleNotFoundError".

If you continue to face any issues, please let me know.

Best, Dosu


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

jamesbraza commented 8 months ago

Thank you @dosubot, yes I can manually install rapidfuzz, but the point is that langchain should have installed everything right off the bat when I ran poetry install --with test

dosubot[bot] commented 8 months ago

🤖

Hello,

Thank you for reaching out and providing detailed information about your issue. The reason you're encountering a ModuleNotFoundError for 'rapidfuzz' is because it's not listed in the 'pyproject.toml' file of the 'langchain-experimental' library. The poetry install command installs the dependencies listed in the 'pyproject.toml' file. If 'rapidfuzz' is not included in this file, it won't be installed.

To ensure all dependencies are installed automatically in the future, you need to make sure they are included in the 'pyproject.toml' file under the appropriate group. If 'rapidfuzz' is a main dependency of your project, it should be listed under the [tool.poetry.dependencies] section. If it's a development, testing, or documentation dependency, it should be listed under the corresponding dependency group, like [tool.poetry.group.dev.dependencies], [tool.poetry.group.test.dependencies], or [tool.poetry.group.docs.dependencies] (older Poetry versions used [tool.poetry.dev-dependencies] for development dependencies).

Here is an example of how your 'pyproject.toml' file should look if 'rapidfuzz' is a main dependency:

[tool.poetry]
name = "langchain-experimental"
version = "0.0.34"
description = "Building applications with LLMs through composability"

[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.308"
rapidfuzz = "^1.4.1"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

After updating your 'pyproject.toml' file, the 'rapidfuzz' module should be installed automatically when you run the poetry install command.

In the context shared, the Dockerfile uses the poetry install --no-interaction --no-ansi --only main command. This command only installs the main dependencies listed in the 'pyproject.toml' file. If 'rapidfuzz' is a development, testing, or documentation dependency, it won't be installed with this command. You need to run poetry install --no-interaction --no-ansi without the --only main option to install all dependencies.

Regarding your question about 'rapidfuzz' being a necessary dependency for the LangChain framework, it is indeed a necessary dependency, especially if you are using the string distance evaluation feature. The 'rapidfuzz' library is used in the StringDistanceEvalChain and PairwiseStringDistanceEvalChain classes to compute string distances between predictions and references. If the 'rapidfuzz' library is not installed, an ImportError will be raised when these classes are used. Therefore, while 'rapidfuzz' is marked as optional in the project's pyproject.toml file, it is required for these specific features.

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu

jamesbraza commented 8 months ago

I think it's because:

So there may be some dependency mismatching

dlqqq commented 7 months ago

Just ran into this issue myself. Why does poetry not raise a warning here?

kzmain commented 7 months ago

> Just ran into this issue myself. Why does poetry not raise a warning here?

I fixed this issue by running:

pip3 install build pyproject-hooks requests-toolbelt
pip3 install --upgrade rapidfuzz filelock msgpack

khaledadrani commented 7 months ago

I faced the same issue. The mistake was installing both poetry and langchain in the same virtual environment.

First, you have to install poetry outside of your langchain environment, using pipx:

pip install pipx # it is ok to install this globally
pipx install poetry

Afterwards, poetry will be accessible from within any virtual environment. This also ensures that the dependencies of langchain and of poetry do not get mixed. You can then use poetry within your virtual environment.

I hope this helps.
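
To check whether an environment has the setup described above (poetry installed inside the project's own virtual environment), something like the following can be run with the project's interpreter. It is only a sketch: the function names are hypothetical, and it relies on the standard `sys.prefix` versus `sys.base_prefix` comparison to detect a venv.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def poetry_shares_this_env():
    """True when the poetry package is importable from inside an active venv."""
    return in_virtualenv() and importlib.util.find_spec("poetry") is not None


if __name__ == "__main__":
    if poetry_shares_this_env():
        print("poetry is installed in this venv; its dependencies can clash with the project's")
    else:
        print("no poetry package found inside an active venv")
```

With a pipx-managed poetry, the project's venv should report the second message, because poetry then lives in its own isolated environment.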

jamesbraza commented 7 months ago

Many other repos using poetry don't have this issue. I appreciate all the workarounds proposed, but the dependency resolution needs to be fixed in LangChain's requirements.

dosubot[bot] commented 7 months ago

Thank you for bringing this to our attention. It seems like there is indeed a conflict between the 'rapidfuzz' versions required by 'cleo' and 'langchain'.

As you've mentioned, 'cleo==2.0.1' requires 'rapidfuzz<3' while 'langchain==0.0.322' requires 'rapidfuzz>=3'. This is a common issue in Python dependency management and can be resolved in a few ways.

One possible solution is to use a virtual environment for each project to isolate their dependencies. However, since you're already using a virtual environment, this solution may not be applicable.

Another solution is to update the 'cleo' package to a version that is compatible with 'rapidfuzz>=3'. However, this may not be possible if 'cleo' has not released such a version.

The most straightforward solution would be to update the 'langchain' package to be compatible with 'rapidfuzz<3'. This would involve changing the 'rapidfuzz' requirement in the 'langchain' package's 'pyproject.toml' file from '^3.1.1' to '<3'. However, this may not be possible if 'langchain' relies on features only available in 'rapidfuzz>=3'.

Given the information you've provided, I would recommend trying to update the 'cleo' package first. If that's not possible, then the 'langchain' package should be updated. If neither of these solutions work, then you may need to consider using a different package that does not have this conflict.

I hope this helps! If you have any further questions, please don't hesitate to ask.
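
The conflict described above can be illustrated with plain tuple comparisons. This is a sketch under stated assumptions: the two constraints are taken from this thread rather than from verified package metadata, the candidate version strings are samples chosen for illustration, and the helper names are hypothetical.

```python
def parse(version):
    """Turn '3.1.1' into (3, 1, 1) for tuple comparison (release segments only)."""
    return tuple(int(part) for part in version.split("."))


# Constraints as stated in this thread; assumptions, not verified metadata.
def cleo_allows(version):
    return parse(version) < (3,)  # rapidfuzz<3


def langchain_allows(version):
    return (3, 1, 1) <= parse(version) < (4,)  # ^3.1.1 means >=3.1.1,<4.0.0


# Sample version strings for illustration only.
candidates = ["2.15.1", "3.0.0", "3.1.1", "3.4.0"]
shared = [v for v in candidates if cleo_allows(v) and langchain_allows(v)]
print(shared)  # prints []
```

No version can satisfy both ranges at once, so a single environment can hold a rapidfuzz that works for only one of the two packages. That is consistent with the suggestion earlier in the thread to keep poetry out of the project's virtual environment.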

itallominatti commented 5 months ago

In my case, my poetry had been randomly uninstalled. I solved it by reinstalling poetry.

JanethL commented 3 months ago

I had to install rapidfuzz and requests_toolbelt

pip install rapidfuzz
pip install requests_toolbelt

cotarelorodrigo commented 2 months ago

> I had to install rapidfuzz and requests_toolbelt
>
> pip install rapidfuzz
> pip install requests_toolbelt

+1