langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

When I install the template "self-query-qdrant" I get this error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 132: character maps to <undefined> #14341

Closed ivan-schinella closed 6 months ago

ivan-schinella commented 10 months ago

System Info

Windows 10 Pro - 22H2 - Build 9045.3693 - Windows Feature Experience Pack 1000.19053.1000.0
Python 3.11.5
langchain-cli 0.0.19

Reproduction

  1. Create a virtual environment with python -m venv .venv and activate it
  2. Install langchain-cli with pip install -U "langchain-cli[serve]"
  3. Run langchain app new qdrant-app --package self-query-qdrant

Expected behavior

As with the other templates I have installed, I expected to find the app's .py files in the packages directory, but it is empty. The log is attached as log.txt.

dosubot[bot] commented 10 months ago

🤖

Hello @ivan-schinella,

I'm Dosu, a bot here to assist you with your LangChain issues, answer your questions, and guide you to become a contributor while we wait for a human maintainer. I'm excited to help you navigate through this. Let's get started!

The error you're encountering, UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 132: character maps to <undefined>, typically occurs when Python is trying to read a file using an encoding that doesn't support some of the characters in the file.

In your case, it seems like the "self-query-qdrant" template or one of its dependencies might contain such characters. This could be due to a variety of reasons, such as the file being saved with a different encoding than expected, or the file containing special characters that are not supported by the default encoding.
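As a rough illustration of how this kind of error arises, the sketch below writes a small UTF-8 file and then reads it back with the cp1252 codec. The file name and its contents are made up for the example; they are not taken from the template.

```python
import tempfile
from pathlib import Path

# Hypothetical file content. "Á" is U+00C1, which UTF-8 stores as bytes 0xC3 0x81.
sample = 'title = "Álvaro"\n'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.toml"
    path.write_bytes(sample.encode("utf-8"))

    # Reading with the encoding the file was written in works.
    utf8_text = path.read_text(encoding="utf-8")

    # Python's cp1252 codec has no character for byte 0x81, so decoding fails.
    try:
        path.read_text(encoding="cp1252")
        cp1252_error = None
    except UnicodeDecodeError as exc:
        cp1252_error = exc

print(cp1252_error)
```

The printed message should have the same shape as the one in the issue title, with a different byte position.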

To resolve this issue, you could try the following steps:

  1. Check the encoding of the files: You can use a text editor that allows you to see and change the encoding of files (like Notepad++ or Visual Studio Code) to check the encoding of the template files. If the files are not in UTF-8 encoding, try converting them to UTF-8 and then retry the installation.

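  2. Explicitly set the encoding in Python: If the above step doesn't work, you can try making UTF-8 the default text encoding in Python by enabling UTF-8 mode (Python 3.7 and later), which is done by setting the environment variable PYTHONUTF8 to 1. Note that PYTHONIOENCODING only changes the encoding of stdin, stdout, and stderr, not the encoding used to read files. Here's how you can do it in your command prompt:

set PYTHONUTF8=1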

Then, try to install the template again.
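Before retrying, it may help to check which encodings your Python process actually uses. The following is a small diagnostic sketch using only the standard library; the variable names are illustrative.

```python
import locale
import sys

# Encoding that open() and Path.read_text() generally fall back to
# when no encoding argument is passed.
default_file_encoding = locale.getpreferredencoding(False)

# 1 when Python's UTF-8 mode is active (for example via PYTHONUTF8=1), else 0.
utf8_mode = sys.flags.utf8_mode

# Encoding of standard output; this is the setting PYTHONIOENCODING affects.
stdout_encoding = getattr(sys.stdout, "encoding", None)

print(f"default file encoding: {default_file_encoding}")
print(f"UTF-8 mode enabled:    {bool(utf8_mode)}")
print(f"stdout encoding:       {stdout_encoding}")
```

On a typical Western European Windows setup without UTF-8 mode, the first line is likely to show cp1252, which is consistent with the 'charmap' error.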

Please note that these are general suggestions and might not work depending on the specific cause of the issue. If the problem persists, it would be helpful if you could provide the exact file and line of code that is causing the error. This would help in identifying the root cause of the issue and finding a more specific solution.

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

Best regards, Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

ivan-schinella commented 10 months ago

@dosu-bot I don't think I can modify files that haven't been copied to my PC; the package installation stops before that point. On the other hand, the installation of the "vertexai-chuck-norris" package, for example, completes normally.

dosubot[bot] commented 10 months ago

🤖

Based on the information you've provided and the similar issues I found in the LangChain repository, it seems like the UnicodeDecodeError you're encountering is likely due to the encoding used when reading and writing files in the LangChain CLI.

In the new and add functions in the app.py file, the read_text and write_text methods are used to read and write files. When no encoding is passed, these methods use the platform default, which on Windows is usually 'cp1252'. However, the files you're trying to copy are probably encoded as 'utf-8'.

To fix this issue, you can specify the encoding when reading and writing files. Here's how you can do it:

In the new function, replace these lines:

readme_contents = readme.read_text()
readme.write_text(readme_contents.replace("__app_name__", app_name))

pyproject_contents = pyproject.read_text()
pyproject.write_text(pyproject_contents.replace("__app_name__", app_name))

with:

readme_contents = readme.read_text(encoding='utf-8')
readme.write_text(readme_contents.replace("__app_name__", app_name), encoding='utf-8')

pyproject_contents = pyproject.read_text(encoding='utf-8')
pyproject.write_text(pyproject_contents.replace("__app_name__", app_name), encoding='utf-8')

This will ensure that the files are read and written using the 'utf-8' encoding, which should resolve the UnicodeDecodeError you're encountering. You can find the app.py file in the LangChain CLI library.
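The same idea can be sketched as a small standalone helper. The function name replace_app_name and the sample file contents are hypothetical and not part of langchain-cli; only the read_text/write_text calls with encoding='utf-8' mirror the change suggested above.

```python
import tempfile
from pathlib import Path


def replace_app_name(path: Path, app_name: str) -> None:
    """Replace the __app_name__ placeholder, always reading and writing UTF-8."""
    contents = path.read_text(encoding="utf-8")
    path.write_text(contents.replace("__app_name__", app_name), encoding="utf-8")


# Usage with a throwaway file that contains a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    pyproject = Path(tmp) / "pyproject.toml"
    pyproject.write_text(
        'name = "__app_name__"\nauthors = ["Łukasz Example"]\n',
        encoding="utf-8",
    )
    replace_app_name(pyproject, "qdrant-app")
    result = pyproject.read_text(encoding="utf-8")

print(result)
```

Passing the encoding explicitly makes the behavior independent of the Windows code page, so the non-ASCII author name survives the round trip.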

As for the exact function of the "self-query-qdrant" template in the LangChain framework, I wasn't able to find specific information within the repository. It's possible that the answer may be available elsewhere or I could have missed it. If you have more information about this template or if there's a specific feature or function you're interested in, please let me know so I can provide a more accurate answer.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

ivan-schinella commented 10 months ago

@dosu-bot I applied the suggested changes (I found app.py in my virtual environment under .venv\Lib\site-packages\langchain_cli\namespaces), but the installation still stops. The installation first copies all the templates locally (into C:\Users\\AppData\Roaming\langchain\git_repos). I found the offending template, "self_query_qdrant", and noticed that line 4 of its pyproject.toml is authors = ["Kacper Łukawski <lukawski.kacper@gmail.com>"], which contains one uncommon character. After I changed it and saved the file, I retried the installation and it succeeded!

I think the team must correct that file and update the repo.

dosubot[bot] commented 10 months ago

🤖

Thank you for your detailed report. It seems you've run into a known issue with how Unicode characters are handled on Windows. It was previously reported in the LangChain repository in the "Unicode error on Windows" issue. The solution suggested there is to enable Python's UTF-8 mode system-wide by setting the PYTHONUTF8=1 environment variable.

However, as you've discovered, the issue can also be worked around by modifying the pyproject.toml file in the "self-query-qdrant" template. The non-ASCII character "Ł" in the author's name "Kacper Łukawski" is what triggers the UnicodeDecodeError during installation: in UTF-8 it is stored as the bytes 0xC5 0x81, and byte 0x81 has no mapping in cp1252, which matches the error message. Replacing this character with a plain ASCII character should avoid the error. Here's how you can modify the authors field:

authors = ["Kacper Lukawski <lukawski.kacper@gmail.com>"]

After modifying the pyproject.toml file, you should be able to proceed with the installation without encountering the 'UnicodeDecodeError'. If the error persists, ensure that your file is saved with UTF-8 encoding.
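To locate such characters without opening every file by hand, something like the sketch below could be used. The helper non_ascii_lines and the sample pyproject text are hypothetical; only the author name comes from this thread.

```python
# "Ł" is stored in UTF-8 as two bytes, the second of which is 0x81.
raw = "Ł".encode("utf-8")

# Decoding those bytes as cp1252 fails on byte 0x81.
try:
    raw.decode("cp1252")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc


def non_ascii_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


# Hypothetical pyproject.toml content for illustration.
sample_pyproject = (
    "[tool.poetry]\n"
    'name = "example-template"\n'
    'version = "0.1.0"\n'
    'authors = ["Kacper Łukawski"]\n'
)
hits = non_ascii_lines(sample_pyproject)

print(raw, decode_error)
print(hits)
```

To check a real file, read it with read_text(encoding="utf-8") first and pass the resulting string to the helper.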

Regarding the installation process of the "self-query-qdrant" template in LangChain, you can refer to the detailed steps provided in the README.md file of the template. The process involves setting up the environment variables, installing the LangChain CLI, creating a new LangChain project, adding the template to the project, initializing the Qdrant collection, adding routes to the server, customizing the template (if needed), and finally starting the FastAPI app.

I hope this information helps. If you have any further questions or issues, please don't hesitate to ask.
