langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
93.78k stars 15.12k forks

ContextualCompressionRetriever for non-english languages #15052

Closed weissenbacherpwc closed 6 months ago

weissenbacherpwc commented 10 months ago

Issue you'd like to raise.

Hi,

I want to use the ContextualCompressionRetriever and am wondering what its prompt looks like and whether it can be used for non-English languages (e.g. German). I am using ContextualCompressionRetriever at the moment and have noticed that my LLM responses often switch to English, so I assume that the prompt for ContextualCompressionRetriever is in English and the model gets confused about which language to use.

Any recommendations?

Suggestion:

No response

dosubot[bot] commented 10 months ago

🤖

Hello @weissenbacherpwc! 👋

I'm Dosu, a helpful bot that is here to help you fix bugs, answer questions, and become a contributor. While we wait for a human maintainer, I'm here to help you. 😊🚀

Based on the information available in the LangChain repository, the ContextualCompressionRetriever does not have language-specific configurations or settings. It wraps a base retriever, which fetches documents, and a base compressor, which compresses them. How those two components are configured is not detailed in the provided context.

Here is the relevant code snippet:

base_compressor: BaseDocumentCompressor
"""Compressor for compressing retrieved documents."""

base_retriever: BaseRetriever
"""Base Retriever to use for getting relevant documents."""

Source: libs/langchain/langchain/retrievers/contextual_compression.py
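To make the two fields above concrete, here is a minimal sketch of the retrieve-then-compress pattern in plain Python. None of these classes are LangChain's; `Doc`, `ToyRetriever`, `ToyCompressor`, and `ToyCompressionRetriever` are hypothetical stand-ins, and the keyword matching is only a placeholder for whatever a real retriever or LLM-based compressor does. The point is that the wrapper itself has no language setting: any prompt, and therefore any language bias, would live inside the compressor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    text: str


class ToyRetriever:
    """Returns documents that contain any word of the query (placeholder logic)."""

    def __init__(self, docs: List[Doc]) -> None:
        self.docs = docs

    def get_relevant_documents(self, query: str) -> List[Doc]:
        words = query.lower().split()
        return [d for d in self.docs if any(w in d.text.lower() for w in words)]


class ToyCompressor:
    """Keeps only sentences that mention a query word.

    An LLM-based compressor would build a prompt at this step, so this is
    where the prompt language would matter.
    """

    def compress_documents(self, docs: List[Doc], query: str) -> List[Doc]:
        words = query.lower().split()
        compressed = []
        for d in docs:
            kept = [
                s.strip()
                for s in d.text.split(".")
                if s.strip() and any(w in s.lower() for w in words)
            ]
            if kept:
                compressed.append(Doc(". ".join(kept) + "."))
        return compressed


class ToyCompressionRetriever:
    """Composes the two parts: retrieve first, then compress the results."""

    def __init__(self, base_compressor: ToyCompressor, base_retriever: ToyRetriever) -> None:
        self.base_compressor = base_compressor
        self.base_retriever = base_retriever

    def get_relevant_documents(self, query: str) -> List[Doc]:
        docs = self.base_retriever.get_relevant_documents(query)
        return self.base_compressor.compress_documents(docs, query)


corpus = [
    Doc("Berlin ist die Hauptstadt. Das Wetter ist gut."),
    Doc("Paris liegt in Frankreich."),
]
retriever = ToyCompressionRetriever(ToyCompressor(), ToyRetriever(corpus))
result = retriever.get_relevant_documents("Hauptstadt")
```

Here `result` holds only the sentence that matches the query; the second sentence of the first document and the whole second document are dropped.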

Since nothing in the retriever itself sets a language, the language model used by the compressor might be defaulting to English, which could be why your LLM responses often switch to English.

To use the ContextualCompressionRetriever for non-English languages, you might need to use a language model that has been trained on the specific language you want to use. If the language model you're using supports multiple languages, you might need to specify the language in the prompt or in the configuration of the language model.
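As a rough illustration of "specifying the language in the prompt", the sketch below builds a German extraction prompt in plain Python. The template wording, the `NO_OUTPUT` sentinel, and the helper names `build_german_prompt` and `parse_extraction` are all assumptions made for this example, not LangChain's actual prompt or API. If your compressor is LLM-based and accepts a custom prompt (as far as I know, `LLMChainExtractor.from_llm` takes an optional `prompt` argument, but please check your installed version), a template along these lines could be a starting point.

```python
# Hypothetical sentinel the model is asked to return when nothing is relevant.
NO_OUTPUT = "NO_OUTPUT"

# Illustrative German instructions; this is NOT LangChain's default prompt.
GERMAN_EXTRACT_TEMPLATE = (
    "Gegeben sind die folgende Frage und der folgende Kontext. "
    "Extrahiere alle Teile des Kontexts unverändert, die für die "
    "Beantwortung der Frage relevant sind. "
    "Wenn kein Teil des Kontexts relevant ist, gib {no_output} zurück. "
    "Antworte ausschließlich auf Deutsch.\n\n"
    "> Frage: {question}\n"
    "> Kontext:\n>>>\n{context}\n>>>\n"
    "Extrahierte relevante Teile:"
)


def build_german_prompt(question: str, context: str) -> str:
    """Fill the German template with the user's question and one document."""
    return GERMAN_EXTRACT_TEMPLATE.format(
        question=question, context=context, no_output=NO_OUTPUT
    )


def parse_extraction(llm_text: str) -> str:
    """Map the sentinel to an empty string, otherwise return the trimmed text."""
    cleaned = llm_text.strip()
    return "" if cleaned == NO_OUTPUT else cleaned


prompt = build_german_prompt(
    "Was ist LangChain?",
    "LangChain ist ein Framework. Das Wetter ist gut.",
)
```

Keeping the instructions, the question, and the documents in the same language gives the model fewer reasons to switch to English, though this depends on the model and is not guaranteed.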

However, without more information about the specific language model you're using and how you're using the ContextualCompressionRetriever, it's difficult to provide a more specific solution. Could you provide more details about your setup?

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.