langchain-ai / langchain


Volc Engine MaaS has wrong entry in LLM type to class dict (causing SpaCy to not work with LangChain anymore) #14127

Closed DirkKoelewijn closed 11 months ago

DirkKoelewijn commented 11 months ago

Who can help?

@h3l, as the creator of the pull request where VolcEngine was introduced
@baskaryan, as tag handler of that pull request

Reproduction

Anything that triggers spaCy's registry to make an inventory, for example:

import spacy

spacy.blank("en")

With the last part of the Traceback being:

  File "PROJECT_FOLDER\.venv\Lib\site-packages\langchain\llms\__init__.py", line 699, in __getattr__
    k: v() for k, v in get_type_to_cls_dict().items()
       ^^^
  File "PROJECT_FOLDER\.venv\Lib\site-packages\langchain_core\load\serializable.py", line 97, in __init__
    super().__init__(**kwargs)
  File "PROJECT_FOLDER\.venv\Lib\site-packages\pydantic\v1\main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for VolcEngineMaasLLM
__root__
  Did not find volc_engine_maas_ak, please add an environment variable `VOLC_ACCESSKEY` which contains it, or pass  `volc_engine_maas_ak` as a named parameter. (type=value_error)

What I think causes this

I am quite certain that this is caused by langchain.llms.__init__.py:869 (for commit b161f30):

def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
    return {
        "ai21": _import_ai21,
        "aleph_alpha": _import_aleph_alpha,
        "amazon_api_gateway": _import_amazon_api_gateway,
        ...
        "qianfan_endpoint": _import_baidu_qianfan_endpoint,
        "yandex_gpt": _import_yandex_gpt,
        # Line below is the only one that actually calls the import function, so the dict holds a class instead of an import function
        "VolcEngineMaasLLM": _import_volcengine_maas(),
    }

The Volc Engine MaaS LLM entry is the only one in this dict that actually calls its import function; all other entries reference the function itself without calling it.
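The mechanism can be reproduced without LangChain. The following is a minimal sketch under stated assumptions: `FakeLLM`, `NeedsKeyLLM`, `_import_fake`, `_import_needs_key`, `buggy_dict`, `fixed_dict`, and `resolve` are hypothetical stand-ins, not LangChain names. Only the shape of the dict and the `v()` call mirror the traceback above.

```python
from typing import Callable, Dict, Type


class FakeLLM:
    """Hypothetical class that can be constructed without arguments."""


class NeedsKeyLLM:
    """Hypothetical class that validates credentials on construction."""

    def __init__(self, api_key: str = "") -> None:
        if not api_key:
            raise ValueError("Did not find api_key")


def _import_fake() -> Type[FakeLLM]:
    return FakeLLM


def _import_needs_key() -> Type[NeedsKeyLLM]:
    return NeedsKeyLLM


def buggy_dict() -> Dict[str, Callable]:
    return {
        "fake": _import_fake,
        # Called here, so the value is the class itself, not the import function.
        "needs_key": _import_needs_key(),
    }


def fixed_dict() -> Dict[str, Callable]:
    # Every value is an import function; nothing is called yet.
    return {"fake": _import_fake, "needs_key": _import_needs_key}


def resolve(mapping: Dict[str, Callable]) -> Dict[str, object]:
    # Same pattern as the `k: v() for k, v in ...` comprehension in the traceback.
    return {k: v() for k, v in mapping.items()}
```

With `buggy_dict()`, `resolve` ends up calling `NeedsKeyLLM()` with no arguments, which raises in the same way the missing `VOLC_ACCESSKEY` does. With `fixed_dict()`, each `v()` only returns a class.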

Expected behavior

The type-to-class dict should only return import functions, not actual classes:

def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
    return {
        "ai21": _import_ai21,
        "aleph_alpha": _import_aleph_alpha,
        "amazon_api_gateway": _import_amazon_api_gateway,
        ...
        "qianfan_endpoint": _import_baidu_qianfan_endpoint,
        "yandex_gpt": _import_yandex_gpt,
        # What I think would be correct (now without function call)
        "VolcEngineMaasLLM": _import_volcengine_maas,
    }
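A check along these lines might catch the same mistake in the future. This is only a sketch: `find_non_lazy_entries` is a hypothetical helper, not part of LangChain, and it assumes that every value in the dict is meant to be a plain function rather than a class.

```python
import inspect
from typing import Callable, Dict, List


def find_non_lazy_entries(mapping: Dict[str, Callable]) -> List[str]:
    """Return the keys whose value is a class rather than an import function."""
    return [key for key, value in mapping.items() if inspect.isclass(value)]


# Illustrative data only: "b" holds a class directly, "a" holds a function.
sample = {"a": lambda: int, "b": int}
offenders = find_non_lazy_entries(sample)
```

In a LangChain checkout, the same helper could presumably be applied to the result of `get_type_to_cls_dict()`, with a test asserting that the returned list is empty.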

Unfortunately I don't have time to put in a PR myself, but I hope this helps in finding the solution!

dosubot[bot] commented 11 months ago

🤖

Hello @DirkKoelewijn,

I am Dosu, an AI assistant designed to help users like you with bugs, answer questions, and assist in becoming a contributor to LangChain. I'm here to provide support while we wait for a human maintainer to assist you further. Great to meet you!

Thank you for the detailed report, DirkKoelewijn. You're correct in your analysis. The get_type_to_cls_dict function should return a dictionary of import functions, not classes. The VolcEngineMaasLLM entry is the only one that calls its import function, so the dict holds the class itself, and the later v() call tries to instantiate it without credentials.

Here's the corrected code:

def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
    return {
        ...
        "VolcEngineMaasLLM": _import_volcengine_maas,
        ...
    }

The backwards-compatibility branch in the __init__.py file can keep calling each import function, so that type_to_cls_dict still maps names to classes:

elif name == "type_to_cls_dict":
    # for backwards compatibility
    type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
        k: v() for k, v in get_type_to_cls_dict().items()
    }
    return type_to_cls_dict

This change should resolve the issue with spaCy not working with LangChain. The dictionary will now hold the import function _import_volcengine_maas under the key "VolcEngineMaasLLM", so calling it returns the class instead of instantiating it.

Unfortunately, as a bot, I'm unable to make changes to the codebase. I recommend reaching out to @h3l and @baskaryan, who you've identified as potentially being able to help. They should be able to implement this fix.

I hope this helps, and thank you for your contribution to improving LangChain!


DirkKoelewijn commented 11 months ago

For anyone else having this problem:

The workaround for now is to pin the langchain version to 0.0.343 until this issue is resolved.

h3l commented 11 months ago

@DirkKoelewijn Thank you for identifying this bug. I have submitted a pull request and am now waiting for it to be merged.

DirkKoelewijn commented 11 months ago

Thank you very much!