bentoml / OpenLLM

Run any open-source LLMs, such as Llama 3.1 and Gemma, as OpenAI-compatible API endpoints in the cloud.
https://bentoml.com
Apache License 2.0

bug: Attempting to invoke OpenLLM from Langchain results in error #1014

Closed: Said-Ikki closed this issue 1 week ago

Said-Ikki commented 1 month ago

Describe the bug

Connecting to OpenLLM doesn't appear to be the problem, but actually invoking it from LangChain is.

To reproduce

Code:

```python
from langchain_community.llms import OpenLLM

print("connecting")
llm = OpenLLM(server_url='http://localhost:3000')
print("sending")
llm.invoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
```

Note that it gets all the way to print("sending") before raising the error.

The following code does work:

```python
import openllm_client

client = openllm_client.HTTPClient('http://localhost:3000')
print(client.generate('What are large language models?'))
```
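Since the project advertises an OpenAI-compatible endpoint, one possible workaround is to point LangChain's OpenAI integration at the OpenLLM server instead of the broken community wrapper. This is only a sketch: it assumes `langchain-openai` is installed, that the server exposes the usual `/v1` route, and that you substitute the actual served model id for the placeholder below.

```python
# Workaround sketch (unverified against this exact setup): drive the OpenLLM
# server through its OpenAI-compatible API via langchain-openai.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url='http://localhost:3000/v1',  # assumed OpenAI-compatible route
    api_key='na',                         # assumption: the key is not checked locally
    model='<served-model-id>',            # placeholder: replace with your model id
)
print(llm.invoke('What are large language models?').content)
```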

Logs

```
Traceback (most recent call last):
  File "/mnt/c/Users/Public/Documents/PythonProjectsVenv1/help/plz.py", line 8, in <module>
    llm.invoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 276, in invoke
    self.generate_prompt(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 633, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 803, in generate
    output = self._generate_helper(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 670, in _generate_helper
    raise e
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 657, in _generate_helper
    self._generate(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 1317, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_community/llms/openllm.py", line 273, in _call
    self._client.generate(prompt, **config.model_dump(flatten=True))
TypeError: BaseModel.model_dump() got an unexpected keyword argument 'flatten'
```
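The failing line passes flatten=True into Pydantic v2's BaseModel.model_dump(), which has no such parameter, so the TypeError comes from Pydantic rather than from the server. A minimal sketch reproducing just that error, where GenerationConfig is a hypothetical stand-in for whatever config object the wrapper builds:

```python
# Minimal reproduction of the TypeError itself, independent of OpenLLM.
# GenerationConfig is a hypothetical stand-in, not OpenLLM's real config class.
from pydantic import BaseModel

class GenerationConfig(BaseModel):
    max_new_tokens: int = 128

config = GenerationConfig()
# Pydantic v2's model_dump() accepts keywords like `mode`, `include`, and
# `exclude`, but not `flatten`, so this raises the same TypeError as above.
config.model_dump(flatten=True)
```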

Environment

bentoml env

Environment variable

```
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
```

System information

```
bentoml: 1.2.17
python: 3.10.12
platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
uid_gid: 1000:1000
```

pip_packages
``` aiohttp==3.9.5 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.4.0 appdirs==1.4.4 asgiref==3.8.1 async-timeout==4.0.3 attrs==23.2.0 bentoml==1.2.17 blinker==1.4 build==1.2.1 cattrs==23.1.2 certifi==2024.6.2 charset-normalizer==3.3.2 circus==0.18.0 click==8.1.7 click-option-group==0.5.6 cloudpickle==3.0.0 cmake==3.29.3 command-not-found==0.3 cryptography==3.4.8 cuda-python==12.5.0 dataclasses-json==0.6.7 dbus-python==1.2.18 deepmerge==1.1.1 Deprecated==1.2.14 diskcache==5.6.3 distro==1.7.0 distro-info==1.1+ubuntu0.1 dnspython==2.6.1 einops==0.8.0 email_validator==2.1.1 exceptiongroup==1.2.1 fastapi==0.111.0 fastapi-cli==0.0.4 fastcore==1.5.44 filelock==3.14.0 frozenlist==1.4.1 fs==2.4.16 fsspec==2024.6.0 ghapi==1.0.5 greenlet==3.0.3 h11==0.14.0 httpcore==1.0.5 httplib2==0.20.2 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.23.3 idna==3.7 importlib-metadata==6.11.0 inflection==0.5.1 interegular==0.3.3 jeepney==0.7.1 Jinja2==3.1.4 joblib==1.4.2 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.22.0 jsonschema-specifications==2023.12.1 keyring==23.5.0 langchain==0.2.4 langchain-community==0.2.4 langchain-core==0.2.6 langchain-text-splitters==0.2.1 langsmith==0.1.77 lark==1.1.9 launchpadlib==1.10.16 lazr.restfulclient==0.14.4 lazr.uri==1.0.6 llvmlite==0.42.0 lm-format-enforcer==0.10.1 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.3 mdurl==0.1.2 more-itertools==8.10.0 mpmath==1.3.0 msgpack==1.0.8 multidict==6.0.5 mypy-extensions==1.0.0 nest-asyncio==1.6.0 netifaces==0.11.0 networkx==3.3 ninja==1.11.1.1 numba==0.59.1 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-ml-py==11.525.150 nvidia-nccl-cu12==2.20.5 nvidia-nvjitlink-cu12==12.5.40 nvidia-nvtx-cu12==12.1.105 oauthlib==3.2.0 openai==1.32.0 openllm==0.5.5 openllm-client==0.5.5 openllm-core==0.5.5 opentelemetry-api==1.20.0 opentelemetry-instrumentation==0.41b0 opentelemetry-instrumentation-aiohttp-client==0.41b0 opentelemetry-instrumentation-asgi==0.41b0 opentelemetry-sdk==1.20.0 opentelemetry-semantic-conventions==0.41b0 opentelemetry-util-http==0.41b0 orjson==3.10.3 outlines==0.0.34 packaging==24.0 pandas==2.2.2 pathspec==0.12.1 pillow==10.3.0 pip-requirements-parser==32.0.1 pip-tools==7.4.1 prometheus-fastapi-instrumentator==7.0.0 prometheus_client==0.20.0 protobuf==5.27.1 psutil==5.9.8 py-cpuinfo==9.0.0 pyarrow==16.1.0 pydantic==2.7.3 pydantic_core==2.18.4 Pygments==2.18.0 PyGObject==3.42.1 PyJWT==2.3.0 pyparsing==2.4.7 pyproject_hooks==1.1.0 python-apt==2.4.0+ubuntu2 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-json-logger==2.0.7 python-multipart==0.0.9 pytz==2024.1 PyYAML==5.4.1 pyzmq==26.0.3 ray==2.24.0 referencing==0.35.1 regex==2024.5.15 requests==2.32.3 rich==13.7.1 rpds-py==0.18.1 safetensors==0.4.3 schema==0.7.7 scipy==1.13.1 SecretStorage==3.3.1 sentencepiece==0.2.0 shellingham==1.5.4 simple-di==0.1.5 six==1.16.0 sniffio==1.3.1 SQLAlchemy==2.0.30 starlette==0.37.2 sympy==1.12.1 systemd-python==234 tenacity==8.3.0 tiktoken==0.7.0 tokenizers==0.19.1 tomli==2.0.1 tomli_w==1.0.0 torch==2.3.0 tornado==6.4.1 tqdm==4.66.4 transformers==4.41.2 triton==2.3.0 typer==0.12.3 typing-inspect==0.9.0 typing_extensions==4.12.1 tzdata==2024.1 ubuntu-advantage-tools==8001 ufw==0.36.1 ujson==5.10.0 unattended-upgrades==0.1 urllib3==2.2.1 uvicorn==0.30.1 uvloop==0.19.0 
vllm==0.4.3 vllm-flash-attn==2.5.8.post2 wadllib==1.3.6 watchfiles==0.22.0 websockets==12.0 wrapt==1.16.0 xformers==0.0.26.post1 yarl==1.9.4 zipp==1.0.0 ```

transformers-cli env

System information (Optional)

Processor: AMD Ryzen 5 7600X 6-Core Processor, 4.70 GHz
Installed RAM: 32.0 GB (31.2 GB usable)
GPU: NVIDIA GeForce RTX 4060 Ti

aarnphm commented 1 month ago

There is an upstream PR updating the LangChain implementation of OpenLLM that has yet to be merged, IIRC:

https://github.com/langchain-ai/langchain/pull/22442
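Until that PR lands, one blunt local stopgap (a sketch only, with caveats) is to monkey-patch model_dump to drop the unsupported keyword. This silences the TypeError, but whether the resulting, un-flattened kwargs match what the server expects is untested here:

```python
# Stopgap sketch: make Pydantic's model_dump() tolerate the `flatten` kwarg
# that langchain_community's OpenLLM wrapper still passes. Run this before
# using the wrapper. It only removes the TypeError; the shape of the
# resulting dict may still differ from what `flatten` was meant to produce.
from pydantic import BaseModel

_original_model_dump = BaseModel.model_dump

def _model_dump_ignoring_flatten(self, *args, **kwargs):
    kwargs.pop('flatten', None)  # drop the keyword Pydantic v2 rejects
    return _original_model_dump(self, *args, **kwargs)

BaseModel.model_dump = _model_dump_ignoring_flatten
```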

bojiang commented 1 week ago

Closing for OpenLLM 0.6.