bentoml / OpenLLM

Run any open-source LLMs, such as Llama 3.1 and Gemma, as OpenAI-compatible API endpoints in the cloud.
https://bentoml.com
Apache License 2.0

bug: Attempting to invoke OpenLLM from Langchain results in error #1014

Closed: Said-Ikki closed this issue 1 week ago

Said-Ikki commented 1 month ago

Describe the bug

Connecting to OpenLLM doesn't appear to be the problem, but actually invoking it from LangChain is.

To reproduce

Code:

```python
from langchain_community.llms import OpenLLM

print("connecting")
llm = OpenLLM(server_url='http://localhost:3000')
print("sending")
llm.invoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
```

Note that it gets all the way to print("sending") before raising the error.

The following code does work:

```python
import openllm_client

client = openllm_client.HTTPClient('http://localhost:3000')
print(client.generate('What are large language models?'))
```
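Since the project advertises an OpenAI-compatible endpoint, one possible workaround is to point LangChain's OpenAI integration at the OpenLLM server instead of the broken community wrapper. This is only a sketch: it assumes `langchain-openai` is installed, that the server exposes the usual `/v1` route, and that you substitute the actual served model id for the placeholder below.

```python
# Workaround sketch (unverified against this exact setup): drive the OpenLLM
# server through its OpenAI-compatible API via langchain-openai.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url='http://localhost:3000/v1',  # assumed OpenAI-compatible route
    api_key='na',                         # assumption: the key is not checked locally
    model='<served-model-id>',            # placeholder: replace with your model id
)
print(llm.invoke('What are large language models?').content)
```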

Logs

```
Traceback (most recent call last):
  File "/mnt/c/Users/Public/Documents/PythonProjectsVenv1/help/plz.py", line 8, in <module>
    llm.invoke('What is the difference between a duck and a goose? And why there are so many Goose in Canada?')
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 276, in invoke
    self.generate_prompt(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 633, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 803, in generate
    output = self._generate_helper(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 670, in _generate_helper
    raise e
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 657, in _generate_helper
    self._generate(
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_core/language_models/llms.py", line 1317, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "/home/ssikki/.virtualenvs/openllm_test/lib/python3.10/site-packages/langchain_community/llms/openllm.py", line 273, in _call
    self._client.generate(prompt, **config.model_dump(flatten=True))
TypeError: BaseModel.model_dump() got an unexpected keyword argument 'flatten'
```
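The failing line passes flatten=True into Pydantic v2's BaseModel.model_dump(), which has no such parameter, so the TypeError comes from Pydantic rather than from the server. A minimal sketch reproducing just that error, where GenerationConfig is a hypothetical stand-in for whatever config object the wrapper builds:

```python
# Minimal reproduction of the TypeError itself, independent of OpenLLM.
# GenerationConfig is a hypothetical stand-in, not OpenLLM's real config class.
from pydantic import BaseModel

class GenerationConfig(BaseModel):
    max_new_tokens: int = 128

config = GenerationConfig()
# Pydantic v2's model_dump() accepts keywords like `mode`, `include`, and
# `exclude`, but not `flatten`, so this raises the same TypeError as above.
config.model_dump(flatten=True)
```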

Environment

bentoml env

Environment variable

```
BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''
```

System information

```
bentoml: 1.2.17
python: 3.10.12
platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
uid_gid: 1000:1000
```

pip_packages
``` aiohttp==3.9.5 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.4.0 appdirs==1.4.4 asgiref==3.8.1 async-timeout==4.0.3 attrs==23.2.0 bentoml==1.2.17 blinker==1.4 build==1.2.1 cattrs==23.1.2 certifi==2024.6.2 charset-normalizer==3.3.2 circus==0.18.0 click==8.1.7 click-option-group==0.5.6 cloudpickle==3.0.0 cmake==3.29.3 command-not-found==0.3 cryptography==3.4.8 cuda-python==12.5.0 dataclasses-json==0.6.7 dbus-python==1.2.18 deepmerge==1.1.1 Deprecated==1.2.14 diskcache==5.6.3 distro==1.7.0 distro-info==1.1+ubuntu0.1 dnspython==2.6.1 einops==0.8.0 email_validator==2.1.1 exceptiongroup==1.2.1 fastapi==0.111.0 fastapi-cli==0.0.4 fastcore==1.5.44 filelock==3.14.0 frozenlist==1.4.1 fs==2.4.16 fsspec==2024.6.0 ghapi==1.0.5 greenlet==3.0.3 h11==0.14.0 httpcore==1.0.5 httplib2==0.20.2 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.23.3 idna==3.7 importlib-metadata==6.11.0 inflection==0.5.1 interegular==0.3.3 jeepney==0.7.1 Jinja2==3.1.4 joblib==1.4.2 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.22.0 jsonschema-specifications==2023.12.1 keyring==23.5.0 langchain==0.2.4 langchain-community==0.2.4 langchain-core==0.2.6 langchain-text-splitters==0.2.1 langsmith==0.1.77 lark==1.1.9 launchpadlib==1.10.16 lazr.restfulclient==0.14.4 lazr.uri==1.0.6 llvmlite==0.42.0 lm-format-enforcer==0.10.1 markdown-it-py==3.0.0 MarkupSafe==2.1.5 marshmallow==3.21.3 mdurl==0.1.2 more-itertools==8.10.0 mpmath==1.3.0 msgpack==1.0.8 multidict==6.0.5 mypy-extensions==1.0.0 nest-asyncio==1.6.0 netifaces==0.11.0 networkx==3.3 ninja==1.11.1.1 numba==0.59.1 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-ml-py==11.525.150 nvidia-nccl-cu12==2.20.5 nvidia-nvjitlink-cu12==12.5.40 nvidia-nvtx-cu12==12.1.105 oauthlib==3.2.0 openai==1.32.0 openllm==0.5.5 openllm-client==0.5.5 openllm-core==0.5.5 opentelemetry-api==1.20.0 opentelemetry-instrumentation==0.41b0 opentelemetry-instrumentation-aiohttp-client==0.41b0 opentelemetry-instrumentation-asgi==0.41b0 opentelemetry-sdk==1.20.0 opentelemetry-semantic-conventions==0.41b0 opentelemetry-util-http==0.41b0 orjson==3.10.3 outlines==0.0.34 packaging==24.0 pandas==2.2.2 pathspec==0.12.1 pillow==10.3.0 pip-requirements-parser==32.0.1 pip-tools==7.4.1 prometheus-fastapi-instrumentator==7.0.0 prometheus_client==0.20.0 protobuf==5.27.1 psutil==5.9.8 py-cpuinfo==9.0.0 pyarrow==16.1.0 pydantic==2.7.3 pydantic_core==2.18.4 Pygments==2.18.0 PyGObject==3.42.1 PyJWT==2.3.0 pyparsing==2.4.7 pyproject_hooks==1.1.0 python-apt==2.4.0+ubuntu2 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-json-logger==2.0.7 python-multipart==0.0.9 pytz==2024.1 PyYAML==5.4.1 pyzmq==26.0.3 ray==2.24.0 referencing==0.35.1 regex==2024.5.15 requests==2.32.3 rich==13.7.1 rpds-py==0.18.1 safetensors==0.4.3 schema==0.7.7 scipy==1.13.1 SecretStorage==3.3.1 sentencepiece==0.2.0 shellingham==1.5.4 simple-di==0.1.5 six==1.16.0 sniffio==1.3.1 SQLAlchemy==2.0.30 starlette==0.37.2 sympy==1.12.1 systemd-python==234 tenacity==8.3.0 tiktoken==0.7.0 tokenizers==0.19.1 tomli==2.0.1 tomli_w==1.0.0 torch==2.3.0 tornado==6.4.1 tqdm==4.66.4 transformers==4.41.2 triton==2.3.0 typer==0.12.3 typing-inspect==0.9.0 typing_extensions==4.12.1 tzdata==2024.1 ubuntu-advantage-tools==8001 ufw==0.36.1 ujson==5.10.0 unattended-upgrades==0.1 urllib3==2.2.1 uvicorn==0.30.1 uvloop==0.19.0 
vllm==0.4.3 vllm-flash-attn==2.5.8.post2 wadllib==1.3.6 watchfiles==0.22.0 websockets==12.0 wrapt==1.16.0 xformers==0.0.26.post1 yarl==1.9.4 zipp==1.0.0 ```

transformers-cli env

System information (Optional)

Processor: AMD Ryzen 5 7600X 6-Core Processor, 4.70 GHz
Installed RAM: 32.0 GB (31.2 GB usable)
GPU: NVIDIA GeForce RTX 4060 Ti

aarnphm commented 1 month ago

There is an upstream PR updating the LangChain implementation of OpenLLM that has yet to be merged, IIRC:

https://github.com/langchain-ai/langchain/pull/22442
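Until that PR lands, one blunt local stopgap (a sketch only, with caveats) is to monkey-patch model_dump to drop the unsupported keyword. This silences the TypeError, but whether the resulting, un-flattened kwargs match what the server expects is untested here:

```python
# Stopgap sketch: make Pydantic's model_dump() tolerate the `flatten` kwarg
# that langchain_community's OpenLLM wrapper still passes. Run this before
# using the wrapper. It only removes the TypeError; the shape of the
# resulting dict may still differ from what `flatten` was meant to produce.
from pydantic import BaseModel

_original_model_dump = BaseModel.model_dump

def _model_dump_ignoring_flatten(self, *args, **kwargs):
    kwargs.pop('flatten', None)  # drop the keyword Pydantic v2 rejects
    return _original_model_dump(self, *args, **kwargs)

BaseModel.model_dump = _model_dump_ignoring_flatten
```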

bojiang commented 1 week ago

Closing for OpenLLM 0.6.