intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Apache License 2.0

RuntimeError: Chatbot instance has not been set. #1308

Closed regmibijay closed 8 months ago

regmibijay commented 8 months ago

Hello,

I started a neuralchat server with the following config, using the command neuralchat_server start --config_file ./server/config/neuralchat.yaml

Config

host: 0.0.0.0
port: <port> # confirmed valid port  

model_name_or_path: "Intel/gpt-j-6B-int8-dynamic-inc" # "Intel/neural-chat-7b-v3-1"
# tokenizer_name_or_path: ""
# peft_model_path: ""
device: "cpu"

The server starts up after downloading the model from Hugging Face, but produces the following error when calling http://<ip>:<port>/v1/chat/completions:

Error traceback

INFO:     192.168.178.24:28352 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 412, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/intel_extension_for_transformers/neural_chat/server/restful/textchat_api.py", line 483, in create_chat_completion
    error_check_ret = await check_model(request)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/intel_extension_for_transformers/neural_chat/server/restful/textchat_api.py", line 106, in check_model
    if request.model in router.get_chatbot().model_name:
                        ^^^^^^^^^^^^^^^^^^^^
  File "/home/bijayregmi/Projects/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/neuralchat_venv_311/lib/python3.11/site-packages/intel_extension_for_transformers/neural_chat/server/restful/textchat_api.py", line 386, in get_chatbot
    raise RuntimeError("Chatbot instance has not been set.")
RuntimeError: Chatbot instance has not been set.
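For context, the last frames of the traceback point at a guard pattern like the sketch below: the REST endpoints fetch a chatbot instance held by the router, and that instance is only set if model loading succeeded at startup. This is a simplified, hypothetical reconstruction (class and method names other than get_chatbot are assumptions), not the actual textchat_api.py code.

```python
class ChatRouter:
    """Minimal sketch of the guard the traceback hits: endpoints call
    get_chatbot(), which raises if startup never set an instance."""

    def __init__(self):
        self._chatbot = None  # stays None if model loading failed at startup

    def set_chatbot(self, chatbot):
        self._chatbot = chatbot

    def get_chatbot(self):
        if self._chatbot is None:
            raise RuntimeError("Chatbot instance has not been set.")
        return self._chatbot
```

So the 500 error on /v1/chat/completions is a downstream symptom; the root cause is whatever prevented the chatbot from being created when the server started.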

Here is the example cURL:

cURL call

curl http://<ip>:<port>/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
    "model": "Intel/gpt-j-6B-int8-dynamic-inc",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about Intel Xeon Scalable Processors."}
    ]
    }'
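The same request can be issued from Python with only the standard library; a sketch, keeping the <ip>:<port> placeholders from the cURL call (the base URL is whatever host/port the server is bound to):

```python
import json
import urllib.request

def build_chat_request(base_url: str) -> urllib.request.Request:
    """Build a POST request mirroring the cURL call above."""
    payload = {
        "model": "Intel/gpt-j-6B-int8-dynamic-inc",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me about Intel Xeon Scalable Processors."},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send (requires a running server):
# with urllib.request.urlopen(build_chat_request("http://<ip>:<port>")) as resp:
#     print(resp.read().decode())
```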
regmibijay commented 8 months ago

Package versions

absl-py==2.1.0
accelerate==0.27.2
aiohttp==3.9.3
aiosignal==1.3.1
antlr4-python3-runtime==4.9.3
anyio==4.3.0
astunparse==1.6.3
attrs==23.2.0
beautifulsoup4==4.12.3
blinker==1.7.0
cachetools==5.3.2
cchardet==2.1.7
certifi==2024.2.2
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
contextlib2==21.6.0
contourpy==1.2.0
cycler==0.12.1
Cython==3.0.8
DataProperty==1.0.1
datasets==2.17.1
deepface==0.0.84
Deprecated==1.2.14
dill==0.3.8
distro==1.9.0
einops==0.7.0
evaluate==0.4.1
ExifRead==3.0.0
fastapi==0.109.2
filelock==3.13.1
fire==0.5.0
Flask==3.0.2
flatbuffers==23.5.26
fonttools==4.49.0
frozenlist==1.4.1
fschat==0.2.32
fsspec==2023.10.0
gast==0.5.4
gdown==5.1.0
google-auth==2.28.1
google-auth-oauthlib==1.2.0
google-pasta==0.2.0
grpcio==1.62.0
gunicorn==21.2.0
h11==0.14.0
h5py==3.10.0
httpcore==1.0.4
httpx==0.27.0
huggingface-hub==0.20.3
humanfriendly==10.0
idna==3.6
intel-extension-for-pytorch==2.1.0
intel-extension-for-transformers==1.3.2
itsdangerous==2.1.2
Jinja2==3.1.3
joblib==1.3.2
jsonlines==4.0.0
keras==2.15.0
kiwisolver==1.4.5
libclang==16.0.6
lm-eval @ git+https://github.com/EleutherAI/lm-evaluation-harness.git@cc9778fbe4fa1a709be2abed9deb6180fd40e7e2
Markdown==3.5.2
markdown-it-py==3.0.0
markdown2==2.4.12
MarkupSafe==2.1.5
matplotlib==3.8.3
mbstrdecoder==1.1.3
mdurl==0.1.2
ml-dtypes==0.2.0
mpmath==1.3.0
mtcnn==0.1.1
multidict==6.0.5
multiprocess==0.70.16
networkx==3.2.1
neural-compressor==2.4.1
neural-speed==0.3
nh3==0.2.15
nltk==3.8.1
numexpr==2.9.0
numpy==1.23.5
oauthlib==3.2.2
omegaconf==2.3.0
openai==1.12.0
opencv-python==4.9.0.80
opencv-python-headless==4.9.0.80
opt-einsum==3.3.0
optimum==1.17.1
optimum-intel==1.15.2
packaging==23.2
pandas==2.2.1
pathvalidate==3.2.0
peft==0.6.2
pillow==10.2.0
pkg_resources==0.0.0
portalocker==2.8.2
prettytable==3.10.0
prompt-toolkit==3.0.43
protobuf==4.25.3
psutil==5.9.8
py-cpuinfo==9.0.0
pyarrow==15.0.0
pyarrow-hotfix==0.6
pyasn1==0.5.1
pyasn1-modules==0.3.0
pybind11==2.11.1
pycocotools==2.0.7
pycountry==23.12.11
pydantic==1.10.13
pydub==0.25.1
Pygments==2.17.2
PyMySQL==1.1.0
pyparsing==3.1.1
PySocks==1.7.1
pytablewriter==1.2.0
python-dateutil==2.8.2
python-dotenv==1.0.1
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
regex==2023.12.25
requests==2.31.0
requests-oauthlib==1.3.1
responses==0.18.0
retina-face==0.0.14
rich==13.7.0
rouge-score==0.1.2
rsa==4.9
sacrebleu==1.5.0
safetensors==0.4.2
schema==0.7.5
scikit-learn==1.4.1.post1
scipy==1.12.0
sentencepiece==0.2.0
shortuuid==1.0.11
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
sqlitedict==2.1.0
starlette==0.36.3
svgwrite==1.4.3
sympy==1.12
tabledata==1.3.3
tcolorpy==0.1.4
tensorboard==2.15.2
tensorboard-data-server==0.7.2
tensorflow==2.15.0.post1
tensorflow-estimator==2.15.0
tensorflow-io-gcs-filesystem==0.36.0
termcolor==2.4.0
threadpoolctl==3.3.0
tiktoken==0.4.0
tokenizers==0.15.2
torch==2.1.0+cpu
torchaudio==2.1.0+cpu
tqdm==4.66.2
tqdm-multiprocess==0.0.11
transformers==4.38.1
typepy==1.3.2
typing_extensions==4.9.0
tzdata==2024.1
urllib3==2.2.1
uvicorn==0.27.1
wavedrom==2.0.3.post3
wcwidth==0.2.13
Werkzeug==3.0.1
wrapt==1.14.1
xxhash==3.4.1
yacs==0.1.8
yarl==1.9.4
zstandard==0.22.0
letonghan commented 8 months ago

Hi, the error RuntimeError: Chatbot instance has not been set. indicates that the chatbot instance was not correctly initialized when the neuralchat server started. Do you have the complete log of neuralchat_server start?

regmibijay commented 8 months ago

Hi @letonghan, it looks like I overlooked an incompatibility with the given model; the logs show

Loading model Intel/gpt-j-6B-int8-dynamic-inc
2024-03-01 22:25:38,785 - root - ERROR - Exception: Intel/gpt-j-6B-int8-dynamic-inc does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
2024-03-01 22:25:38 [ERROR] neuralchat error: Generic error
Loading config settings from the environment...
INFO:     Started server process [277377]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:6900 (Press CTRL+C to quit)

I assume this is expected, and am hence closing the issue.
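One way to catch this class of failure before pointing the server at a repo is to check whether the Hub repo actually ships weights in a format transformers can load. The filename set below is taken from the error message (plus safetensors and common sharded-weight names); the check itself is pure Python, and the real file listing would come from huggingface_hub's list_repo_files (network call shown only as a comment).

```python
# Standard single-file weight names transformers looks for, per the error
# message above, plus the safetensors equivalent.
LOADABLE = {
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
    "model.safetensors",
}

def has_loadable_weights(repo_files):
    """True if the file list contains a standard single- or sharded-weight file."""
    return any(
        name in LOADABLE
        or (name.startswith(("pytorch_model-", "model-"))
            and name.endswith((".bin", ".safetensors")))
        for name in repo_files
    )

# Usage sketch (requires network):
# from huggingface_hub import list_repo_files
# has_loadable_weights(list_repo_files("Intel/gpt-j-6B-int8-dynamic-inc"))
```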