kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Apache License 2.0
653 stars 31 forks source link

UnboundLocalError: cannot access local variable 'chunck_mask' where it is not associated with a value #70

Closed fengyang95 closed 2 weeks ago

fengyang95 commented 2 weeks ago

configuration: /DeepSeek-V2-Chat-multi-gpu-4.yaml When the input prompt is relatively long, the following error appears:

2024-09-01 01:52:15,535 DEBUG /home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/transformers.py[240]: input_ids: torch.Size([1, 1982])
2024-09-01 01:52:15,536 DEBUG /home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/transformers.py[262]: cache position: 0 to 1982
INFO:     127.0.0.1:59126 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/routing.py", line 754, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/routing.py", line 774, in app
    await route.handle(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/routing.py", line 295, in handle
    await self.app(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/fastapi/routing.py", line 297, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/fastapi/routing.py", line 210, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/server/api/openai/endpoints/chat.py", line 32, in chat_completion
    async for token in interface.inference(input_message,id):
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/transformers.py", line 323, in inference
    for t in self.prefill(input_ids,self.check_is_new(thread_id)):
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/server/backend/interfaces/transformers.py", line 272, in prefill
    logits = self.model(
             ^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/models/modeling_deepseek.py", line 1731, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/operators/models.py", line 719, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/models/modeling_deepseek.py", line 1238, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
                                                          ^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tiger/.pyenv/versions/3.11.2/lib/python3.11/site-packages/ktransformers/operators/attention.py", line 202, in forward
    chunck_mask,
    ^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'chunck_mask' where it is not associated with a value
sayap commented 2 weeks ago

Need to change chunk_mask in line 186 to chunck_mask (e.g. see https://github.com/sayap/ktransformers/commit/b38e49220d)

Azure-Tang commented 2 weeks ago

It seems caused by my typo orz, Fixed by #71