xusenlinzy / api-for-open-llm

OpenAI-style API for open large language models — use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc. A unified backend API for open-source large language models.
Apache License 2.0
2.35k stars 270 forks

Problem with the Qwen2-7B-Instruct model when using vLLM #303

Closed Empress7211 closed 2 months ago

Empress7211 commented 3 months ago

The following items must be checked before submission

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

import time

import openai

retry_cnt = 0
while retry_cnt <= retries:
    try:
        print("Entering the chat.completions endpoint:\n")
        response = client.chat.completions.create(
            model=engine,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompts},
            ],
            max_tokens=target_length,
            temperature=temperature,
            top_p=top_p,
            frequency_penalty=frequency_penalty,
            presence_penalty=presence_penalty,
            stop=stop_sequences,
            n=n,
            # best_of=best_of,
            # stream=True,
        )
        break
    except openai.OpenAIError as e:  # make sure OpenAIError is handled correctly
        print(f"OpenAIError: {e}.")
        if "Please reduce your prompt" in str(e):
            target_length = int(target_length * 0.8)
            print(f"Reducing target length to {target_length}, retrying...")
        else:
            print(f"Retrying in {backoff_time} seconds...")
            time.sleep(backoff_time)
            backoff_time *= 1.5
        retry_cnt += 1
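One defensive measure on the client side is to normalize `prompts` to a plain string before placing it into the user message, since a list-valued `content` can break some backends' chat templates. This is only a sketch; `as_user_content` is a hypothetical helper, not part of this project:

```python
def as_user_content(prompts):
    """Coerce a prompt (str, list of str, or other) into a single string
    suitable for a plain-text chat message's "content" field."""
    if isinstance(prompts, str):
        return prompts
    if isinstance(prompts, (list, tuple)):
        # Join a list of prompt strings into one user message.
        return "\n".join(str(p) for p in prompts)
    return str(prompts)
```

With this in place, the request above would pass `as_user_content(prompts)` instead of `prompts` as the user message content.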

Dependencies

peft                              0.12.0
sentence-transformers             3.0.1
torch                             2.4.0
torch-struct                      0.5
torchvision                       0.19.0
transformers                      4.43.4
transformers-stream-generator     0.0.5

Runtime logs or screenshots

## Error output on the client side
Entering the chat.completions endpoint:

OpenAIError: Internal Server Error.
Retrying in 1 seconds...
Entering the chat.completions endpoint:

OpenAIError: Internal Server Error.
Retrying in 1.5 seconds...
Entering the chat.completions endpoint:

OpenAIError: Internal Server Error.
Retrying in 2.25 seconds...
^CTraceback (most recent call last):
  File "/home/intern2/self-instruct-main/self_instruct/qwen2_api.py", line 47, in make_requests
    response = client.chat.completions.create(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_utils/_utils.py", line 274, in wrapper
    return func(*args, **kwargs)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 650, in create
    return self._post(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 936, in request
    return self._request(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1025, in _request
    return self._retry_request(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1074, in _retry_request
    return self._request(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1025, in _request
    return self._retry_request(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1074, in _retry_request
    return self._request(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/openai/_base_client.py", line 1040, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Internal Server Error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/intern2/self-instruct-main/self_instruct/bootstrap_instructions.py", line 215, in <module>
    results = make_qwen2_requests(
  File "/home/intern2/self-instruct-main/self_instruct/qwen2_api.py", line 76, in make_requests

## Corresponding log from this project's server while running the program
INFO:     127.0.0.1:49506 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/intern2/api-for-open-llm/api/templates/base.py", line 47, in convert_messages_to_ids
    token_ids = self._convert_messages_to_ids(
  File "/home/intern2/api-for-open-llm/api/templates/base.py", line 81, in _convert_messages_to_ids
    raise NotImplementedError
NotImplementedError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/intern2/api-for-open-llm/api/vllm_routes/chat.py", line 77, in create_chat_completion
    token_ids = engine.template.convert_messages_to_ids(
  File "/home/intern2/api-for-open-llm/api/templates/base.py", line 56, in convert_messages_to_ids
    token_ids = self.apply_chat_template(
  File "/home/intern2/api-for-open-llm/api/templates/base.py", line 90, in apply_chat_template
    return self.tokenizer.apply_chat_template(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1833, in apply_chat_template
    rendered_chat = compiled_template.render(
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/jinja2/environment.py", line 1304, in render
    self.environment.handle_exception()
  File "/home/intern2/anaconda3/envs/self_instruct/lib/python3.10/site-packages/jinja2/environment.py", line 939, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "<template>", line 4, in top-level template code
TypeError: can only concatenate str (not "list") to str
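The final `TypeError` is raised while rendering the Jinja2 chat template: Qwen2-style templates build the prompt by string concatenation with each message's `content`, so a list-valued `content` fails in exactly this way. A minimal stand-in (not the real template) that reproduces the behavior:

```python
def render_like_template(content):
    # Stand-in for what the chat template effectively does with each
    # message's "content": plain string concatenation.
    return "<|im_start|>user\n" + content + "<|im_end|>\n"

# A string renders fine.
print(render_like_template("Hello"))

# A list raises the same TypeError seen in the server log.
try:
    render_like_template(["Hello"])  # list-valued content
except TypeError as e:
    print(e)  # can only concatenate str (not "list") to str
```

This points at the request payload: if `prompts` (or any `content` field) reaching `tokenizer.apply_chat_template` is a list instead of a string, the server fails with a 500 before generation starts.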
Empress7211 commented 3 months ago

.env file:

# model related
MODEL_NAME=qwen2
MODEL_PATH=/home/intern2/model/qwen/Qwen2-7B-Instruct
PROMPT_NAME=qwen2

# rag related
EMBEDDING_NAME=

RERANK_NAME=

# api related
API_PREFIX=/v1

# vllm related
ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
GPUS=2  # use a plain number for the GPU ID
NUM_GPUs=1
DTYPE=auto

TASKS=llm

TASKS=llm,rag

csyhhu commented 2 months ago

@Empress7211 Has this issue been resolved?

Empress7211 commented 1 month ago

@Empress7211 Has this issue been resolved?

Hello — when I later reviewed the code, the problem turned out to be caused by other files and was unrelated to this project. It has been resolved.