Closed mymynew closed 1 year ago
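Hello, please provide the exact example you ran, the command, and the complete error output.
In general, calling generate with use_fast=True triggers automatic compilation of FasterTransformer.
@sijunhe I am deploying the codegen-2B-nl model from examples/code_generation/codegen. I simply start it with python codegen_server.py, with use_fast=True. I see no log output from any automatic compilation, and this has been bothering me for days.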
/home/xxwork/paddlenlp python codegen_server.py
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
resample=Image.BILINEAR,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
resample=Image.NEAREST,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
resample=Image.BICUBIC,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
resample=Image.BICUBIC,
[2023-06-14 16:32:37,158] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/vocab.json
[2023-06-14 16:32:37,158] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/merges.txt
[2023-06-14 16:32:37,158] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/added_tokens.json
[2023-06-14 16:32:37,159] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/special_tokens_map.json
[2023-06-14 16:32:37,159] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/tokenizer_config.json
[2023-06-14 16:32:37,271] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_state.pdparams
[2023-06-14 16:32:37,271] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_config.json
W0614 16:32:37.273700 590 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 6.0, Driver API Version: 11.4, Runtime API Version: 11.2
W0614 16:32:37.279552 590 gpu_resources.cc:91] device: 0, cuDNN Version: 8.1.
[2023-06-14 16:33:59,768] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/vocab.json
[2023-06-14 16:33:59,768] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/merges.txt
[2023-06-14 16:33:59,769] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/added_tokens.json
[2023-06-14 16:33:59,769] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/special_tokens_map.json
[2023-06-14 16:33:59,769] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/tokenizer_config.json
[2023-06-14 16:33:59,879] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_state.pdparams
[2023-06-14 16:33:59,880] [ INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_config.json
INFO: Started server process [590]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8978 (Press CTRL+C to quit)
[2023-06-14 16:38:24,952] [ INFO] - Request: {'prompt':
[2023-06-14 16:38:24,952] [ INFO] - Start generating code
INFO: 10.11.5.54:56799 - "POST /v1/engines/codegen/completions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/uvicorn/protocols/http/h11_impl.py", line 366, in run_asgi
result = await app(self.scope, self.receive, self.send)
File "/usr/local/lib/python3.7/dist-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
return await self.app(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/fastapi/applications.py", line 269, in __call__
await super().__call__(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/starlette/applications.py", line 124, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/usr/local/lib/python3.7/dist-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.7/dist-packages/starlette/exceptions.py", line 93, in __call__
raise exc
File "/usr/local/lib/python3.7/dist-packages/starlette/exceptions.py", line 82, in __call__
await self.app(scope, receive, sender)
File "/usr/local/lib/python3.7/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
raise e
File "/usr/local/lib/python3.7/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 670, in __call__
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 266, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 65, in app
response = await func(request)
File "/usr/local/lib/python3.7/dist-packages/fastapi/routing.py", line 228, in app
dependant=dependant, values=values, is_coroutine=is_coroutine
File "/usr/local/lib/python3.7/dist-packages/fastapi/routing.py", line 160, in run_endpoint_function
return await dependant.call(**values)
File "/home/yzgwork/paddlenlp/codegen_server.py", line 101, in gen
use_fp16_decoding=generate_config.use_fp16_decoding,
File "/usr/local/lib/python3.7/dist-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/base.py", line 354, in _decorate_function
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/generation_utils.py", line 888, in generate
**model_kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/generation_utils.py", line 962, in greedy_search
outputs = self(**model_inputs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 619, in forward
cache=cache)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 496, in forward
cache=old_cache)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 262, in forward
use_cache=use_cache)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
return self._dygraph_call_func(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
outputs = self.forward(*inputs, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 183, in forward
k_rot = apply_rotary_pos_emb(k_rot, sincos, offset=offset)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 53, in apply_rotary_pos_emb
return (x * cos) + (rotate_every_two(x) * sin)
File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 37, in rotate_every_two
x1 = x[:, :, :, ::2]
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 740, in __getitem__
return self._getitem_index_not_tensor(item)
RuntimeError: (NotFound) There are no kernels which are registered in the strided_slice operator.
[Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at /paddle/paddle/fluid/imperative/prepared_operator.cc:341)
[operator < strided_slice > error]
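Note that the traceback runs through greedy_search in generation_utils.py and the regular codegen/modeling.py forward pass, and fails on an ordinary stride-2 slice inside rotate_every_two. As a rough illustration of what that slice does, here is a pure-Python sketch applied to a single row. It assumes the function pairs the even and odd positions the way GPT-J-style rotary embeddings usually do; only the first slice, `x[:, :, :, ::2]`, is actually visible in the traceback, and the real code operates on a 4-D Paddle tensor rather than a list.

```python
def rotate_every_two(row):
    """Pure-Python stand-in for the last-axis logic of rotate_every_two.

    Assumption: even positions (row[::2]) are paired with odd positions
    (row[1::2]) and interleaved as (-odd, even). Only the even slice is
    confirmed by the traceback.
    """
    x1 = row[::2]   # even positions: the stride-2 slice named in the traceback
    x2 = row[1::2]  # odd positions (assumed counterpart)
    out = []
    for even, odd in zip(x1, x2):
        out.extend((-odd, even))  # interleave the rotated pair
    return out


print(rotate_every_two([1.0, 2.0, 3.0, 4.0]))  # [-2.0, 1.0, -4.0, 3.0]
```

On a Paddle tensor the `::2` step is what gets dispatched to the strided_slice operator, which is the operator the error reports as having no registered kernel.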
PaddleNLP was installed on top of the registry.baidubce.com/paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8 image. The Dockerfile:
# Baidu NLP model runtime environment
FROM registry.baidubce.com/paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8
# FROM paddlecloud/paddlenlp:develop-gpu-cuda11.2-cudnn8-latest
# Resulting image: paddlecloud/paddlenlp:develop-gpu-cuda11.2-cudnn8-latest-e
# RUN pip install -i https://mirrors.aliyun.com/pypi/simple setuptools-scm
RUN pip install -i https://mirrors.aliyun.com/pypi/simple PaddleNLP
RUN pip install -i https://mirrors.aliyun.com/pypi/simple openai==0.8.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple fastapi==0.79.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple pydantic==1.9.1
RUN pip install -i https://mirrors.aliyun.com/pypi/simple python-dotenv==0.20.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple sse_starlette==0.10.3
RUN pip install -i https://mirrors.aliyun.com/pypi/simple uvicorn==0.17.6
Hello, you can upgrade paddle to 2.4.2.
After upgrading to paddlepaddle 2.4.2, use_fast=True runs correctly.
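Since the fix here was a Paddle upgrade, a server script could guard against the older runtime at startup. The following is a minimal sketch, not part of codegen_server.py: the helper names (`version_tuple`, `meets_minimum`, `check_paddle`) are hypothetical, and the 2.4.2 threshold is simply the version reported to work in this thread, not an officially documented minimum.

```python
import re

MIN_PADDLE = "2.4.2"  # version reported to work in this thread (assumption, not an official floor)


def version_tuple(version):
    """Turn '2.4.2' or '2.4.2.post112' into (2, 4, 2), ignoring extra suffixes."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))  # pad short versions like '2.4'


def meets_minimum(installed, required=MIN_PADDLE):
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_paddle():
    """Raise early if the installed Paddle is older than MIN_PADDLE."""
    import paddle  # imported lazily so the helpers above work without Paddle installed

    if not meets_minimum(paddle.__version__):
        raise RuntimeError(
            "paddlepaddle %s found, but %s or newer is expected"
            % (paddle.__version__, MIN_PADDLE)
        )
```

Calling `check_paddle()` before loading the model would turn the strided_slice failure at request time into a clear message at startup.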
Please describe your question
On top of the official paddlepaddle 2.3.2 image, I ran pip install paddlenlp==2.4.0. Running the code generation model example then fails with:
RuntimeError: (NotFound) There are no kernels which are registered in the strided_slice operator. [Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at /paddle/paddle/fluid/imperative/prepared_operator.cc:341) [operator < strided_slice > error]
Does a pip-installed paddlenlp require compiling fast_transformer manually? If so, how should that be done?