[Question]: Does a pip-installed PaddleNLP require compiling fast_transformer yourself?

mymynew commented 1 year ago

Please ask your question

Starting from the official paddlepaddle 2.3.2 image, I ran pip install paddlenlp==2.4.0. Running the code generation model example fails with: RuntimeError: (NotFound) There are no kernels which are registered in the strided_slice operator. [Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at /paddle/paddle/fluid/imperative/prepared_operator.cc:341) [operator < strided_slice > error]

Does a pip-installed paddlenlp need fast_transformer to be compiled manually? If so, how do I do that?

sijunhe commented 1 year ago

Hi, please share the exact example you ran, the command, and the full error output.

sijunhe commented 1 year ago

Generally, calling generate with use_fast=True triggers automatic compilation of FasterTransformer.

mymynew commented 1 year ago

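@sijunhe I am deploying the codegen-2B-nl model from examples/code_generation/codegen, started simply with python codegen_server.py and use_fast=True. I see no log output from any automatic compilation. This has been blocking me for many days.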

/home/xxwork/paddlenlp python codegen_server.py
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/image_utils.py:213: DeprecationWarning: BILINEAR is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BILINEAR instead.
  resample=Image.BILINEAR,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/image_utils.py:379: DeprecationWarning: NEAREST is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.NEAREST or Dither.NONE instead.
  resample=Image.NEAREST,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/ernie_vil/feature_extraction.py:65: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/clip/feature_extraction.py:64: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  resample=Image.BICUBIC,
[2023-06-14 16:32:37,158] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/vocab.json
[2023-06-14 16:32:37,158] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/merges.txt
[2023-06-14 16:32:37,158] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/added_tokens.json
[2023-06-14 16:32:37,159] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/special_tokens_map.json
[2023-06-14 16:32:37,159] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/tokenizer_config.json
[2023-06-14 16:32:37,271] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_state.pdparams
[2023-06-14 16:32:37,271] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_config.json
W0614 16:32:37.273700   590 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 6.0, Driver API Version: 11.4, Runtime API Version: 11.2
W0614 16:32:37.279552   590 gpu_resources.cc:91] device: 0, cuDNN Version: 8.1.
[2023-06-14 16:33:59,768] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/vocab.json
[2023-06-14 16:33:59,768] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/merges.txt
[2023-06-14 16:33:59,769] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/added_tokens.json
[2023-06-14 16:33:59,769] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/special_tokens_map.json
[2023-06-14 16:33:59,769] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/tokenizer_config.json
[2023-06-14 16:33:59,879] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_state.pdparams
[2023-06-14 16:33:59,880] [    INFO] - Already cached /root/.paddlenlp/models/Salesforce/codegen-2B-nl/model_config.json
INFO:     Started server process [590]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8978 (Press CTRL+C to quit)
[2023-06-14 16:38:24,952] [    INFO] - Request: {'prompt': 
[2023-06-14 16:38:24,952] [    INFO] - Start generating code
INFO:     10.11.5.54:56799 - "POST /v1/engines/codegen/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/uvicorn/protocols/http/h11_impl.py", line 366, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.7/dist-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/fastapi/applications.py", line 269, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.7/dist-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/exceptions.py", line 93, in __call__
    raise exc
  File "/usr/local/lib/python3.7/dist-packages/starlette/exceptions.py", line 82, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.7/dist-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.7/dist-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 670, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 266, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.7/dist-packages/starlette/routing.py", line 65, in app
    response = await func(request)
  File "/usr/local/lib/python3.7/dist-packages/fastapi/routing.py", line 228, in app
    dependant=dependant, values=values, is_coroutine=is_coroutine
  File "/usr/local/lib/python3.7/dist-packages/fastapi/routing.py", line 160, in run_endpoint_function
    return await dependant.call(**values)
  File "/home/yzgwork/paddlenlp/codegen_server.py", line 101, in gen
    use_fp16_decoding=generate_config.use_fp16_decoding,
  File "/usr/local/lib/python3.7/dist-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/base.py", line 354, in _decorate_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/generation_utils.py", line 888, in generate
    **model_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/generation_utils.py", line 962, in greedy_search
    outputs = self(**model_inputs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 619, in forward
    cache=cache)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 496, in forward
    cache=old_cache)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 262, in forward
    use_cache=use_cache)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 183, in forward
    k_rot = apply_rotary_pos_emb(k_rot, sincos, offset=offset)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 53, in apply_rotary_pos_emb
    return (x * cos) + (rotate_every_two(x) * sin)
  File "/usr/local/lib/python3.7/dist-packages/paddlenlp/transformers/codegen/modeling.py", line 37, in rotate_every_two
    x1 = x[:, :, :, ::2]
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 740, in __getitem__
    return self._getitem_index_not_tensor(item)
RuntimeError: (NotFound) There are no kernels which are registered in the strided_slice operator.
  [Hint: Expected kernels_iter != all_op_kernels.end(), but received kernels_iter == all_op_kernels.end().] (at /paddle/paddle/fluid/imperative/prepared_operator.cc:341)
  [operator < strided_slice > error]
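
Reading the traceback: the call goes through the regular greedy_search path in generation_utils.py, and the last PaddleNLP frame is the stepped slice `x[:, :, :, ::2]` inside `rotate_every_two`. As a rough illustration of what that function computes, here is a pure-Python sketch on one flat vector. It assumes the function follows the usual GPT-J-style rotary embedding (even and odd elements are split, then recombined as interleaved `(-odd, even)` pairs). The real code works on 4-D Paddle tensors, and `rotate_every_two_1d` is a hypothetical name used only for this example.

```python
from typing import List


def rotate_every_two_1d(x: List[float]) -> List[float]:
    """Pair up neighbours (x0, x1), (x2, x3), ... and map each pair to (-x1, x0)."""
    if len(x) % 2 != 0:
        raise ValueError("length must be even")
    x1 = x[::2]   # even positions: the stepped slice that maps to strided_slice in Paddle
    x2 = x[1::2]  # odd positions
    out: List[float] = []
    for even, odd in zip(x1, x2):
        out.extend([-odd, even])
    return out


if __name__ == "__main__":
    print(rotate_every_two_1d([1.0, 2.0, 3.0, 4.0]))  # [-2.0, 1.0, -4.0, 3.0]
```

The `::2` step is the part that needs a strided_slice kernel, which is the operator the error reports as having no registered kernels.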

mymynew commented 1 year ago

The NLP environment was installed on top of the registry.baidubce.com/paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8 image. Dockerfile:

# Runtime environment for Baidu NLP models
FROM registry.baidubce.com/paddlepaddle/paddle:2.3.2-gpu-cuda11.2-cudnn8
# FROM paddlecloud/paddlenlp:develop-gpu-cuda11.2-cudnn8-latest
# Resulting image: paddlecloud/paddlenlp:develop-gpu-cuda11.2-cudnn8-latest-e

# RUN pip install -i https://mirrors.aliyun.com/pypi/simple setuptools-scm
RUN pip install -i https://mirrors.aliyun.com/pypi/simple PaddleNLP
RUN pip install -i https://mirrors.aliyun.com/pypi/simple openai==0.8.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple fastapi==0.79.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple pydantic==1.9.1
RUN pip install -i https://mirrors.aliyun.com/pypi/simple python-dotenv==0.20.0
RUN pip install -i https://mirrors.aliyun.com/pypi/simple sse_starlette==0.10.3
RUN pip install -i https://mirrors.aliyun.com/pypi/simple uvicorn==0.17.6

gongel commented 1 year ago

Hi, you can upgrade paddle to 2.4.2.

mymynew commented 1 year ago

After upgrading to paddlepaddle 2.4.2, use_fast=True runs fine.
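
Since the fix here was a newer PaddlePaddle, a small startup guard could surface the mismatch before the first request fails. The sketch below only compares dotted version strings. `MIN_PADDLE_VERSION`, `parse_version` and `is_supported` are hypothetical names, and the 2.4.2 threshold comes from this thread, not from an official compatibility table. In a real server the installed version string would come from `paddle.__version__`.

```python
from typing import Tuple

MIN_PADDLE_VERSION = "2.4.2"  # taken from this thread, not an official requirement


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.4.2' or '2.4.2.post112' into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post112' or '0rc0'
        parts.append(int(piece))
    return tuple(parts)


def is_supported(installed: str, minimum: str = MIN_PADDLE_VERSION) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for v in ("2.3.2", "2.4.2"):
        print(v, "ok" if is_supported(v) else "too old")
```

With the versions from this thread, 2.3.2 is reported as too old and 2.4.2 passes.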

PaddlePaddle / PaddleNLP

[Question]: Does a pip-installed PaddleNLP require compiling fast_transformer yourself? #6174
