Is your feature request related to a problem? Please describe.
When using mindnlp to run GPT2 inference, I found that generation is roughly 10x slower than the equivalent PyTorch implementation.
Here is the PyTorch implementation I compared against: https://github.com/graykode/gpt-2-Pytorch
The hardware I use is an Nvidia V100 GPU.
MindSpore versions tested: 2.2.12 and 2.1.1
PyTorch version: 2.2.0
Arguments I use: ms_dtype=mindspore.float16, use_cache=True
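For reference, this is the kind of timing harness I used to compare the two backends. It is a minimal, model-agnostic sketch: the `dummy_generate` stand-in and the commented `model.generate(...)` call are illustrative assumptions, not the exact mindnlp API.

```python
import time

def measure_throughput(generate_fn, num_tokens):
    """Time a single generation call and return tokens per second."""
    start = time.perf_counter()
    generate_fn()
    elapsed = time.perf_counter() - start
    return num_tokens / elapsed

# Stand-in for the real generation call. With mindnlp this would be
# something like model.generate(input_ids, max_new_tokens=100,
# use_cache=True) -- hypothetical usage, actual call names may differ.
def dummy_generate():
    time.sleep(0.01)  # simulate generation latency

tps = measure_throughput(dummy_generate, num_tokens=100)
print(f"{tps:.1f} tokens/sec")
```

Running the same harness against both the mindnlp and PyTorch models (same prompt, same number of generated tokens) is how the ~10x gap was observed.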