Why is inference with the converted HDF5 model slower than with Hugging Face?

bytedance / lightseq

LightSeq: A High Performance Library for Sequence Processing and Generation


Open DidaDidaDidaD opened 2 years ago

DidaDidaDidaD commented 2 years ago

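Why is inference with the converted HDF5 model slower than with Hugging Face? Inference on one sentence originally took 0.24 seconds, but after converting the model it takes 0.33 seconds.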
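Single-sentence timings like these can be skewed by one-time costs such as model loading or CUDA initialization. Below is a minimal sketch of a fairer comparison, assuming each model can be wrapped in a zero-argument callable. The helper name `mean_latency` and its parameters are hypothetical and not part of LightSeq or Hugging Face.

```python
import time


def mean_latency(fn, warmup=5, iters=20, sync=None):
    """Return the mean wall-clock seconds per call of fn().

    fn     -- zero-argument callable that runs one inference (hypothetical wrapper)
    warmup -- untimed calls made first, to absorb one-time setup costs
    iters  -- number of timed calls
    sync   -- optional callable that blocks until queued device work is done
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure warm-up work has finished before timing starts
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for the timed work to finish before stopping the clock
    return (time.perf_counter() - start) / iters
```

When timing GPU inference with PyTorch, `torch.cuda.synchronize` can be passed as `sync`, since CUDA kernels are launched asynchronously.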

neopro12 commented 2 years ago

Your GPU may not support Tensor Cores for FP16. You can try building LightSeq in FP32 mode: ENABLE_FP32=1 pip3 install -e $PROJECT_DI
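One rough way to check this suggestion is to look at the GPU's CUDA compute capability. This is a sketch, assuming PyTorch is installed. The function names are hypothetical. Tensor Cores first appeared with compute capability 7.0 (Volta), so a lower value rules them out. A value of 7.0 or higher is only a necessary condition, because some cards at that level (for example the GTX 16 series) ship without Tensor Cores.

```python
def may_have_fp16_tensor_cores(major, minor=0):
    """Rough check: Tensor Cores require compute capability 7.0 or higher.

    A True result does not guarantee Tensor Cores are present; a False
    result means the GPU predates them.
    """
    return (major, minor) >= (7, 0)


def current_gpu_capability():
    """Return (major, minor) for GPU 0, or None if it cannot be queried."""
    try:
        import torch  # assumption: PyTorch is available in the environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = current_gpu_capability()
    if cap is None:
        print("No CUDA GPU could be queried.")
    elif may_have_fp16_tensor_cores(*cap):
        print(f"Compute capability {cap}: FP16 Tensor Cores may be available.")
    else:
        print(f"Compute capability {cap}: no Tensor Cores; an FP32 build may be faster.")
```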