Closed (yuanlehome closed 2 days ago)
Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
Project coverage is 55.80%. Comparing base (65e721e) to head (4821cd6). Report is 6 commits behind head on develop.
Files | Patch % | Lines |
---|---|---|
...dlenlp/experimental/transformers/llama/modeling.py | 0.00% | 2 Missing :warning: |
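As discussed, the llama3 model currently runs inference correctly in dynamic-graph mode in the non-fused scenario, but in the fused scenario inference has a multi-process problem, which is left for later investigation. In addition, src_length cannot be set for inference during dynamic-to-static conversion, and under high-performance inference generation does not stop correctly at EOS. @yuanlehome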
PR types
New features
PR changes
Others
Description
Add inference support for llama3 (wint8|4/a8w8).