flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
https://flexflow.readthedocs.io
Apache License 2.0
minimum missing components for different LLMs
#694 (Open), opened by xinhaoc 1 year ago
xinhaoc commented 1 year ago
OPT:
- [x] an OPT-specific positional embedding
- [x] bias support in the linear layer
- [ ] some masks inside the attention layer
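For context on the first and third OPT items, here is a minimal sketch in plain Python, not FlexFlow code. It assumes the behaviour of the Hugging Face OPT reference implementation, where the learned positional embedding table is indexed with an offset of 2 and positions are derived from the attention mask. The issue does not say which masks are missing, so the causal mask below is only the usual decoder-only case, shown as an assumption. The names `opt_position_ids` and `causal_mask` are hypothetical.

```python
from itertools import accumulate

# Assumption: offset of 2, as in the Hugging Face OPT implementation.
OPT_POSITION_OFFSET = 2


def opt_position_ids(attention_mask, past_length=0):
    """Return positional-embedding row indices for one sequence.

    attention_mask is a list of 0/1 flags (1 = real token, 0 = padding).
    past_length is the number of leading tokens already held in a KV cache.
    """
    running = list(accumulate(attention_mask))
    # Real tokens count up from 0; padding tokens get -1.
    positions = [count * flag - 1 for count, flag in zip(running, attention_mask)]
    # Drop cached positions, then shift every index by the OPT offset.
    return [p + OPT_POSITION_OFFSET for p in positions[past_length:]]


def causal_mask(seq_len):
    """Return a seq_len x seq_len mask; True means query i may attend to key j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    print(opt_position_ids([1, 1, 1]))
    print(opt_position_ids([0, 0, 1, 1]))
    print(causal_mask(3))
```

The offset is why a generic "position = token index" embedding lookup does not reproduce OPT outputs when loading the released weights.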
Bloom:
- [ ] a permute kernel
- [ ] some changes inside the attention layer
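The issue does not spell out what the permute kernel or the attention changes are for. As a hedged illustration only, the sketch below shows two Bloom traits that plausibly motivate them, assuming the Hugging Face Bloom reference layout: the fused query/key/value projection is stored per head as `[num_heads, 3, head_dim]`, so splitting it into Q, K and V requires a reordering, and attention scores receive an ALiBi bias instead of positional embeddings. The function names are hypothetical, and the slope formula shown covers only head counts that are powers of two.

```python
def split_fused_qkv(fused, num_heads, head_dim):
    """Split one token's fused QKV vector into per-head Q, K and V lists.

    Assumes the interleaved layout [head0: q, k, v | head1: q, k, v | ...].
    """
    if len(fused) != num_heads * 3 * head_dim:
        raise ValueError("fused vector has the wrong length")
    q, k, v = [], [], []
    for head in range(num_heads):
        base = head * 3 * head_dim
        q.append(fused[base : base + head_dim])
        k.append(fused[base + head_dim : base + 2 * head_dim])
        v.append(fused[base + 2 * head_dim : base + 3 * head_dim])
    return q, k, v


def alibi_slopes(num_heads):
    """Return ALiBi slopes for a power-of-two number of heads."""
    if num_heads < 1 or num_heads & (num_heads - 1) != 0:
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]


def alibi_bias(slope, seq_len):
    """Return the additive bias for one head: -slope * (query_pos - key_pos)."""
    return [[-slope * (i - j) if j <= i else 0.0 for j in range(seq_len)]
            for i in range(seq_len)]


if __name__ == "__main__":
    q, k, v = split_fused_qkv(list(range(12)), num_heads=2, head_dim=2)
    print(q, k, v)
    print(alibi_slopes(8))
```

In this sketch the entries above the diagonal of `alibi_bias` are left at 0.0 because a causal mask would remove them anyway.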
xinhaoc commented 1 year ago
opt weights and license
OPT:
Bloom: