intel / xFasterTransformer

performance issue for opt-1.3b with BS=1 BF16 #339

Open bin1guo opened 4 months ago

bin1guo commented 4 months ago

I tested the OPT-1.3B model on an EMR platform with 52 cores. The performance with BS=1 does not look right: the gap between BS=1 and BS=2 is too large.

numactl -C 0-51 -m 0 ./run_benchmark.sh -m opt-1_3b -d bf16 -s 1 -bs 1 -in 128 -out 15 -i 10

Results with BS=1: (image)

Results with BS=2: (image)
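
To compare the two data points under identical settings, the command above can be wrapped in a loop that changes only -bs. This is a minimal sketch, assuming the BS=2 run used the same flags apart from -bs (the issue does not show that command). The RUN switch and the log file names are hypothetical and are not part of run_benchmark.sh.

```shell
#!/usr/bin/env bash
# Sketch: print (or, with RUN=1, execute) the benchmark command from this
# issue for batch sizes 1 and 2, changing only the -bs flag.

bench_cmd() {
  # $1 = batch size; every other flag is copied from the command in the issue.
  echo "numactl -C 0-51 -m 0 ./run_benchmark.sh -m opt-1_3b -d bf16 -s 1 -bs $1 -in 128 -out 15 -i 10"
}

for bs in 1 2; do
  cmd="$(bench_cmd "$bs")"
  echo "$cmd"
  if [ "${RUN:-0}" = "1" ]; then
    # Hypothetical log name, chosen here only to keep the two runs apart.
    $cmd 2>&1 | tee "opt-1_3b_bf16_bs${bs}.log"
  fi
done
```

Without RUN=1 the script only prints the two commands, so the flags can be checked before anything is launched.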

pujiang2018 commented 3 months ago

@bin1guo Do we still need to benchmark the OPT model? I suggest running the Llama model instead.