huggingface / transformers-bloom-inference

Fast Inference Solutions for BLOOM
Apache License 2.0

Incorrectly benchmarking #72

Open JoeyTPChou opened 1 year ago

JoeyTPChou commented 1 year ago

All 3 scripts under bloom-inference-scripts benchmark the t_generate_span time incorrectly. t_generate_span is taken from the first generate() call (the warmup) here https://github.com/huggingface/transformers-bloom-inference/blob/main/bloom-inference-scripts/bloom-ds-inference.py#L257 instead of being measured inside the benchmark cycle.
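
For illustration, here is a minimal sketch of what timing inside the benchmark cycle could look like. The helper name benchmark_generate, the generate_fn callable, and the cycles parameter are hypothetical placeholders, not code from the repository's scripts:

```python
import time

import torch


def benchmark_generate(generate_fn, cycles=5):
    """Hypothetical helper: time generate() inside the benchmark loop,
    excluding the initial warmup call from the measurement."""
    # Warmup call -- compilation / cache effects land here, not in the timing.
    generate_fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()

    t0 = time.time()
    for _ in range(cycles):
        generate_fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()

    # Average per-call latency over the benchmark cycles.
    t_generate_span = (time.time() - t0) / cycles
    return t_generate_span
```

The key point is that the warmup generate() is called once before the timer starts, so t_generate_span only reflects the repeated calls in the benchmark cycle.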