FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.
Apache License 2.0

Question about the num-gpu-batches and gpu-batch-size #98

Open · young-chao opened this issue 1 year ago

young-chao commented 1 year ago

According to batch_size_table.md, where 144 = 48 × 3 (the 144 comes from batch_size_table.md and the 48 × 3 from bench_suite.py), it looks like the overall batch size in FlexGen is the product of num-gpu-batches and gpu-batch-size. But I don't understand the actual meaning of these two parameters. Shouldn't num-gpu-batches be the number of batches, and gpu-batch-size the size of each batch?
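
For reference, here is a minimal sketch of how I currently read these two flags. The interpretation that the effective batch is split into num-gpu-batches micro-batches of gpu-batch-size prompts each is my assumption from the numbers above, not something I have confirmed in the code:

```python
# My assumed reading of the two FlexGen flags (not taken from FlexGen's source):
# --gpu-batch-size:  number of prompts in one micro-batch held on the GPU
# --num-gpu-batches: how many such micro-batches are processed per run

gpu_batch_size = 48   # value used in bench_suite.py
num_gpu_batches = 3   # value used in bench_suite.py

# Effective batch size, which would explain the 144 in batch_size_table.md:
effective_batch_size = gpu_batch_size * num_gpu_batches
assert effective_batch_size == 144
```

Is this the intended meaning, and if so, why are the two exposed separately instead of a single batch-size flag?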