What does this PR do?
This PR grows the batch size dynamically when prefill is called, and shrinks it only when prefill is called again. The intention is to avoid needless recompilation by keeping the batch size unchanged for as long as possible.
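To illustrate the intended behaviour, here is a minimal sketch. It is not the code in this PR: the `DynamicBatch` class, its `prefill` and `finish` methods, and the `resizes` counter (a stand-in for recompilations) are all hypothetical names chosen for the example. It assumes that prefill resizes the batch to exactly the number of active requests, which is one possible reading of the description above.

```python
from typing import Iterable


class DynamicBatch:
    """Hypothetical model of the resize policy (illustrative only)."""

    def __init__(self) -> None:
        self.batch_size = 0           # batch size the model is currently compiled for
        self.active: set[str] = set() # ids of requests still being generated
        self.resizes = 0              # stand-in for the number of recompilations

    def prefill(self, request_ids: Iterable[str]) -> None:
        # Prefill is the only place where the batch size may change.
        self.active.update(request_ids)
        needed = len(self.active)
        if needed != self.batch_size:
            # Grow to fit new requests, or shrink if earlier ones have finished.
            self.batch_size = needed
            self.resizes += 1

    def finish(self, request_id: str) -> None:
        # A request ending during decode frees its slot but keeps the
        # batch size unchanged, so no recompilation is triggered here.
        self.active.discard(request_id)


batch = DynamicBatch()
batch.prefill(["a", "b", "c"])   # grows from 0 to 3
batch.finish("a")
batch.finish("b")                # batch size is still 3
batch.prefill(["d"])             # only now does it shrink, to 2
print(batch.batch_size, batch.resizes)
```

In this sketch the two finished requests cause no resize on their own; the shrink is deferred until the next prefill, so a run of decode steps never changes the compiled shape.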