Closed marianasignal closed 2 months ago
Very good work! Is there any way to speed up inference? For example, by assembling samples into a batch for batch inference?
Thanks for your interest!
As you mentioned, batching samples speeds up inference. The method is here.
Another method is no-gradient inference. These docs will help you.
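To make the batching idea concrete, here is a minimal, framework-agnostic sketch. It is not this project's actual code: `iter_batches`, `run_batched`, and `predict_batch` are hypothetical names made up for illustration, and `predict_batch` stands in for whatever function runs the model on a list of samples.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_batches(samples: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `samples`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def run_batched(
    predict_batch: Callable[[Sequence[T]], Sequence[R]],  # hypothetical model wrapper
    samples: Sequence[T],
    batch_size: int,
) -> List[R]:
    """Call `predict_batch` once per batch and return the results in input order."""
    results: List[R] = []
    for batch in iter_batches(samples, batch_size):
        outputs = predict_batch(batch)
        # Guard against a wrapper that drops or duplicates outputs.
        if len(outputs) != len(batch):
            raise ValueError("predict_batch must return one output per sample")
        results.extend(outputs)
    return results


if __name__ == "__main__":
    # Stand-in for a real model: doubles every sample in the batch.
    def toy_model(batch):
        return [2 * x for x in batch]

    print(run_batched(toy_model, [1, 2, 3, 4, 5], batch_size=2))  # [2, 4, 6, 8, 10]
```

The speed-up comes from calling the model once per batch instead of once per sample. If the model is a PyTorch module (an assumption, since the framework is not named in this thread), the body of `predict_batch` could also call `model.eval()` beforehand and run the forward pass inside `with torch.no_grad():`, which skips building the autograd graph.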