Closed akhoroshev closed 1 week ago
@MartinMarciniszyn @Shixiaowei02 Could you please help to comment on this issue? Thanks.
The batched logits postprocessor has this signature:
using LogitsPostProcessorBatched = std::function<void(std::vector<IdType> const&, std::vector<Tensor>&,
std::vector<std::reference_wrapper<BeamTokens const>> const&, StreamPtr const&)>;
See types.h for details.
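For illustration, here is a minimal sketch of a callback that matches the batched signature quoted above. The type aliases are simplified stand-ins I made up so the snippet compiles on its own (the real `IdType`, `Tensor`, `BeamTokens`, and `StreamPtr` are defined in types.h and will differ), and `makeBanTokenZero` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <memory>
#include <vector>

// Stand-in types (assumptions for this sketch only; see types.h for the real ones).
using IdType = std::uint64_t;
using Tensor = std::vector<float>;                          // real Tensor is not a std::vector
using BeamTokens = std::vector<std::vector<std::int32_t>>;  // tokens per beam
struct Stream {};                                           // placeholder for a stream object
using StreamPtr = std::shared_ptr<Stream>;

// Same shape as the signature quoted in the comment above.
using LogitsPostProcessorBatched = std::function<void(std::vector<IdType> const&, std::vector<Tensor>&,
    std::vector<std::reference_wrapper<BeamTokens const>> const&, StreamPtr const&)>;

// Hypothetical example callback: masks token 0 for every request in the batch.
inline LogitsPostProcessorBatched makeBanTokenZero()
{
    return [](std::vector<IdType> const& reqIds, std::vector<Tensor>& logits,
               std::vector<std::reference_wrapper<BeamTokens const>> const& tokens, StreamPtr const&)
    {
        // One entry per request: the i-th id, logits tensor, and token history belong together.
        assert(reqIds.size() == logits.size() && reqIds.size() == tokens.size());
        for (std::size_t i = 0; i < reqIds.size(); ++i)
        {
            if (!logits[i].empty())
            {
                logits[i][0] = -std::numeric_limits<float>::infinity();
            }
        }
    };
}
```

The point of the sketch is only that the batched form receives parallel vectors and is called once for the whole batch, so the callback does its own loop over requests.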
@MartinMarciniszyn batch_manager does not accept that signature. Your link is from the Executor API, not the batch_manager API.
Setting the batched logits processor is not exposed on GptManager. Please use the Executor API for this functionality.
batch_manager::GenericLlmRequest has logitsPostProcessor with type std::function<void(RequestIdType, TensorPtr&, BeamTokens const&, TStream const&)> and an mApplyLogitsPostProcessorBatched option.
How can this type of callback handle a batch of requests?
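To make the question concrete, here is a rough sketch of what the per-request signature implies: the callback sees exactly one request per invocation, so covering a batch means calling it once per request. The aliases are simplified stand-ins and `applyPerRequest` is a hypothetical helper written for this comment; it is not a claim about how batch_manager dispatches internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Stand-in types (assumptions for this sketch only).
using RequestIdType = std::uint64_t;
using TensorPtr = std::shared_ptr<std::vector<float>>;      // real TensorPtr points to a library tensor
using BeamTokens = std::vector<std::vector<std::int32_t>>;
struct TStream {};                                          // placeholder for the stream type

// Same shape as the per-request callback type on GenericLlmRequest quoted above.
using LogitsPostProcessor = std::function<void(RequestIdType, TensorPtr&, BeamTokens const&, TStream const&)>;

// Hypothetical helper: applies a per-request callback to each request in turn.
inline void applyPerRequest(LogitsPostProcessor const& cb, std::vector<RequestIdType> const& ids,
    std::vector<TensorPtr>& logits, std::vector<BeamTokens> const& tokens, TStream const& stream)
{
    assert(ids.size() == logits.size() && ids.size() == tokens.size());
    for (std::size_t i = 0; i < ids.size(); ++i)
    {
        // Each call receives a single request id, one logits tensor, and that request's tokens.
        cb(ids[i], logits[i], tokens[i], stream);
    }
}
```

With this shape, nothing in the callback's arguments identifies the other requests in the batch, which is why the per-request type and a batched option look mismatched.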