Muxv opened this issue 2 months ago
In TensorRT-LLM, it is possible to integrate a LogitsProcessor during model inference to control the inference process's behavior. Is it feasible to add a similar interface to the tensorrtllm backend so that a LogitsProcessor can be used there as well?
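For context, a logits processor is a per-step hook that receives the model's raw next-token logits and returns a modified version (e.g., to ban certain tokens or enforce constraints) before sampling. Below is a minimal illustrative sketch of the general pattern; the function names and call signature here are hypothetical and do not reflect the actual TensorRT-LLM or backend API:

```python
import math


def make_ban_tokens_processor(banned_token_ids):
    """Build a logits processor that forbids the given token ids.

    Illustrative only: real frameworks (e.g., Hugging Face transformers'
    LogitsProcessor) use a similar contract -- take the generated ids so
    far plus the next-token logits, return adjusted logits.
    """
    banned = set(banned_token_ids)

    def process(input_ids, logits):
        # Force banned tokens' logits to -inf so they can never be sampled.
        return [
            -math.inf if token_id in banned else logit
            for token_id, logit in enumerate(logits)
        ]

    return process


# Usage: token id 1 is banned, so its logit becomes -inf.
processor = make_ban_tokens_processor([1])
adjusted = processor([0, 5], [0.5, 2.0, 1.0])
```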
Currently, the TRT-LLM backend does not support this.