triton-inference-server / fastertransformer_backend

BSD 3-Clause "New" or "Revised" License

How to terminate a grpc streaming request immediately during tritonserver inference with a FasterTransformer backend? #139

Open songkq opened 1 year ago

songkq commented 1 year ago

In a production environment like ChatGPT, terminating a conversation early on a user-client command is a major requirement. I'm wondering whether a gRPC streaming request can be terminated immediately during Triton Server inference with a FasterTransformer backend. Could you please give some advice?

with grpcclient.InferenceServerClient(self.model_url) as client:
    client.start_stream(callback=partial(stream_callback, result_queue))
    client.async_stream_infer(self.model_name, request_data)
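On the client side, the `tritonclient.grpc` API does expose `stop_stream()`, which closes the gRPC stream so no further responses are delivered. Whether the FasterTransformer backend actually stops generating on the GPU once the stream closes is a separate question (that server-side cancellation is what this issue is really asking about). Below is a minimal sketch of the client-side pattern: a callback drains results into a queue, and the consumer calls `stop_stream()` to cancel early. `FakeStreamingClient` is a hypothetical stand-in for `grpcclient.InferenceServerClient` so the sketch runs without a server; it is not part of the real library.

```python
import threading
import time
from functools import partial
from queue import Queue

# Hypothetical stand-in for tritonclient.grpc.InferenceServerClient so this
# sketch is self-contained; the real client exposes the same three methods
# (start_stream, async_stream_infer, stop_stream) used here.
class FakeStreamingClient:
    def __init__(self):
        self._callback = None
        self._stop = threading.Event()
        self._worker = None

    def start_stream(self, callback):
        self._callback = callback

    def async_stream_infer(self, model_name, request_data):
        def produce():
            # A streaming LLM backend sends one response per generated token.
            for token in ["A", "B", "C", "D", "E"]:
                if self._stop.is_set():
                    return
                self._callback(result=token, error=None)
                time.sleep(0.05)
        self._worker = threading.Thread(target=produce)
        self._worker.start()

    def stop_stream(self):
        # Closes the stream: no more responses reach the callback.
        self._stop.set()
        if self._worker:
            self._worker.join()


def stream_callback(result_queue, result, error):
    # Forward each streamed result (or error) to the consumer.
    result_queue.put(error if error else result)


results = Queue()
client = FakeStreamingClient()
client.start_stream(callback=partial(stream_callback, results))
client.async_stream_infer("fastertransformer", {"prompt": "hi"})

first = results.get()   # user reads the first token ...
client.stop_stream()    # ... then cancels the conversation early

received = [first]
while not results.empty():
    received.append(results.get())
print(received[0], len(received) < 5)
```

Caveat: with the real FasterTransformer backend, closing the stream only stops delivery to the client; the in-flight generation may still run to completion on the server unless the backend itself supports per-request cancellation.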
bigmover commented 1 year ago

Maybe `async_stream_infer` needs a `package_input`?