We have implemented a custom postprocessing step in beam search decoding that filters some outputs out of the final beam output. When only one output remains after filtering, we expect the response to be a list of strings of length 1. Instead, the response is a bare string, which breaks response parsing in the client.
For example, output with more than one suggestion:
curl -X POST localhost:8000/v2/models/tensorrt_llm_bls/generate -d '{"input":"nik"}'
{"model_name":"tensorrt_llm_bls","model_version":"1","output":["nike socks","nike sweatpants","nike sweatshirt","nike hoodie","nike womens sneakers","nike sweatpants for men","nike womens sweatpants","nike sweatshirt men","nike shoe laces","nike air max 270 men"]}
Here the output is a list of strings. However, when filtering leaves only one output, the response contains a string instead:
curl -X POST localhost:8000/v2/models/tensorrt_llm_bls/generate -d '{"input":"adult t shirts with dogs on them"}'
{"model_name":"tensorrt_llm_bls","model_version":"1","output":"adult t shirts with dogs on them"}
In this case the output is a string rather than a list of strings, i.e. we expect ["adult t shirts with dogs on them"] and not "adult t shirts with dogs on them".
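As a stopgap on the client side, the response can be normalized before parsing. This is only a minimal sketch, assuming the client receives the raw JSON body as a string; `normalize_output` is a hypothetical helper name, not part of Triton or TensorRT-LLM, and it does not address the server-side behavior itself.

```python
import json


def normalize_output(response_body):
    """Return the "output" field as a list of strings.

    Hypothetical client-side helper: wraps a bare string in a list so that
    single-result and multi-result responses can be parsed the same way.
    """
    payload = json.loads(response_body)
    output = payload.get("output", [])
    if isinstance(output, str):
        # Single remaining suggestion arrived as a plain string.
        return [output]
    # Already a JSON array; copy it into a Python list.
    return list(output)


single = '{"model_name":"tensorrt_llm_bls","model_version":"1","output":"adult t shirts with dogs on them"}'
suggestions = normalize_output(single)
```

With this in place, `suggestions` is a one-element list for the filtered case and the full list for the multi-suggestion case.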
How do we ensure the output is a list every time?
config.pbtxt for tensorrt_llm_bls