Open activezhao opened 3 months ago
It's not a bug; it's a limitation of the tokenizer. Some characters need two token_ids to represent, so you have to decode them together. I'm not sure whether the latest tensorrt_llm_bls has enhanced streaming decode, but I did a quick-fix workaround by decoding token_ids on the client side.
Hi @handoku, could you please tell me how to solve this problem?
In fact, I found that others have also met this problem, and they suggested using BLS to solve it, but I do not know the details of the solution.
Thanks so much.
Try setting accumulate_tokens, following this. But it seems that it will then decode all generated tokens again every time a single new token_id shows up.
@handoku Got it, thank you so much, I will try it.
Hi @handoku I learned what the function of accumulate_tokens is.
The BLS model has an optional parameter accumulate_tokens which can be used in streaming mode to call the postprocessing model with all accumulated tokens, instead of only one token. This might be necessary for certain tokenizers.
parameters: {
  key: "accumulate_tokens"
  value: {
    string_value: "${accumulate_tokens}"
  }
}
I set accumulate_tokens to true, but it does not work; � still occurred.
Could you please give me more suggestions?
Thanks.
parameters: {
  key: "accumulate_tokens"
  value: {
    string_value: "true"
  }
}
Have you tried sending a request with stream=false? That would confirm whether it's a tokenizer decoding issue or an accuracy issue.
@handoku Yes, if stream=false, the Chinese in the inference results is not garbled.
But I need to use streaming mode.
In fact, I tried accumulate_tokens mode, and the garbled character does get replaced, but in streaming mode I want each chunk to carry just the next normal word, not the whole accumulated response every time.
data: {"context_logits":0.0,"cum_log_probs":-0.35588228702545168,"generation_logits":0.0,"model_name":"tensorrt_llm_bls","model_version":"1","output_log_probs":[-0.009281360544264317,-0.0006639180355705321,-0.0002992600784637034,-0.0067407069727778439,-0.000007033372639853042,-0.00038464312092401087,-0.0027929767966270448,-0.04418620467185974,-0.0034859071020036937,-0.0005986097385175526,-0.0004702720034401864,-0.0023520810063928367,-0.00009978315210901201,-0.0003041491436306387,-0.003698743646964431,-0.06340079754590988,-0.0034113232977688314,-0.00033312622690573335,-0.00041213183430954814,-0.0017900982638821006,-0.00021764023404102772,-0.0004938868223689497,-0.0001466381800128147,-0.0022674258798360826,-0.08003108203411102,-0.0024388942401856186,-0.00025841951719485223,-0.0005590689834207296,-0.00748544093221426,-0.00010121380910277367,-9.536747711536009e-7,-0.0005720701883547008,-0.001312699867412448,-0.049526430666446689,-0.002000842010602355,-0.00032489807927049696,-0.0005064100841991603,-0.008147197775542736,-0.0008537836838513613,-0.00040926961810328066,-0.0009865857427939773,-0.04962138459086418,-0.0016011294210329652,-0.0001255352544831112,-0.0002481649280525744,-0.0009321144898422062],"text_output":"3: \"颗粒剂\", 4: \"注射剂\", 5: \"口服散剂\", 6: \"滴丸剂\", 7: \"灌肠剂\", 8: \"�"}
data: {"context_logits":0.0,"cum_log_probs":-0.35588598251342776,"generation_logits":0.0,"model_name":"tensorrt_llm_bls","model_version":"1","output_log_probs":[-0.009281360544264317,-0.0006639180355705321,-0.0002992600784637034,-0.0067407069727778439,-0.000007033372639853042,-0.00038464312092401087,-0.0027929767966270448,-0.04418620467185974,-0.0034859071020036937,-0.0005986097385175526,-0.0004702720034401864,-0.0023520810063928367,-0.00009978315210901201,-0.0003041491436306387,-0.003698743646964431,-0.06340079754590988,-0.0034113232977688314,-0.00033312622690573335,-0.00041213183430954814,-0.0017900982638821006,-0.00021764023404102772,-0.0004938868223689497,-0.0001466381800128147,-0.0022674258798360826,-0.08003108203411102,-0.0024388942401856186,-0.00025841951719485223,-0.0005590689834207296,-0.00748544093221426,-0.00010121380910277367,-9.536747711536009e-7,-0.0005720701883547008,-0.001312699867412448,-0.049526430666446689,-0.002000842010602355,-0.00032489807927049696,-0.0005064100841991603,-0.008147197775542736,-0.0008537836838513613,-0.00040926961810328066,-0.0009865857427939773,-0.04962138459086418,-0.0016011294210329652,-0.0001255352544831112,-0.0002481649280525744,-0.0009321144898422062,-0.000003695494797284482],"text_output":"3: \"颗粒剂\", 4: \"注射剂\", 5: \"口服散剂\", 6: \"滴丸剂\", 7: \"灌肠剂\", 8: \"栓"}
data: {"context_logits":0.0,"cum_log_probs":-0.3562590479850769,"generation_logits":0.0,"model_name":"tensorrt_llm_bls","model_version":"1","output_log_probs":[-0.009281360544264317,-0.0006639180355705321,-0.0002992600784637034,-0.0067407069727778439,-0.000007033372639853042,-0.00038464312092401087,-0.0027929767966270448,-0.04418620467185974,-0.0034859071020036937,-0.0005986097385175526,-0.0004702720034401864,-0.0023520810063928367,-0.00009978315210901201,-0.0003041491436306387,-0.003698743646964431,-0.06340079754590988,-0.0034113232977688314,-0.00033312622690573335,-0.00041213183430954814,-0.0017900982638821006,-0.00021764023404102772,-0.0004938868223689497,-0.0001466381800128147,-0.0022674258798360826,-0.08003108203411102,-0.0024388942401856186,-0.00025841951719485223,-0.0005590689834207296,-0.00748544093221426,-0.00010121380910277367,-9.536747711536009e-7,-0.0005720701883547008,-0.001312699867412448,-0.049526430666446689,-0.002000842010602355,-0.00032489807927049696,-0.0005064100841991603,-0.008147197775542736,-0.0008537836838513613,-0.00040926961810328066,-0.0009865857427939773,-0.04962138459086418,-0.0016011294210329652,-0.0001255352544831112,-0.0002481649280525744,-0.0009321144898422062,-0.000003695494797284482,-0.0003730754542630166],"text_output":"3: \"颗粒剂\", 4: \"注射剂\", 5: \"口服散剂\", 6: \"滴丸剂\", 7: \"灌肠剂\", 8: \"栓剂"}
Do you have a better solution?
Thanks.
This is as expected. accumulate_tokens only ensures that the final result is correct. As said before, it's a limitation of the tokenizer.
A simple way is to add some dirty work on the client side, like removing redundant words or discarding abnormal sentences.
Or you could add check logic in tensorrt_llm_bls: if the decoded tokens are normal, send a response with one word; otherwise do nothing. To do that you probably need to maintain the state of the latest several decoded tokens for each request. It's not as easy as turning on accumulate_tokens, but if you look through the BLS code, you can do it.
Or, wait for someone else's more convenient solution.
@handoku Thanks for your reply.
I also think adding a sliding window during decoding may be a good way.
For example, create a sliding window with a length of 4 and accumulate each new token into the window. We only decode the tokens within the window, then return the first character.
In this way, we can ensure that there are no garbled characters while also avoiding decoding all tokens every time.
What do you think?
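A minimal sketch of this buffering idea (a simplified variant of the sliding window: instead of a fixed length of 4, it holds tokens only while the decoded text still ends in U+FFFD). The names are illustrative and not from the actual tensorrt_llm_bls code; in practice decode_fn would be tokenizer.decode.

```python
class StreamDecoder:
    """Buffers streamed token_ids until they decode to complete characters."""

    def __init__(self, decode_fn):
        self.decode_fn = decode_fn  # maps a list of token_ids to text
        self.pending = []           # token_ids whose text is not yet complete

    def push(self, token_id):
        """Add one streamed token_id; return text only once it is complete."""
        self.pending.append(token_id)
        text = self.decode_fn(self.pending)
        # A trailing U+FFFD ("�") means the last character's bytes are still
        # incomplete, so hold the buffer and wait for the next token.
        if text.endswith("\ufffd"):
            return ""
        self.pending = []
        return text


# Stand-in for tokenizer.decode: here each "token" is just a chunk of UTF-8 bytes.
decode = lambda ids: b"".join(ids).decode("utf-8", errors="replace")
d = StreamDecoder(decode)
d.push(b"\xe6\xa0")  # first two bytes of "栓": incomplete, returns ""
d.push(b"\x93")      # final byte arrives: returns "栓"
```

This emits each character exactly once and at most one token late, rather than re-decoding the whole response on every new token.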
I'm actually looking at the decode.py file in bls and trying to figure out the best way to handle this.
To resolve the � char in accumulate_tokens: true mode, we can simply add errors='ignore' to the tokenizer.decode call in the postprocessing script. This will strip all � outputs.
@wxsms So cool, could you please give more details, such as the code?
Thanks.
For example (in postprocessing/1/model.py):

def _postprocessing(self, tokens_batch, sequence_lengths):
    outputs = []
    for batch_idx, beam_tokens in enumerate(tokens_batch):
        for beam_idx, tokens in enumerate(beam_tokens):
            seq_len = sequence_lengths[batch_idx][beam_idx]
            output = self.tokenizer.decode(
                tokens[:seq_len],
                skip_special_tokens=self.skip_special_tokens,
                errors='ignore')
            outputs.append(output.encode('utf8'))
    return outputs
@wxsms Got it, thanks.
But will the � character be retained, or just skipped?
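At the plain byte-decoding level (which is what the errors= argument ultimately controls), an incomplete sequence is dropped entirely with 'ignore' rather than kept as �, which can be checked directly:

```python
# "栓" is encoded as the three UTF-8 bytes e6 a0 93; the first two bytes
# alone form an incomplete sequence.
partial = b"\xe6\xa0"
print(partial.decode("utf-8", errors="replace"))  # "\ufffd", rendered as �
print(partial.decode("utf-8", errors="ignore"))   # "": the bytes are dropped
```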
Hi @handoku I found a problem when using accumulate_tokens.
When the prompt and parameters are the same, the ensemble and tensorrt_llm_bls APIs give different results.
curl -X POST localhost:8820/v2/models/tensorrt_llm_bls/generate_stream
curl -X POST localhost:8820/v2/models/tensorrt_llm_bls/generate_stream -d '{"text_input": "\u003creponame\u003ecommon\n\u003cneighbor\u003e\u003cfilename\u003evalue\u003ccodeblock\u003e// Compare this snippet from waitpush/DrugRemindPush.go:...\u003cneighbor\u003e\u003cfilename\u003ekey\u003ccodeblock\u003eDrugRemindPush.go\u003cfilename\u003edosage_form.go\n\u003c|fim▁begin|\u003e\u003creponame\u003eprogramming-language-demo\n\u003cneighbor\u003e\u003cfilename\u003eprime-number.go\u003ccodeblock\u003e// }\n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError()\n// func main()\n// func isPrime(n int) bool\n// Compare this snippet from go/prime-number.go:\n// package main\n// \n// import (\n// \"fmt\"\n// \"os\"\n// \"strconv\"\n// )\n// \n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// \n// func exitWithError() {\n// fmt.Println(\"Usage: please input a non-negative integer\")\n// os.Exit(1)\n// }\n// \n// func main() {\n// if len(os.Args) != 2 {\n// exitWithError()\n// }\n// \n// n, err := strconv.Atoi(os.Args[1])\n// if err != nil || n \u003c 0 {\n// exitWithError()\n// }\n// \n// if isPrime(n) {\n// fmt.Println(\"Prime\")\n// } else {\n// fmt.Println(\"Composite\")\n// }\n// }\u003cneighbor\u003e\u003cfilename\u003eprime-number.go\u003ccodeblock\u003e// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError() {\n// fmt.Println(\"Usage: please input a non-negative integer\")\n// os.Exit(1)\n// }\n// func main() {\n// if len(os.Args) != 2 {\n// exitWithError()\n// }\n// \n// n, err := strconv.Atoi(os.Args[1])\n// if err != nil || n \u003c 0 {\n// 
exitWithError()\n// }\n// \n// if isPrime(n) {\n// fmt.Println(\"Prime\")\n// } else {\n// fmt.Println(\"Composite\")\n// }\n// }\n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError()\n// func main()\n// func isPrime(n int) bool\n// Compare this snippet from go/prime-number.go:\n// package main\n// \n// import (\n// \"fmt\"\n// \"os\"\n// \"strconv\"\n// )\n// \n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// \n// func exitWithError() {\u003cneighbor\u003e\u003cfilename\u003elongest-word.go\u003ccodeblock\u003e// Variables from import file go/longest-word.go can be referenced:\n// errorMessage = \"Usage: please provide a string\"\n// Functions from import file go/longest-word.go can be referenced:\n// func longestWordLength(str string) int {\n// words := strings.FieldsFunc(str, isLimitedWhitespace)\n// return longestStringLength(words)\n// }\n// func isLimitedWhitespace(r rune) bool {\n// return strings.ContainsRune(\" \\t\\n\\r\", r)\n// }\n// func longestStringLength(strs []string) (longest int) {\n// for _, str := range strs {\n// if len(str) \u003e longest {\n// longest = len(str)\n// }\n// }\n// return\n// }\n// Functions from import file go/longest-word.go can be referenced:\n// func longestWordLength(str string) int\n// func isLimitedWhitespace(r rune) bool\n// func longestStringLength(strs []string) (longest int)\u003cneighbor\u003e\u003cfilename\u003efactorial.go\u003ccodeblock\u003e// Functions from import file go/factorial.go can be referenced:\n// func exitWithError(msg string) {\n// fmt.Println(msg)\n// os.Exit(1)\n// }\n// func factorial(n uint64) uint64 {\n// if n \u003c= 0 
{\n// return 1\n// }\n// return n * factorial(n-1)\n// }\n// Functions from import file go/factorial.go can be referenced:\n// func exitWithError(msg string)\n// func factorial(n uint64) uint64\u003cfilename\u003elongest-common-subsequence.go\n\u003ccodecontent\u003epackage main\nimport (\n \"encoding/json\"\n \"fmt\"\n \"os\"\n \"regexp\"\n \"strconv\"\n \"strings\"\n)\n//exitWithError\n\u003c|fim▁end|\u003e}\n\u003c|fim▁hole|\u003e", "max_tokens": 50, "bad_words": "", "stop_words": "", "stream": false, "temperature": 0.2, "top_p": 0.95, "return_log_probs": true, "generation_logits": true}'
The result is:
data: {"context_logits":0.0,"cum_log_probs":-77.98719787597656,"generation_logits":0.0,"model_name":"tensorrt_llm_bls","model_version":"1","output_log_probs":[-1.3984918594360352,-3.991654872894287,-2.127605676651001,-0.18318799138069154,-0.15039844810962678,-0.3713747262954712,-2.1666009426116945,-0.03320259973406792,-0.6704073548316956,-3.395005941390991,-6.215298652648926,-3.6144485473632814,-3.8179116249084474,-1.1550722122192383,-1.0524828433990479,-0.32207995653152468,-0.4670903980731964,-5.648696422576904,-3.6973865032196047,-3.8024346828460695,-0.13288161158561707,-3.7232208251953127,-2.065372943878174,-0.026736034080386163,-0.30800527334213259,-0.15478214621543885,-3.5880002975463869,-2.564371109008789,-1.118330717086792,-0.008484973572194577,-1.2587940692901612,-0.5912411212921143,-2.966789484024048,-2.6259653568267824,-0.009489176794886589,-0.018396474421024324,-0.12405481934547425,-2.876150131225586,-0.15892530977725984,-3.3690268993377687,-3.163250684738159,-1.4551129341125489,-0.021045353263616563,-0.0005316358874551952,-0.05893709510564804,-1.1418265104293824,-0.00010598267544992268,-0.03211848437786102,-0.10972829163074494,-0.03469150885939598],"text_output":"//findLCS\n//main\n//func removeWhiteSpace\n//func processCommandLineArgs\n//func main() {\n// var (\n// lcs = findLCS(os.Args[1"}
The part of text_output is:
//findLCS
//main
//func removeWhiteSpace
//func processCommandLineArgs
//func main() {
// var (
// lcs = findLCS(os.Args[1
curl -X POST localhost:8820/v2/models/ensemble/generate_stream
curl -X POST localhost:8820/v2/models/ensemble/generate_stream -d '{"text_input": "\u003creponame\u003ecommon\n\u003cneighbor\u003e\u003cfilename\u003evalue\u003ccodeblock\u003e// Compare this snippet from waitpush/DrugRemindPush.go:...\u003cneighbor\u003e\u003cfilename\u003ekey\u003ccodeblock\u003eDrugRemindPush.go\u003cfilename\u003edosage_form.go\n\u003c|fim▁begin|\u003e\u003creponame\u003eprogramming-language-demo\n\u003cneighbor\u003e\u003cfilename\u003eprime-number.go\u003ccodeblock\u003e// }\n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError()\n// func main()\n// func isPrime(n int) bool\n// Compare this snippet from go/prime-number.go:\n// package main\n// \n// import (\n// \"fmt\"\n// \"os\"\n// \"strconv\"\n// )\n// \n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// \n// func exitWithError() {\n// fmt.Println(\"Usage: please input a non-negative integer\")\n// os.Exit(1)\n// }\n// \n// func main() {\n// if len(os.Args) != 2 {\n// exitWithError()\n// }\n// \n// n, err := strconv.Atoi(os.Args[1])\n// if err != nil || n \u003c 0 {\n// exitWithError()\n// }\n// \n// if isPrime(n) {\n// fmt.Println(\"Prime\")\n// } else {\n// fmt.Println(\"Composite\")\n// }\n// }\u003cneighbor\u003e\u003cfilename\u003eprime-number.go\u003ccodeblock\u003e// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError() {\n// fmt.Println(\"Usage: please input a non-negative integer\")\n// os.Exit(1)\n// }\n// func main() {\n// if len(os.Args) != 2 {\n// exitWithError()\n// }\n// \n// n, err := strconv.Atoi(os.Args[1])\n// if err != nil || n \u003c 0 {\n// exitWithError()\n// 
}\n// \n// if isPrime(n) {\n// fmt.Println(\"Prime\")\n// } else {\n// fmt.Println(\"Composite\")\n// }\n// }\n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// Functions from import file go/prime-number.go can be referenced:\n// func exitWithError()\n// func main()\n// func isPrime(n int) bool\n// Compare this snippet from go/prime-number.go:\n// package main\n// \n// import (\n// \"fmt\"\n// \"os\"\n// \"strconv\"\n// )\n// \n// func isPrime(n int) bool {\n// if n \u003c 2 {\n// return false\n// } else {\n// for i := 2; i \u003c= n/2; i++ {\n// if n%i == 0 {\n// return false\n// }\n// }\n// }\n// return true\n// }\n// \n// func exitWithError() {\u003cneighbor\u003e\u003cfilename\u003elongest-word.go\u003ccodeblock\u003e// Variables from import file go/longest-word.go can be referenced:\n// errorMessage = \"Usage: please provide a string\"\n// Functions from import file go/longest-word.go can be referenced:\n// func longestWordLength(str string) int {\n// words := strings.FieldsFunc(str, isLimitedWhitespace)\n// return longestStringLength(words)\n// }\n// func isLimitedWhitespace(r rune) bool {\n// return strings.ContainsRune(\" \\t\\n\\r\", r)\n// }\n// func longestStringLength(strs []string) (longest int) {\n// for _, str := range strs {\n// if len(str) \u003e longest {\n// longest = len(str)\n// }\n// }\n// return\n// }\n// Functions from import file go/longest-word.go can be referenced:\n// func longestWordLength(str string) int\n// func isLimitedWhitespace(r rune) bool\n// func longestStringLength(strs []string) (longest int)\u003cneighbor\u003e\u003cfilename\u003efactorial.go\u003ccodeblock\u003e// Functions from import file go/factorial.go can be referenced:\n// func exitWithError(msg string) {\n// fmt.Println(msg)\n// os.Exit(1)\n// }\n// func factorial(n uint64) uint64 {\n// if n \u003c= 0 {\n// return 1\n// 
}\n// return n * factorial(n-1)\n// }\n// Functions from import file go/factorial.go can be referenced:\n// func exitWithError(msg string)\n// func factorial(n uint64) uint64\u003cfilename\u003elongest-common-subsequence.go\n\u003ccodecontent\u003epackage main\nimport (\n \"encoding/json\"\n \"fmt\"\n \"os\"\n \"regexp\"\n \"strconv\"\n \"strings\"\n)\n//exitWithError\n\u003c|fim▁end|\u003e}\n\u003c|fim▁hole|\u003e", "max_tokens": 50, "bad_words": "", "stop_words": "", "stream": false, "temperature": 0.2, "top_p": 0.95, "return_log_probs": true, "generation_logits": true}'
The result is:
data: {"context_logits":0.0,"cum_log_probs":-77.98719787597656,"generation_logits":0.0,"model_name":"tensorrt_llm_bls","model_version":"1","output_log_probs":[-1.3984918594360352,-3.991654872894287,-2.127605676651001,-0.18318799138069154,-0.15039844810962678,-0.3713747262954712,-2.1666009426116945,-0.03320259973406792,-0.6704073548316956,-3.395005941390991,-6.215298652648926,-3.6144485473632814,-3.8179116249084474,-1.1550722122192383,-1.0524828433990479,-0.32207995653152468,-0.4670903980731964,-5.648696422576904,-3.6973865032196047,-3.8024346828460695,-0.13288161158561707,-3.7232208251953127,-2.065372943878174,-0.026736034080386163,-0.30800527334213259,-0.15478214621543885,-3.5880002975463869,-2.564371109008789,-1.118330717086792,-0.008484973572194577,-1.2587940692901612,-0.5912411212921143,-2.966789484024048,-2.6259653568267824,-0.009489176794886589,-0.018396474421024324,-0.12405481934547425,-2.876150131225586,-0.15892530977725984,-3.3690268993377687,-3.163250684738159,-1.4551129341125489,-0.021045353263616563,-0.0005316358874551952,-0.05893709510564804,-1.1418265104293824,-0.00010598267544992268,-0.03211848437786102,-0.10972829163074494,-0.03469150885939598],"text_output":"//findLCS\n//main\n//func removeWhiteSpace\n//func processCommandLineArgs\n//func main() {\n// var (\n// lcs = findLCS(os.Args[1"}
The relevant part of text_output is:
func exitWithError(msg string) {
fmt.Println(msg)
os.Exit(1)
}
//longestCommonSubsequence
func longestCommonSubsequence(a, b string) string {
In fact, the result of ensemble is the expected one.
I'm confused about why this happens; I think the two results should be the same.
Have you ever met this problem?
Thanks.
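For what it's worth, the client-side quick fix mentioned earlier — re-decoding the accumulated token ids on each step and only emitting text once it no longer ends mid-character — can be sketched as below. The `decode` helper is purely illustrative (each "token id" here is just a raw UTF-8 byte chunk) and stands in for a real detokenizer:

```python
# Sketch of client-side streaming decode: re-decode the accumulated
# token ids on every new token (the accumulate_tokens idea) and emit
# only the newly completed suffix.
def stream_decode(token_stream, decode):
    """Yield only the newly completed text for each incoming token."""
    ids, emitted = [], ""
    for tid in token_stream:
        ids.append(tid)
        full = decode(ids)              # re-decode everything so far
        if full.endswith("\ufffd"):     # trailing bytes are incomplete
            continue                    # wait for the next token
        yield full[len(emitted):]
        emitted = full

# Toy detokenizer: token "ids" are raw UTF-8 byte chunks.
def decode(ids):
    return b"".join(ids).decode("utf-8", errors="replace")

# "你" is 3 bytes in UTF-8; these 2-byte chunks split it across tokens.
chunks = ["你好".encode("utf-8")[i:i + 2] for i in range(0, 6, 2)]
print("".join(stream_decode(chunks, decode)))  # 你好
```

A real client would call the tokenizer's detokenize method on the accumulated ids instead; the U+FFFD check is a heuristic and can hold text back if the model legitimately emits the replacement character.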
@byshiue @Tracin Could you please take a look at this problem?
Thanks.
System Info
CPU x86_64
GPU NVIDIA L20
TensorRT-LLM branch: v0.8.0
CUDA: NVIDIA-SMI 535.154.05 Driver Version: 535.154.05 CUDA Version: 12.3
Who can help?
@kaiyux @byshiue @schetlur-nv
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
I use the following command.
Here is the request.
But I find that the Chinese text in the inference results is garbled.
Expected behavior
The output result is normal.
actual behavior
Some Chinese characters in the inference results are garbled.
additional notes
I suspect there is a problem with character conversion after decoding.
Hope there is a way to solve it.
Thanks.
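A minimal illustration of why streamed Chinese output can come back garbled, assuming a byte-level tokenizer (the 2-byte split below is hypothetical): decoding each streamed piece on its own cuts multi-byte UTF-8 characters in half, while an incremental decoder that carries pending bytes across pieces recovers the text.

```python
import codecs

# "你好" is 6 bytes in UTF-8; a byte-level tokenizer may emit pieces
# that cut a character in half (the 2-byte split here is hypothetical).
pieces = ["你好".encode("utf-8")[i:i + 2] for i in range(0, 6, 2)]

# Naive: decode every streamed piece on its own -> replacement chars.
naive = "".join(p.decode("utf-8", errors="replace") for p in pieces)

# Fix: carry incomplete bytes across pieces with an incremental decoder.
dec = codecs.getincrementaldecoder("utf-8")()
fixed = "".join(dec.decode(p) for p in pieces) + dec.decode(b"", final=True)

print(naive)   # garbled (contains U+FFFD)
print(fixed)   # 你好
```

Decoding the accumulated token ids together on the client side (or using a streaming-aware detokenizer) avoids the replacement characters.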