Open YJHMITWEB opened 1 year ago
It looks like you are using random numbers for some arguments, such as request_output_len, and the random values are invalid.
Hi @byshiue, I see what you mean: --shape only specifies the shape, not the actual value. So the shape of request_output_len only needs to be 1, and the actual value should be set to 30. Is it possible to pass the value to perf_analyzer directly, or does it have to be done in a Python script, assuming perf_analyzer has an API that can be called?
You can pass values to perf_analyzer. For more details, you can ask in the tritonserver repo.
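One way to pass fixed values instead of random ones is perf_analyzer's --input-data option, which accepts a JSON file of real input data. Below is a minimal sketch of generating such a file, not a verified setup for this model. It assumes the per-request shapes shown (without the batch dimension) match the model's config.pbtxt. The token IDs are placeholders, and build_input_data is a hypothetical helper name. Only two of the inputs mentioned in this thread are filled in; the file would need an entry for every required input of the model.

```python
import json


def build_input_data(token_ids, output_len):
    """Build a perf_analyzer --input-data payload with fixed values.

    Assumption: shapes are per request (no batch dimension) and must
    agree with the dims declared in the model's config.pbtxt.
    """
    return {
        "data": [
            {
                # Placeholder prompt tokens; use real tokenizer output in practice.
                "input_ids": {
                    "content": list(token_ids),
                    "shape": [len(token_ids)],
                },
                # A single value: shape is [1], the content is the actual length.
                "request_output_len": {
                    "content": [output_len],
                    "shape": [1],
                },
            }
        ]
    }


def write_input_data(path, token_ids, output_len):
    """Serialize the payload to a JSON file that perf_analyzer can read."""
    with open(path, "w") as f:
        json.dump(build_input_data(token_ids, output_len), f, indent=2)


payload = build_input_data(token_ids=[101, 102, 103, 104], output_len=30)
```

Calling write_input_data("input_data.json", [101, 102, 103, 104], 30) would produce a file that can be given to perf_analyzer with --input-data input_data.json, with the batch size set separately through -b. Whether the remaining inputs (bad_words_list, stop_words_list, request_prompt_embedding) can be omitted depends on how they are declared in config.pbtxt.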
Hi, I am trying to use perf_analyzer on the predefined models in fastertransformer, such as gpt, gptj, and so on. I am very confused about how to properly set the --shape of the different inputs when using perf_analyzer.

For example, given the config.ini of the model:

And given the gpt config.pbtxt under all_models/gpt/fastertransformer:

When I use perf_analyzer, it asks me to specify the --shape of the following inputs: bad_words_list, input_ids, request_output_len, request_prompt_embedding, stop_words_list, so I set them as:

But this gives errors like the following:
It seems there is an illegal memory access. What I expect is a batch size of 10, with each output having a length of 10. For the remaining parameters, I am confused about why and how I should set them.