When we call LLMEngine.step() (through vllm or lmi-dist), it's not guaranteed that we receive and output for every active request. If on the first call to step for a request we do not generate a token, we never initialize the sequence dict that is part of TextGenerationOutput. Later in the flow we make a call to request.get_next_tokenhttps://github.com/deepjavalibrary/djl-serving/blob/1e883d5bae81a20189847cc4b2aff93ac38b8a38/engines/python/setup/djl_python/request.py#L140. For requests that did not generate a new token via step, we never call set_new_token, which is where the sequences property of TextGenerationOutput gets initialized.
The simples fix is to ensure that the sequences property of TextGenerationOutput always gets created for the default index so that we never try to access an array index that doesn't exist.
Fixes:
inf2 unit tests
Note that for some reason the generated text for one of the inputs has changed. the outputs seem fine, but i cannot figure out why this changed
… in an output
Description
A previous change caused issues with a corner case. The issue was introduced by https://github.com/deepjavalibrary/djl-serving/commit/06eaa98518bb951ae1d7cad1e8dc6787082d382d#diff-4de063c68933067104c13e58e380b04bb922ca889f070e40bfa743ab33556426.
When we call
LLMEngine.step()
(through vllm or lmi-dist), it's not guaranteed that we receive and output for every active request. If on the first call to step for a request we do not generate a token, we never initialize the sequence dict that is part of TextGenerationOutput. Later in the flow we make a call torequest.get_next_token
https://github.com/deepjavalibrary/djl-serving/blob/1e883d5bae81a20189847cc4b2aff93ac38b8a38/engines/python/setup/djl_python/request.py#L140. For requests that did not generate a new token via step, we never callset_new_token
, which is where the sequences property of TextGenerationOutput gets initialized.The simples fix is to ensure that the sequences property of TextGenerationOutput always gets created for the default index so that we never try to access an array index that doesn't exist.
Fixes:
Note that for some reason the generated text for one of the inputs has changed. the outputs seem fine, but i cannot figure out why this changed