`generate_text_func` currently does not correctly return `finish_reason=TOKEN_LIMIT` when generation reaches the model's token limit:
`TOKEN_LIMIT` refers to the maximum number of tokens defined by the model, whereas `MAX_TOKENS` refers to the maximum number defined by the user. So generation can reach `TOKEN_LIMIT` before it reaches `MAX_TOKENS`.
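
To make the distinction concrete, here is a minimal sketch of how the two stop conditions could be told apart. It is not the caikit-nlp implementation: the `FinishReason` enum, the `pick_finish_reason` helper, and all parameter names are hypothetical. It also assumes that the model limit counts input plus generated tokens, and that the model limit takes precedence when both limits are hit on the same step.

```python
from enum import Enum
from typing import Optional


class FinishReason(Enum):
    # Hypothetical enum for illustration; the names mirror the ones in this issue.
    EOS_TOKEN = "EOS_TOKEN"
    MAX_TOKENS = "MAX_TOKENS"
    TOKEN_LIMIT = "TOKEN_LIMIT"


def pick_finish_reason(
    input_tokens: int,
    generated_tokens: int,
    max_new_tokens: int,
    model_max_length: int,
    hit_eos: bool,
) -> Optional[FinishReason]:
    """Return why generation stopped, or None if no stop condition is met."""
    if hit_eos:
        return FinishReason.EOS_TOKEN
    # Assumption: the model limit covers input and generated tokens together,
    # and it is checked before the user-defined limit.
    if input_tokens + generated_tokens >= model_max_length:
        return FinishReason.TOKEN_LIMIT
    if generated_tokens >= max_new_tokens:
        return FinishReason.MAX_TOKENS
    return None


# A 500-token prompt against a 512-token model: the model limit is hit after
# 12 new tokens, long before the user's limit of 100 new tokens.
reason = pick_finish_reason(
    input_tokens=500,
    generated_tokens=12,
    max_new_tokens=100,
    model_max_length=512,
    hit_eos=False,
)
print(reason)
```

In this example a long prompt leaves room for only 12 new tokens, so the expected finish reason is `TOKEN_LIMIT` rather than `MAX_TOKENS`, which is the case this issue describes.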
_Originally posted by @gkumbhat in https://github.com/caikit/caikit-nlp/pull/210#discussion_r1374803238_