Closed: binzhang01 closed this issue 2 months ago
@binzhang01 Can you share the full log with TM_LOG_LEVEL=INFO?
I tried this in Python: os.environ['TM_LOG_LEVEL'] = 'INFO', and in bash: export TM_LOG_LEVEL=INFO, but got the same output:
2024-08-27 01:34:56,928 - lmdeploy - ERROR - Truncate max_new_tokens to 128
2024-08-27 01:34:56,929 - lmdeploy - ERROR - run out of tokens. session_id=0
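One thing worth checking (an assumption, not confirmed in this thread): TM_LOG_LEVEL is read when the TurboMind engine starts, so an os.environ assignment only takes effect if it runs before the pipeline is created. A minimal sketch:

```python
import os

# Set the log level before lmdeploy initializes TurboMind; assigning it
# after the pipeline has been created has no effect on engine logging.
os.environ['TM_LOG_LEVEL'] = 'INFO'

from lmdeploy import pipeline

pipe = pipeline('llava-hf/llama3-llava-next-8b-hf')
```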
@RunningLeon may help answer the question
@binzhang01 hi, please increase session_len in the backend config. You can refer to https://lmdeploy.readthedocs.io/en/latest/multi_modal/vl_pipeline.html#set-sampling-parameters
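For concreteness, a sketch of what the linked docs describe, passing a larger session_len through TurbomindEngineConfig; the image path is a placeholder, and 4096 is the value that resolved this issue (see below):

```python
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

# Raise session_len so prompt + image tokens + max_new_tokens fit in
# one session.
pipe = pipeline('llava-hf/llama3-llava-next-8b-hf',
                backend_config=TurbomindEngineConfig(session_len=4096))

image = load_image('path/to/your/image.jpg')  # placeholder path
response = pipe(('describe this image', image))
print(response.text)
```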
Thanks! Problem solved. Setting session_len=4096 works.
Checklist
Describe the bug
First, I tried llava-hf/llama3-llava-next-8b-hf. With a short prompt + a 500×333 image, the result is correct. With a short prompt + a 640×640 image, the bug appears:
2024-08-26 23:37:30,352 - lmdeploy - ERROR - Truncate max_new_tokens to 128
2024-08-26 23:37:30,353 - lmdeploy - ERROR - run out of tokens. session_id=0
and the result is None. transformers can load llava-hf/llama3-llava-next-8b-hf and produce the correct result, and lmdeploy + lmms-lab/llama3-llava-next-8b also produces the correct result. The code is:
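(The snippet itself did not survive in the archived thread; the following is a minimal sketch of a typical lmdeploy VL pipeline call under the default backend config, not the reporter's exact code.)

```python
# Sketch of the failure mode: with the default TurboMind session_len, the
# larger image produces more vision tokens, so prompt + image tokens
# exhaust the session and generation is cut off ("run out of tokens").
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('llava-hf/llama3-llava-next-8b-hf')  # default backend config

image = load_image('path/to/640x640.jpg')  # placeholder path
response = pipe(('describe this image', image))
print(response.text)  # reported as None, with the errors above logged
```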
Reproduction
None
Environment
Error traceback