vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: Under the beam search setting, the output is abnormal #3498

Open efsotr opened 8 months ago

efsotr commented 8 months ago

🐛 Describe the bug

For example, consider the beam search outputs shown in https://github.com/vllm-project/vllm/pull/646:

 * Deep learning is a subfield of artificial intelligence that involves the use of artificial neural networks to model and solve problems that require high-level processing, such as natural language processing, image and speech recognition, and decision making.
 * Deep learning is a subset of artificial intelligence that involves the use of artificial neural networks to model and solve complex problems. It is inspired by the structure and function of the human brain and is used for a variety of applications such as natural language processing, image and speech recognition, and decision making.
 * Deep learning is a subfield of artificial intelligence that involves the use of artificial neural networks to model and solve complex problems. Deep learning is inspired by the structure and function of the human brain and is used for a variety of applications including natural language processing, image and speech recognition, and decision making.
 * Deep learning is a subset of artificial intelligence that involves the use of artificial neural networks to model and solve complex problems. It is inspired by the structure and function of the human brain and is used for a variety of applications such as natural language processing, image and speech recognition, and autonomous decision making.
 * Deep learning is a subfield of artificial intelligence that involves the use of artificial neural networks to model and solve problems that require high-level processing, such as natural language processing, image and speech recognition, and decision making. Deep learning is inspired by the structure and function of the human brain and is used to model and solve complex problems.

Notice that the best (first) output is shorter than the others. This is abnormal for beam search.

The specific code responsible for this abnormality can be found in line 265 of the following file: https://github.com/vllm-project/vllm/blob/7341c77d693edcecf0a9f5a6e399c5137177dfba/vllm/sequence.py#L254-L271

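To fix this, `seq_len = self.get_len()` should be changed to `seq_len = self.get_output_len()`, so that the length penalty is applied to the number of generated tokens only rather than to the prompt length plus the generated tokens.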
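To illustrate why the choice of `seq_len` matters, here is a minimal standalone sketch. It is not vLLM's code: `beam_search_score` and `best_beam` are hypothetical helpers, the score is assumed to be the cumulative log-probability divided by `seq_len ** length_penalty`, and the prompt length and log-probabilities are made-up numbers chosen only for illustration.

```python
def beam_search_score(cumulative_logprob, seq_len, length_penalty=1.0):
    """Assumed scoring rule: cumulative log-probability / seq_len ** length_penalty."""
    return cumulative_logprob / (seq_len ** length_penalty)


def best_beam(beams, prompt_len, count_prompt):
    """Return the name of the highest-scoring beam.

    beams maps a name to (cumulative_logprob, output_len).
    count_prompt=True mimics normalizing by prompt + output length;
    count_prompt=False mimics normalizing by output length only.
    """
    def score(item):
        _, (logprob, output_len) = item
        seq_len = output_len + (prompt_len if count_prompt else 0)
        return beam_search_score(logprob, seq_len)

    return max(beams.items(), key=score)[0]


# Made-up numbers, for illustration only.
PROMPT_LEN = 100
BEAMS = {
    "short": (-20.0, 40),  # 40 generated tokens
    "long": (-27.0, 60),   # 60 generated tokens
}

print(best_beam(BEAMS, PROMPT_LEN, count_prompt=True))   # short
print(best_beam(BEAMS, PROMPT_LEN, count_prompt=False))  # long
```

With these numbers, counting the prompt dilutes the normalization (-20/140 vs. -27/160), so the shorter beam ranks first. Normalizing by output length alone (-20/40 vs. -27/60) ranks the longer beam first.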

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!