Inconsistent Text Generation Results in Batch vs Individual Sentence Processing

haramjo commented 10 months ago

Environment:

vLLM Version: v0.2.7
HF Version: 4.37.0
Model Used: teknium/OpenHermes-2.5-Mistral-7B
Python Version: 3.10.13
Operating System: Linux-5.10.201-191.748.amzn2.x86_64-x86_64-with-glibc2.26
GPU: NVIDIA A100-SXM4-40GB
Driver Version: 535.129.03
CUDA Version: 12.2

Issue Description: When generating text with vLLM and the OpenHermes-2.5-Mistral-7B model, I get different outputs for the same prompts depending on whether they are submitted together in one batch or one at a time, even with temperature set to 0.

Steps to Reproduce:

from vllm import LLM, SamplingParams

# Load the model once; both the individual and the batched runs share this engine.
engine = LLM(model="teknium/OpenHermes-2.5-Mistral-7B")

prompts = [
    "Towards the end of the video, the focus shifts to the character in the blue and silver suit. This character is shown in a close-up shot, with a serious expression on their face. The close-up shot emphasizes the intensity of the moment and the character's determination.",
    "Overall, the video captures a dynamic and action-packed scene from a superhero-themed production, with characters in superhero costumes engaged in a battle against a backdrop of an urban environment.",
]

# a: each prompt in its own batch; b: both prompts in a single batch.
a = [(0,), (1,)]
b = [(0, 1)]

# Greedy decoding (temperature 0), so sampling randomness is not a factor.
sampling_params = SamplingParams(
    max_tokens=200,
    use_beam_search=False,
    temperature=0.0,
)


def print_batch(prefix, batch_idx):
    """Generate completions for the selected prompts and print each one."""
    prompt_subset = [prompts[i] for i in batch_idx]
    result = engine.generate(prompt_subset, sampling_params, use_tqdm=False)
    for p, r in zip(prompt_subset, result):
        print(prefix, "\nprompt: ", p, "\nresult: ", r.outputs[0].text)


for batch_idx in a:
    print_batch("a: ", batch_idx)
print("====================================")
for batch_idx in b:
    print_batch("b: ", batch_idx)

Expected Behavior: The generated text should be identical whether the prompts are processed individually or in a batch.

Actual Output:

a:  
prompt:  Towards the end of the video, the focus shifts to the character in the blue and silver suit. This character is shown in a close-up shot, with a serious expression on their face. The close-up shot emphasizes the intensity of the moment and the character's determination. 
result:   The music also changes to a more dramatic and intense tone, further enhancing the emotional impact of the scene.

The character in the blue and silver suit is likely meant to represent a hero or a protagonist. The serious expression on their face and the dramatic music suggest that they are facing a difficult challenge or obstacle. The close-up shot also emphasizes their importance and the significance of their actions.

Overall, this scene is designed to create a sense of tension and excitement. The use of close-up shots, dramatic music, and a serious expression on the character's face all work together to create a powerful emotional impact on the viewer.
a:  
prompt:  Overall, the video captures a dynamic and action-packed scene from a superhero-themed production, with characters in superhero costumes engaged in a battle against a backdrop of an urban environment. 
result:   The video is well-edited and features a variety of camera angles and shots, which helps to create a sense of excitement and tension.

The video also features a number of special effects, including explosions, smoke, and flashes of light, which add to the overall impact of the scene. The use of music and sound effects also helps to enhance the mood and atmosphere of the video.

One of the standout aspects of the video is the choreography of the fight scenes, which is well-executed and engaging to watch. The actors involved in the video appear to be skilled in martial arts and stunt work, which adds to the realism and excitement of the scene.

Overall, the video is a great example of how to create an action-packed and visually impressive superhero-themed production. The combination of well-executed fight scenes, special effects, and strong editing makes for an engaging and exciting viewing experience.
====================================
b:  
prompt:  Towards the end of the video, the focus shifts to the character in the blue and silver suit. This character is shown in a close-up shot, with a serious expression on their face. The close-up shot emphasizes the intensity of the moment and the character's determination. 
result:   The music also changes to a more dramatic and intense tone, further enhancing the emotional impact of the scene.

The character in the blue and silver suit is likely meant to represent a hero or a protagonist. The serious expression on their face and the dramatic music suggest that they are facing a difficult challenge or obstacle. The close-up shot also emphasizes their importance and the significance of their actions.

Overall, this scene is designed to create a sense of tension and anticipation. The use of close-up shots, dramatic music, and a serious expression on the character's face all work together to create a powerful emotional impact on the viewer.
b:  
prompt:  Overall, the video captures a dynamic and action-packed scene from a superhero-themed production, with characters in superhero costumes engaged in a battle against a backdrop of an urban environment. 
result:   The video is well-edited and features a variety of camera angles and shots, giving viewers a sense of the excitement and energy of the scene.

The video also features a catchy and upbeat music track that adds to the overall atmosphere of the scene. The music is well-suited to the action-packed nature of the video and helps to create a sense of urgency and tension.

In terms of production quality, the video is very impressive. The special effects, such as the explosions and the flying characters, are well-executed and add to the overall impact of the scene. The costumes and makeup are also very well-done, with the characters looking like they have stepped straight out of a comic book.

Overall, the video is a great example of how to create an engaging and exciting superhero-themed production. The action-packed scene, combined with the music and special effects, make for a thrilling and visually
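
To pin down where the individual and batched outputs start to differ, a small helper along these lines may be useful. This is only a sketch: `first_divergence` is a hypothetical name, not part of vLLM, and it compares plain strings rather than token IDs. The two sample strings are excerpts from the first prompt's outputs above.

```python
def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if the strings are equal."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # No mismatch in the shared range: either identical, or one is a prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))


# Excerpts from the "a" (individual) and "b" (batched) results for the first prompt.
individual = "designed to create a sense of tension and excitement."
batched = "designed to create a sense of tension and anticipation."

idx = first_divergence(individual, batched)
print("shared prefix:", repr(individual[:idx]))
print("individual continues with:", repr(individual[idx:]))
print("batched continues with:", repr(batched[idx:]))
```

In both prompts the outputs agree for a long prefix and then split at a single word, after which the continuations drift apart.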

anxietymonger commented 10 months ago

I encountered a similar problem with the Qwen 72B model.

laiqinghan commented 8 months ago

Did you manage to solve this problem?

davidfrankenberg commented 7 months ago

Same issue with Gemma 7B.

simon376 commented 2 months ago

This is probably due to numerical instability; see the docs.
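
If numerical instability is indeed the cause, the general mechanism can be illustrated without vLLM: floating-point addition is not associative, so summing the same numbers in a different order can give a slightly different result, and a greedy pick between near-tied scores can then flip. The snippet below is a plain-Python sketch of that idea only; the `argmax` helper and the score lists are made up for illustration and do not reflect how vLLM's kernels actually compute logits.

```python
# The same three numbers, added with two different groupings.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False: the results differ in the last bit


def argmax(scores):
    """Index of the largest score; ties go to the lowest index."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical scores for two candidate tokens. Token 1's score is the
# "same" sum computed with the two groupings above.
pick_left = argmax([0.6, left])
pick_right = argmax([0.6, right])
print(pick_left, pick_right)  # 1 0: the greedy choice flips
```

Once a single token differs, every later token is conditioned on a different prefix, which would be consistent with outputs that match at first and then diverge.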

vllm-project / vllm

Inconsistent Text Generation Results in Batch vs Individual Sentence Processing #2568