I ran inference with wizardcoder-python-7B using the same instruction in vLLM and in transformers, and the vLLM output differs from the transformers output. Instruction: """data = [('Name', 'Age', 'Income'), ('Alice', 53, 2189140), ('Bob', 31, 5127020), ('Charlie', 61, 1335873), ('David', 50, 6634484), ('Eve', 38, 1180368), ('Frank', 62, 2370117), ('George', 19, 2973948), ('Hannah', 35, 574736), ('Isaac', 19, 9064267), ('Judy', 21, 7362180)] 根据数据，将所有人根据年龄均匀分为三组""" (the Chinese sentence in the prompt means: "Based on the data, divide everyone evenly into three groups by age.")

vllm inference code

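from vllm import LLM, SamplingParams
# `llm` (an LLM instance) and `problem_instruction` are defined elsewhere (not shown in the post)
sampling_params = SamplingParams(temperature=0.1, top_p=0.4, max_tokens=1024, stop=['</s>'])
completions = llm.generate(problem_instruction, sampling_params)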

and the model outputs:

data = [('Name', 'Age', 'Income'),
        ('Alice', 53, 2189140),
        ('Bob', 31, 5127020),
        ('Charlie', 61, 1335873),
        ('David', 50, 6634484),
        ('Eve', 38, 1180368),
        ('Frank', 62, 2370117),
        ('George', 19, 2973948),
        ('Hannah', 35, 574736),
        ('Isaac', 19, 9064267),
        ('Judy', 21, 7362180)]

# Sort the data by age
data_sorted = sorted(data, key=lambda x: x[1])

# Divide the data into three groups
groups = []
for i in range(3):
    group = []
    for j in range(len(data_sorted)):
        if j % 3 == i:
            group.append(data_sorted[j])
    groups.append(group)

# Print the groups
for i, group in enumerate(groups):
    print(f"Group {i+1}: {group}")

This will output:

Group 1: [('George', 19, 2973948), ('Isaac', 19, 9064267), ('Alice', 53, 2189140)]
Group 2: [('Eve', 38, 1180368), ('Bob', 31, 5127020), ('Charlie', 61, 1335873)]
Group 3: [('David', 50, 6634484), ('Frank', 62, 2370117), ('Hannah', 35, 574736), ('Judy', 21, 7362180)]
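
The "This will output" lines above are part of the model's generated text, not the result of running the code. As a sanity check, here is a minimal sketch that executes the same round-robin split. The helper name `round_robin_split` is hypothetical, and the header row is skipped because sorting it together with the integer ages raises a `TypeError` in Python 3.

```python
data = [('Name', 'Age', 'Income'),
        ('Alice', 53, 2189140),
        ('Bob', 31, 5127020),
        ('Charlie', 61, 1335873),
        ('David', 50, 6634484),
        ('Eve', 38, 1180368),
        ('Frank', 62, 2370117),
        ('George', 19, 2973948),
        ('Hannah', 35, 574736),
        ('Isaac', 19, 9064267),
        ('Judy', 21, 7362180)]


def round_robin_split(rows, n_groups=3):
    """Sort rows by age and deal them out round-robin (same logic as j % 3 == i)."""
    rows_sorted = sorted(rows, key=lambda row: row[1])
    return [rows_sorted[i::n_groups] for i in range(n_groups)]


# Skip the header row: comparing 'Age' (str) with integer ages raises TypeError.
groups = round_robin_split(data[1:])
for i, group in enumerate(groups, start=1):
    print(f"Group {i}: {[row[0] for row in group]}")
# Group 1: ['George', 'Bob', 'David', 'Frank']
# Group 2: ['Isaac', 'Hannah', 'Alice']
# Group 3: ['Judy', 'Eve', 'Charlie']
```

Run this way, the groups contain 4, 3, and 3 people, which does not match the grouping printed in the generated text.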

transformers inference code

from transformers import AutoModelForCausalLM, AutoTokenizer

# `model`, `tokenizer`, and `prompt` are created elsewhere (not shown in the post)
inputs = tokenizer([prompt], return_tensors="pt", max_length=1024, truncation=True, padding=True)
completions = model.generate(
    inputs['input_ids'].to(model.device),
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,
    temperature=0.1,
    top_k=1,
    top_p=0.4,
    num_beams=1,
    do_sample=False,  # greedy decoding: temperature, top_k, and top_p are ignored
    output_scores=True,
)

and the model outputs:

To split the data into three groups based on age, we can use the `numpy` library to calculate the mean age and then use that to determine which group each person belongs to. Here's the code to do that:

```python
import numpy as np

# Calculate the mean age
mean_age = np.mean([row[1] for row in data[1:]])

# Determine which group each person belongs to
groups = []
for row in data[1:]:
    age = row[1]
    if age <= mean_age:
        group = 1
    elif age <= 2 * mean_age:
        group = 2
    else:
        group = 3
    groups.append(group)

# Print the groups
print("Group 1:", [row[0] for row in data[1:] if groups[data[1:].index(row)] == 1])
print("Group 2:", [row[0] for row in data[1:] if groups[data[1:].index(row)] == 2])
print("Group 3:", [row[0] for row in data[1:] if groups[data[1:].index(row)] == 3])
```

This will output:

Group 1: ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'George', 'Hannah', 'Isaac']
Group 2: ['Judy']
Group 3: []
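
One difference visible in the two calls above is the decoding setup. The transformers call sets `do_sample=False`, so it decodes greedily and ignores `temperature`, `top_k`, and `top_p`, while the vLLM call samples with `temperature=0.1` and `top_p=0.4` and allows 1024 new tokens instead of 512. The variable names also differ (`problem_instruction` vs `prompt`), so it is worth confirming that both backends receive exactly the same string. Below is a minimal sketch of matching settings, assuming greedy decoding is the intended baseline. The helper name `aligned_decoding_kwargs` is hypothetical; it only builds keyword arguments, so it runs without either library installed.

```python
def aligned_decoding_kwargs(max_new_tokens=512):
    """Return (vllm_kwargs, hf_kwargs) that both request greedy decoding."""
    # vLLM: temperature=0 selects greedy sampling.
    vllm_kwargs = {
        "temperature": 0.0,
        "max_tokens": max_new_tokens,
        "stop": ["</s>"],
    }
    # transformers: do_sample=False with num_beams=1 is greedy search.
    hf_kwargs = {
        "do_sample": False,
        "num_beams": 1,
        "max_new_tokens": max_new_tokens,
    }
    return vllm_kwargs, hf_kwargs


vllm_kwargs, hf_kwargs = aligned_decoding_kwargs()
print(vllm_kwargs)
print(hf_kwargs)
```

These would be passed as `SamplingParams(**vllm_kwargs)` and `model.generate(input_ids, eos_token_id=tokenizer.eos_token_id, **hf_kwargs)`, with the same prompt string given to both. Even then, small differences may remain because of different kernels and numeric precision, so matching settings should narrow the gap rather than guarantee identical text.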

nlpxucan / WizardLM

Does anybody find outputs difference between vllm inference and AutoClass inference for WizardCoder? #222
