vectorch-ai / ScaleLLM

A high-performance inference system for large language models, designed for production environments.
https://docs.vectorch.com/
Apache License 2.0
316 stars 23 forks

[Correctness] With llama-2-7b-hf, ScaleLLM's output differs from vLLM's output. #220

Closed. liutongxuan closed this issue 1 month ago.

liutongxuan commented 1 month ago

With llama-2-7b-hf, ScaleLLM's output differs slightly from vLLM's output for the same prompt.

Configuration: temperature=0

Input prompt: I want you to act as an economist. Please answer the following question with no more than 50 words. Question: For a car, what scams can be plotted with 0% financing vs rebate?

ScaleLLM's output: \nI want you to act as an economist. Please answer the following question with no more than 50 words.\nFor a car, what scams can be plotted with 0% financing vs rebate?\nThe following question is for a car, what scams can be plotted with 0% financing vs rebate?\nThe following question is for a car, what scams can be plotted with 0% financing vs rebate?\n

vLLM's output: \nI want you to act as an economist. Please answer the following question with no more than 50 words.\nQuestion: For a car, what scams can be plotted with 0% financing vs rebate?\nhttps://essaysprompt.com/wp-content/uploads/2020/10/19-2.png 0 0 https://essaysprompt.com/wp-content/upload
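To narrow down where the two engines part ways, it can help to find the first position at which the outputs differ. The sketch below is only an illustration: `first_divergence` is a hypothetical helper, not part of ScaleLLM or vLLM, and the two strings are shortened excerpts of the outputs quoted above, assuming the `\n` sequences are real newlines. It compares characters, not tokens; a token-level comparison would need the model's tokenizer.

```python
def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if a == b."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # One string is a strict prefix of the other: they diverge where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return -1


# Excerpts of the two outputs from this report (shortened for illustration).
scalellm_out = (
    "\nI want you to act as an economist. Please answer the following "
    "question with no more than 50 words.\n"
    "For a car, what scams can be plotted with 0% financing vs rebate?\n"
)
vllm_out = (
    "\nI want you to act as an economist. Please answer the following "
    "question with no more than 50 words.\n"
    "Question: For a car, what scams can be plotted with 0% financing vs rebate?\n"
)

idx = first_divergence(scalellm_out, vllm_out)
print("common prefix:", repr(scalellm_out[:idx]))
print("ScaleLLM continues with:", repr(scalellm_out[idx:idx + 20]))
print("vLLM continues with:", repr(vllm_out[idx:idx + 20]))
```

On these excerpts the outputs agree through the first echoed sentence and diverge immediately after it, where vLLM emits "Question:" and ScaleLLM does not. With temperature=0, the decoding step at that position is a reasonable place to start comparing the two engines.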