vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Using vLLM to accelerate Qwen models (mainly qwen7B/qwen14B): two issues found during testing. #2957

Open pftzzg opened 9 months ago

pftzzg commented 9 months ago

I use vLLM to accelerate Qwen models, mainly qwen7B/qwen14B. I found two issues during testing.

1) In single-turn question-answering tests, the answers are less accurate with vLLM acceleration than without it.

2) In streaming-output tests, the answers are also less accurate with vLLM acceleration than without it.

The vLLM version is 0.3.0.

xwzj01ht commented 9 months ago

Same here.

qingjiaozyn commented 9 months ago

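I see this problem with version 0.3.2 of vLLM. My original prompt (translated from Chinese) is: Forget the knowledge you already have and answer using only the question-answer pairs inside the tags.\n\n\nquestion: If I need to delay my return by one day during a business trip, how do I claim reimbursement?\nanswer: A delayed return makes it impossible to buy the return flight through 商旅100, so the claimant must buy the ticket personally, keep the electronic itinerary for the air ticket, claim reimbursement after the trip ends, and resubmit a special-circumstances approval form explaining the delayed return and the self-purchased ticket\n\nSource: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Special Travel Approval\nResponsible department: Cost Office\n\nquestion: If I need to delay my return by one day during a business trip, how do I claim reimbursement\nanswer: A delayed return makes it impossible to buy the return flight through 商旅100, so the claimant must buy the ticket personally, keep the electronic itinerary for the air ticket, claim reimbursement after the trip ends, and resubmit a special-circumstances approval form explaining the delayed return and the self-purchased ticket\n\nSource: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Special Travel Approval\nResponsible department: Cost Office\n\nquestion: travel & invoice\nanswer: The travel expense invoice cannot be found in the invoice detail information, how should it be filled in\The invoice detail field does not allow selecting an invoice\nFirst look up the invoice in the invoice pool of the reimbursement system; the invoice type in the pool must be selected correctly and the check code entered correctly (the last six digits of the long number in the remarks field); if the invoice is found, add it directly to the invoice detail information; if it is not found, add it manually.\n\nSource: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Invoice Reimbursement Issues\nResponsible department: Cost Office\n\nquestion: For travel expenses where the ticket was not bought through 商旅100, why does no tax rate appear when filling in the itinerary ticket?\nanswer: Enter the actual claimed amount when filling in the form, and the corresponding tax amount will be identified automatically.\n\nSource: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Travel Settlement Form Issues\nResponsible department: Cost Office\n\nquestion: The ticket was not bought through 商旅100, why does no tax rate appear when filling in the itinerary ticket\nanswer: Enter the actual claimed amount when filling in the form, and the corresponding tax amount will be identified automatically.\n\nSource: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Travel Settlement Form Issues\nResponsible department: Cost Office\n\n \n\n\nThinking process:\n1. Answer the question directly using the information in the knowledge base; there is no need to provide information from other sources.\n2. Determine whether the question is related to the content inside the tags.\n3. If it is unrelated, refuse to answer this question.\n4. Determine whether there is a similar or identical question.\n5. If there is an identical question, output the corresponding answer directly.\n6. If there are only similar questions, output the similar questions together with their answers.\n7. If the question is a short keyword such as "cloud drive", "all-in-one card", or "top-up", provide all knowledge base information related to that keyword.\n8. Your name is AI Assistant, and you were trained on a large-scale corpus.\n\nFinally, avoid mentioning that your knowledge comes from the QA; just reply with the answer.\n\nQuestion: The ticket was not bought through 商旅100, why does no tax rate appear when filling in the itinerary ticket

The result of the original qwen14B is: Enter the actual claimed amount when filling in the form, and the corresponding tax amount will be identified automatically.

Source: General - Accounting and Reimbursement Services - Cost - Travel Expenses - Travel Settlement Form Issues Responsible department: Cost Office

The result of vLLM's qwen14B is: Enter the actual claimed amount when filling in the form, and the corresponding tax amount will be identified automatically.

I hope the authors will work on making Qwen's answers under vLLM consistent with the original model.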
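To pin down where two answers like the ones in this comment start to differ, a small helper can report the first diverging character. This is a minimal sketch: `first_divergence` is a hypothetical helper name, not part of vLLM or transformers, and the sample strings are shortened placeholders rather than the real model outputs.

```python
def first_divergence(a: str, b: str) -> int:
    """Return the index of the first differing character, or -1 if a == b."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i
    # No mismatch in the overlap: either equal, or one is a prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))


# Placeholder answers: the second one stops before the "Source" lines.
hf_answer = "Enter the actual amount.\n\nSource: travel settlement form"
vllm_answer = "Enter the actual amount."
idx = first_divergence(hf_answer, vllm_answer)
print(idx, idx == len(vllm_answer))
```

When the returned index equals the length of the shorter string, the shorter answer is a prefix of the longer one, which suggests that generation stopped earlier rather than took a different path.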

ArlanCooper commented 9 months ago

Same here. The results from vLLM are more random than the original results from transformers.

shiqingzhangCSU commented 9 months ago

Check your SamplingParams.

ArlanCooper commented 9 months ago

I use SamplingParams(temperature=0.1, max_tokens=300, top_p=0.8). The answer tends to contain repetitive text.

shiqingzhangCSU commented 9 months ago

You can use repetition_penalty to prevent repetitive text. Also, your SamplingParams (temperature=0.1, top_p=0.8) mean your results will have some randomness.
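As a rough illustration of what a repetition penalty does, the sketch below applies the commonly used rule (divide positive logits and multiply negative logits for tokens that were already generated) to a toy logit list. It is a pure-Python illustration under that assumption, not vLLM's actual implementation; `apply_repetition_penalty` is a hypothetical name and the logits are made up.

```python
from typing import List, Sequence


def apply_repetition_penalty(
    logits: Sequence[float], seen_token_ids: Sequence[int], penalty: float
) -> List[float]:
    """Down-weight tokens that already appeared; penalty > 1.0 discourages repeats."""
    out = list(logits)
    for tok in set(seen_token_ids):
        # Positive logits shrink, negative logits become more negative.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


logits = [2.0, 1.5, -1.0]  # toy scores for a 3-token vocabulary
penalized = apply_repetition_penalty(logits, seen_token_ids=[0, 2], penalty=2.0)
print(penalized)  # [1.0, 1.5, -2.0]
```

In this toy case token 0 was the top choice before the penalty and token 1 is the top choice afterwards. A penalty of 1.0 leaves the logits unchanged.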

ArlanCooper commented 9 months ago

Thanks, I will give it a try.

qingjiaozyn commented 9 months ago

The parameters I use are the same as Qwen's: SamplingParams(temperature=0.01, max_tokens=2048, stop=["<|im_end|>", "<|endoftext|>"])
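A temperature of 0.01 is close to greedy decoding but is still sampling. The toy calculation below uses made-up logits and a standard softmax with temperature to show that when two candidate tokens have nearly equal logits, the runner-up can keep a substantial probability, so small numerical differences between two backends could change which token is picked. The numbers are illustrative only and were not measured from Qwen.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to sampling probabilities at the given temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Clear winner: temperature=0.01 is effectively greedy.
clear = softmax_with_temperature([5.0, 4.0], temperature=0.01)
# Near tie: the runner-up still gets a large share of the probability.
near_tie = softmax_with_temperature([1.000, 0.999], temperature=0.01)
print(clear, near_tie)
```

With a logit gap of 0.001 the top token gets only about 52% of the probability mass at this temperature, so repeated runs can legitimately return different tokens at that position.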

ArlanCooper commented 9 months ago

Yes, I have set stop=["<|im_end|>", "<|endoftext|>"], but the answers are still not the same as the Hugging Face results, and they are not as good as HF's. Do you know the reason?

shiqingzhangCSU commented 9 months ago

If you want to compare the differences between the two sides, it is best to decode both with greedy search.
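One way to set up such a comparison is sketched below. It assumes that `temperature=0` selects greedy decoding in vLLM's `SamplingParams` and that `do_sample=False` does the same in transformers' `generate`. The helper names `generate_with_vllm` and `generate_with_hf` are hypothetical, the functions are untested against Qwen, and device placement and chat-template formatting are left out; the same fully formatted prompt string is assumed to be passed to both sides.

```python
# Greedy settings for each side; the stop strings and limits mirror this thread.
VLLM_GREEDY = dict(
    temperature=0.0,  # assumption: vLLM treats temperature 0 as greedy decoding
    max_tokens=2048,
    stop=["<|im_end|>", "<|endoftext|>"],
)
HF_GREEDY = dict(
    do_sample=False,  # transformers: disable sampling and take the argmax token
    max_new_tokens=2048,
)


def generate_with_vllm(model_path: str, prompt: str) -> str:
    from vllm import LLM, SamplingParams  # imported lazily; needs vLLM installed

    llm = LLM(model=model_path, trust_remote_code=True)
    outputs = llm.generate([prompt], SamplingParams(**VLLM_GREEDY))
    return outputs[0].outputs[0].text


def generate_with_hf(model_path: str, prompt: str) -> str:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, **HF_GREEDY)
    # Drop the prompt tokens so only the completion is returned.
    prompt_len = inputs["input_ids"].shape[1]
    return tok.decode(out[0][prompt_len:], skip_special_tokens=True)
```

With sampling disabled on both sides, any remaining difference in the outputs cannot be attributed to random sampling, which narrows the search to prompt formatting, stop conditions, or numerical differences.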

ArlanCooper commented 9 months ago

OK, thanks, I will give it a try.

qingjiaozyn commented 9 months ago

Have you compared the results of running Qwen with vLLM against the original, and have you seen any inconsistencies in the answers? What startup parameters do you use?

ArlanCooper commented 8 months ago

Is there a parameter that can be set for this? Or do I need to find the source code and modify it myself?

shiqingzhangCSU commented 8 months ago

See the code.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!