Closed Tushar-ml closed 3 weeks ago
Could you try export NCCL_LAUNCH_MODE=GROUP before running the inference example?
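When the example is launched from a Python script rather than a shell, the same setting could be applied in-process. This is only a sketch: it assumes NCCL reads NCCL_LAUNCH_MODE when it initializes, so the variable has to be set before lmdeploy (and with it NCCL) is loaded. For tritonserver the variable would instead need to be set in the server's own environment.

```python
import os

# Assumption: NCCL reads this variable at initialization, so it must be set
# before any library that loads NCCL (e.g. lmdeploy) is imported.
os.environ["NCCL_LAUNCH_MODE"] = "GROUP"

# The lmdeploy import and pipeline creation would follow here.
print(os.environ["NCCL_LAUNCH_MODE"])
```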
The result is the same; I am running on tritonserver. It happens when increasing TP: for example, llama3 8B gives consistent results up to TP=2, but at TP=4 it starts giving different responses.
Can you provide a reproducible demo?
I couldn't reproduce this issue. My test code is as follows:
import time
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
model_path = "/workspace/llama3.1/Meta-Llama-3.1-8B-Instruct-AWQ"
start = time.perf_counter()
backend_config = TurbomindEngineConfig(
    max_batch_size=1,
    cache_max_entry_count=0.5,
)
pipe = pipeline(model_path, backend_config=backend_config, log_level='ERROR')
end = time.perf_counter()
print(f'building pipeline cost: {end - start} s')
prompt = "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?\nPlease reason step by step, and put your final answer within \\boxed{}.\n"
gen_config = GenerationConfig(temperature=0.0)
for i in range(10):
    print('-'*50)
    response = pipe(prompt, gen_config=gen_config)
    print(response.text)
@lvhan028 It is happening for TP>1
I couldn't reproduce it.
I tested it in the openmmlab/lmdeploy docker image, in which NCCL_LAUNCH_MODE is GROUP.
@lvhan028 It depends on the prompt. With your prompt it works, but when I change it to "write an essay on open-source", it gives the response below:
The Open-Source Movement: A Revolution in Software Development
The open-source movement has been a game-changer in the software development industry, allowing developers to collaborate and create high-quality software without the need for a single proprietary license. Open-source software is free to use, modify, and distribute, and has become a vital part of many industries, including operating systems, web browsers, and even social media platforms.
The concept of open-source dates back to the 1980s, when Richard Stallman, a graduate student at MIT, coined the term "copyleft" to describe the practice of sharing and modifying software code. However, it wasn't until the late 1990s and early 2000s that open-source software started to gain mainstream popularity.
One of the key factors that contributed to the growth of open-source was the rise of the internet and the World Wide Web. With the widespread adoption of the web, developers could now easily share and collaborate on software projects, and the open-source model became a natural fit. Another important factor was the emergence of Linux, a free and open-source operating system that gained popularity in the late 1990s.
Linux, developed by Linus Torvalds and others, was initially created as a hobby project, but it quickly gained traction and became a viable alternative to proprietary operating systems like Windows and MacOS. Linux's open-source model allowed developers to contribute code, fix bugs, and improve the operating system, making it a highly reliable and efficient platform.
The open-source model has several benefits, including cost savings, increased collaboration, and faster development cycles. With open-source software, developers can use and modify existing code without the need for a license, which reduces costs and allows for faster development. Additionally, the open-source model encourages collaboration, as developers can work together to create high-quality software, and the community can benefit from the collective efforts.
Another significant advantage of open-source software is the ability to fix bugs and improve the code. With proprietary software, bugs and issues are often fixed by the original developers, and the code is not made available to the public. In contrast, open-source software allows developers to fix bugs and improve the code, making it a more reliable and efficient platform.
The open-source model has also led to the creation of many successful projects, including Apache, Firefox, and WordPress. These projects have become essential tools for many industries, including web development, and have enabled developers to create high-quality software without the need for a license.
The concept of open-source refers to the practice of making the source code of a program or software available to the public, usually free or at a low cost, and allowing users to modify it, distribute it, and use it as they see fit. This approach has been gaining popularity in recent years, particularly in the field of software development, and has led to the creation of many successful projects, such as Linux, Apache, and Firefox.
The idea of open-source is rooted in the concept of free and open-source software, which emerged in the 1980s. At that time, many software developers, including Richard Stallman, Linus Torvalds, and Eric Raymond, were working on projects that were not only free but also open-source, meaning that the source code was available to the public and could be modified and distributed by anyone.
The term "open-source" was first used in 1998 by Eric Raymond, who wrote an essay titled "The Open Source Definition" in which he defined open-source as "a philosophy and a way of making a difference in the world by creating free and open-source software." This essay was widely read and shared, and it helped to popularize the concept of open-source and its benefits.
One of the main advantages of open-source is that it allows developers to work together and collaborate on projects, which can lead to faster development and better quality of the software. This is because open-source software is often developed by a community of developers who contribute to the project, and the source code is available to the public, which means that anyone can use it, modify it, and distribute it.
Another advantage of open-source is that it allows companies to save money and time by using existing software, which is often developed by a community of developers, rather than having to develop it themselves. This is because open-source software is often available for free or at a low cost, and companies can use it without having to spend a lot of money and time developing it themselves.
In addition, open-source software is often more secure than proprietary software, which is developed by a single company or a small group of developers. This is because open-source software is often developed by a community of developers, which means that many eyes are watching the code, and it is more likely to be secure and reliable.
However, there are also some challenges and limitations of open-source software. One of the main challenges is that it can be difficult to find and fix bugs, which can be a problem because open-source software is often developed by a community of
This is with TP=4.
Any update on this, @lvhan028 @lzhangzz?
Can you share the reproducible code?
import time
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
model_path = "/workspace/llama3.1/Meta-Llama-3.1-8B-Instruct-AWQ"
start = time.perf_counter()
backend_config = TurbomindEngineConfig(
    max_batch_size=1,
    cache_max_entry_count=0.5,
    tp=4,
)
pipe = pipeline(model_path, backend_config=backend_config, log_level='ERROR')
end = time.perf_counter()
print(f'building pipeline cost: {end - start} s')
prompt = "write an article on open-source"
gen_config = GenerationConfig(temperature=0.0)
for i in range(10):
    print('-'*50)
    response = pipe(prompt, gen_config=gen_config)
    print(response.text)
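Comparing ten long essays by eye is error-prone, so a small helper along these lines could collect the texts and report how many distinct outputs appeared and where they first diverge. This is a minimal sketch, not part of lmdeploy: `summarize_runs` is a hypothetical name, and the sample strings below stand in for the `response.text` values gathered in the loop above.

```python
from collections import Counter


def summarize_runs(outputs):
    """Return (number of distinct outputs, earliest character index at which
    any run differs from the first run, or None if all runs are identical)."""
    distinct = len(Counter(outputs))
    reference = outputs[0]
    first_diff = None
    for text in outputs[1:]:
        if text == reference:
            continue
        # If one text is a prefix of the other, they diverge at the shorter length.
        limit = min(len(text), len(reference))
        pos = next((i for i in range(limit) if text[i] != reference[i]), limit)
        first_diff = pos if first_diff is None else min(first_diff, pos)
    return distinct, first_diff


# Stand-in data; in the script above, append response.text on each iteration.
outputs = [
    "The open-source movement",
    "The open-source movement",
    "The concept of open-source",
]
distinct, first_diff = summarize_runs(outputs)
print(f"distinct outputs: {distinct}, first divergence at char: {first_diff}")
```

With temperature=0.0 the expectation is a single distinct output; more than one suggests the non-determinism described in this issue.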
@lvhan028 I no longer see the issue after your PR #2090.
Thanks a lot
Describe the bug
As titled, llama3.1 converted to AWQ format gives different responses across runs with temperature 0.0 and top_p = 0 when running with tp=2, but gives correct and deterministic responses at tp=1.
With llama3 it works correctly, so this may be a RoPE issue.
Reproduction
lmdeploy lite auto_awq meta-llama/Meta-Llama-3.1-8B-Instruct --work-dir llama3_1_awq
Then I used the example shown here at tp=2: https://github.com/InternLM/lmdeploy/blob/main/docs/en/quantization/w4a16.md
Error traceback
No response