NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

Error in benchmarks/python/all_reduce.py #2386

wpybtw closed this issue 3 weeks ago

wpybtw commented 3 weeks ago

System Info

Intel(R) Xeon(R) Platinum 8458P, 8x H20 GPU
NVIDIA PyTorch Docker, Release 24.08 (build 107063150)

tensorrt 10.4.0
tensorrt-cu12 10.4.0
tensorrt-cu12-bindings 10.4.0
tensorrt-cu12-libs 10.4.0
tensorrt_llm 0.15.0.dev2024102200
typing_extensions 4.8.0

Who can help?

No response

Reproduction

Add a call to init_all_reduce_helper() at line 70 of benchmarks/python/all_reduce.py, then run:

mpirun -n 8 --allow-run-as-root python all_reduce.py

Expected behavior

Run allreduce

Actual behavior

[TensorRT-LLM] TensorRT-LLM version: 0.15.0.dev2024102200
world_size     , dtype     , message size   , strategy       , duration (ms)
[10/29/2024-06:51:44] [TRT] [E] IBuilder::buildSerializedNetwork: Error Code 3: API Usage Error (Parameter check failed, condition: !config.getFlag(BuilderFlag::kOBEY_PRECISION_CONSTRAINTS). )
[!] Invalid Engine. Please ensure the engine was built correctly
Traceback (most recent call last):
  File "/home/wpy/TensorRT-LLMnew/benchmarks/python/all_reduce.py", line 144, in <module>
    allreduce_benchmark(args.dtype, args.range, args.no_header)
  File "/home/wpy/TensorRT-LLMnew/benchmarks/python/all_reduce.py", line 105, in allreduce_benchmark
    session = tllm.runtime.Session.from_engine(build_engine())
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/backend/base/loader.py", line 40, in __call__
    return self.call_impl(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/backend/trt/loader.py", line 624, in call_impl
    return engine_from_bytes(super().call_impl, runtime=self._runtime)
  File "<string>", line 3, in engine_from_bytes
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/backend/base/loader.py", line 40, in __call__
    return self.call_impl(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/backend/trt/loader.py", line 653, in call_impl
    buffer, _ = util.invoke_if_callable(self._serialized_engine)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/util/util.py", line 678, in invoke_if_callable
    ret = func(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/util/util.py", line 710, in wrapped
    return func(*args, **kwargs)
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/backend/trt/loader.py", line 557, in call_impl
    G_LOGGER.critical("Invalid Engine. Please ensure the engine was built correctly")
  File "/home/wpy/miniconda3/envs/build/lib/python3.10/site-packages/polygraphy/logger/logger.py", line 605, in critical
    raise ExceptionType(message) from None
polygraphy.exception.exception.PolygraphyException: Invalid Engine. Please ensure the engine was built correctly
(The version line, error, and traceback above are printed once per rank, 8 times in total, with the ranks' output interleaved.)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[23201,1],4]
  Exit code:    1
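
When several MPI ranks write to the same terminal, their tracebacks interleave and the root TensorRT error is easy to miss. The following is a minimal sketch, not part of the benchmark, that counts the distinct `[TRT] [E]` lines in a captured log. The function name `summarize_trt_errors` and the regular expression are illustrative assumptions based only on the log format shown above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log above:
# [<timestamp>] [TRT] [E] <message>
TRT_ERROR = re.compile(r"^\[[^\]]+\] \[TRT\] \[E\] (?P<msg>.+)$")


def summarize_trt_errors(log_text: str) -> Counter:
    """Count each distinct TensorRT error message in a (possibly interleaved) log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TRT_ERROR.match(line.strip())
        if match:
            counts[match.group("msg").strip()] += 1
    return counts


if __name__ == "__main__":
    error_line = (
        "[10/29/2024-06:51:44] [TRT] [E] IBuilder::buildSerializedNetwork: "
        "Error Code 3: API Usage Error (Parameter check failed, condition: "
        "!config.getFlag(BuilderFlag::kOBEY_PRECISION_CONSTRAINTS). )"
    )
    sample_log = "\n".join([
        error_line,
        "[!] Invalid Engine. Please ensure the engine was built correctly",
        "Traceback (most recent call last):",
        error_line,
    ])
    # Print every distinct error message with the number of times it occurred.
    for message, count in summarize_trt_errors(sample_log).items():
        print(f"{count}x {message}")
```

Applied to the full output of this run, the summary should reduce to a single distinct message, the `kOBEY_PRECISION_CONSTRAINTS` parameter check, reported once per rank.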

Additional notes

None

VALLIS-NERIA commented 3 weeks ago

Hi @wpybtw, you can use the following patch as a temporary fix.

From 06bc52864061da8a9583fe2ddcba38c1e58ee8ca Mon Sep 17 00:00:00 2001
From: Xiwen Yu <xiweny@nvidia.com>
Date: Wed, 30 Oct 2024 09:17:06 +0000
Subject: [PATCH] fix all_reduce benchmark

---
 benchmarks/python/all_reduce.py | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/benchmarks/python/all_reduce.py b/benchmarks/python/all_reduce.py
index d91cdd0d4..92f332762 100644
--- a/benchmarks/python/all_reduce.py
+++ b/benchmarks/python/all_reduce.py
@@ -25,7 +25,8 @@ import tensorrt_llm as tllm
 from tensorrt_llm import Mapping, Tensor
 from tensorrt_llm._utils import OMPI_COMM_TYPE_HOST, mpi_comm
 from tensorrt_llm.functional import AllReduceStrategy, allreduce
-from tensorrt_llm.plugin.plugin import current_all_reduce_helper
+from tensorrt_llm.plugin.plugin import (current_all_reduce_helper,
+                                        init_all_reduce_helper)

 def allreduce_benchmark(dtype: str,
@@ -41,7 +42,7 @@ def allreduce_benchmark(dtype: str,
     torch.cuda.set_device(local_rank)
     cudart.cudaSetDevice(local_rank)

-    mapping = Mapping(world_size, rank, gpus_per_node, world_size)
+    mapping = Mapping(world_size, rank, gpus_per_node, tp_size=world_size)

     if world_size == 1:
         raise RuntimeError("Benchmark must run with mpi_world_size > 1")
@@ -50,6 +51,7 @@ def allreduce_benchmark(dtype: str,
     min_size, max_size, ratio = [int(i) for i in test_range.split(",")]
     inner_loop = 1000

+    init_all_reduce_helper()
     size = min_size
     dtype_size = torch.finfo(torch_dtype).bits // 8
     if mapping.rank == 0 and not no_header:
@@ -89,12 +91,7 @@ def allreduce_benchmark(dtype: str,
                 output.dtype = tllm.str_dtype_to_trt(dtype)

             build_engine = EngineFromNetwork(
-                (builder.trt_builder, net.trt_network),
-                config=CreateConfig(
-                    fp16=(dtype == 'float16'),
-                    bf16=(dtype == 'bfloat16'),
-                    precision_constraints='obey',
-                ))
+                (builder.trt_builder, net.trt_network), config=CreateConfig())

             output = torch.zeros_like(input)

--
2.34.1
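
One hunk of the patch passes `tp_size` to `Mapping` by keyword instead of as the fourth positional argument. The thread does not say why; one plausible reason is that a positional argument silently binds to a different parameter when a constructor's parameter order changes. The sketch below illustrates that effect with a hypothetical stand-in class. `MappingSketch` and its `other_size` parameter are assumptions for illustration, not the real `tensorrt_llm.Mapping` signature.

```python
class MappingSketch:
    """Hypothetical stand-in for a constructor whose parameter order changed.

    Assumption: a new parameter (`other_size`) was inserted before `tp_size`,
    so the fourth positional argument no longer reaches `tp_size`.
    """

    def __init__(self, world_size=1, rank=0, gpus_per_node=8,
                 other_size=1, tp_size=1):
        self.world_size = world_size
        self.rank = rank
        self.gpus_per_node = gpus_per_node
        self.other_size = other_size
        self.tp_size = tp_size


world_size, rank, gpus_per_node = 8, 0, 8

# Positional call: the fourth argument binds to `other_size`, not `tp_size`.
positional = MappingSketch(world_size, rank, gpus_per_node, world_size)

# Keyword call, as in the patch: the value reaches `tp_size` regardless of order.
by_keyword = MappingSketch(world_size, rank, gpus_per_node, tp_size=world_size)

if __name__ == "__main__":
    print("positional:", positional.other_size, positional.tp_size)
    print("by keyword:", by_keyword.other_size, by_keyword.tp_size)
```

Passing the value by keyword keeps the call correct even if further parameters are added before `tp_size` later.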

wpybtw commented 3 weeks ago

Great, it works. Thanks.