NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

FLUX model: engine_from_bytes(bytes_from_path(self.engine_path)) fails with OutOfMemory #4207

Open algorithmconquer opened 1 week ago

algorithmconquer commented 1 week ago

I import engine_from_bytes with "from polygraphy.backend.trt import engine_from_bytes". When I run engine_from_bytes(bytes_from_path(self.engine_path)) with FLUX-dev on a single L40 GPU, it fails with OutOfMemory. How can I solve this?
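One way to narrow down an OutOfMemory at deserialization is to compare the size of the serialized engine file with the free device memory before calling engine_from_bytes. The following is a minimal sketch, not part of the original report: the helper names, the 1.2 headroom factor, and the use of PyTorch to query free memory are all assumptions, and the serialized size is only a rough lower bound on what the engine needs once loaded.

```python
import os


def engine_fits_in_memory(engine_path, free_bytes, headroom=1.2):
    """Rough check: does the serialized engine, scaled by an assumed
    headroom factor, fit into free_bytes of device memory?

    The headroom value is an illustrative guess, not a TensorRT rule.
    """
    engine_size = os.path.getsize(engine_path)
    return engine_size * headroom <= free_bytes


def query_free_gpu_bytes(device=0):
    """Return free device memory in bytes using PyTorch if it is
    installed and a CUDA device is visible; otherwise return None."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info(device)
    return free_bytes
```

If query_free_gpu_bytes() returns a number and engine_fits_in_memory(self.engine_path, that_number) is False, the engine is probably too large for the memory currently free on the GPU, for example because other engines of the pipeline are already loaded.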

lix19937 commented 1 week ago

Try using trtexec, with TensorRT version >= 8.6.
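As a sketch of what that could look like, the snippet below loads an existing engine with trtexec through its --loadEngine option, which checks whether the engine deserializes outside the Python pipeline. The function names and the flux.plan path are hypothetical, and the call is skipped when trtexec is not on PATH.

```python
import shutil
import subprocess


def build_trtexec_load_command(engine_path):
    """Build the argument list for loading a serialized engine with trtexec."""
    return ["trtexec", f"--loadEngine={engine_path}"]


def run_trtexec_load(engine_path):
    """Run trtexec on the engine and return the CompletedProcess,
    or None if trtexec is not installed on this machine."""
    if shutil.which("trtexec") is None:
        return None
    return subprocess.run(
        build_trtexec_load_command(engine_path),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # "flux.plan" is a placeholder path, not taken from the issue.
    result = run_trtexec_load("flux.plan")
    if result is None:
        print("trtexec not found on PATH")
    else:
        print(result.stdout[-2000:])
```

If trtexec also runs out of memory on the same engine, the problem is likely the engine size relative to the GPU rather than the Polygraphy loading code.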

yuanyao-nv commented 1 week ago

Related issue: https://github.com/NVIDIA/TensorRT/issues/4205