CallmeZhangChenchen opened this issue 2 weeks ago
```
[08/26/2024-08:05:39] [W] [TRT] Engine generation failed with backend strategy 4.
Error message: [randomFill.cpp::replaceFillNodesForMyelin::89] Error Code 2: Internal Error (Assertion node->backend == Backend::kMYELIN failed. ).
Skipping this backend strategy.
```
There was a warning when the model was converted.
I think I found out why; I'll take the time to study it.
According to the issue, the problem seems to be with a node that was offloaded to one of our backend DL graph compilers, so we can investigate it internally. Can you confirm the source of the screenshot showing the ForeignNode?
> I think I found out why; I'll take the time to study it.

@moraxu Thanks for your attention.
Using `nsys profile -o analysis_test trtexec ***`, I exported a profile and then opened it in Nsight Systems. There is a time-consuming operation that takes 1.6 s.
The main time-consuming span is between the input op pitchf and /dec/m_source/l_tanh/Tanh, so my solution is to move this part out of the network for now and run it with PyTorch instead.
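For reference, a minimal sketch of what this hybrid setup could look like; this is my assumption of the workflow, not the exact code used here, and `sine_gen` / `trt_infer` are placeholder callables (the PyTorch sine-generator module and a wrapper around the cropped TensorRT engine):

```python
import torch

# Sketch only: run the slow span (pitchf -> /dec/m_source/l_tanh/Tanh)
# in eager PyTorch, and feed its output into the cropped TRT engine.
# `sine_gen` and `trt_infer` are hypothetical callables, not real APIs
# from this repository.
@torch.no_grad()
def hybrid_infer(sine_gen, trt_infer, pitchf, upp):
    # 1) The time-consuming part is computed in PyTorch (a few ms).
    sine_wavs, uv, _ = sine_gen(pitchf, upp)
    # 2) The precomputed sine waves become a plain input tensor of the
    #    exported graph (the `x` of the modified forward() quoted below).
    return trt_infer(sine_wavs)
```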
@CallmeZhangChenchen, sorry for the late follow-up - is this on Windows 10 or 11? If not, could you provide a specific OS version for us to reproduce?
@moraxu Thanks! OS version: Ubuntu 22.04.4 LTS
I've filed an internal bug, thank you.
@CallmeZhangChenchen, could you provide the PyTorch inference script as well? The issue is about a comparison with PyTorch; it could be that TRT has a bug, or that the PyTorch script is not actually doing the same workload.
Could you also provide the full trtexec --verbose log from your end, if possible?
@moraxu I may not be able to provide a complete runnable PyTorch script, because I have optimized the code here, and the model has now gone from 800 ms down to 27 ms.
The original PyTorch project: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Exporting to ONNX with the provided script may not be entirely smooth: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/infer/modules/onnx/export.py
trtexec --verbose log: https://drive.google.com/file/d/1Uc_m2gP9QhjussV-rkJRsLPp7AdE2XLE/view?usp=drive_link
To get rid of the time-consuming code, I modified https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/infer/lib/infer_pack/models_onnx.py:
```python
def forward(self, x, upp=None):
    # The time-consuming sine generator is skipped inside the exported graph:
    # sine_wavs, uv, _ = self.l_sin_gen(x, upp)
    # if self.is_half:
    #     sine_wavs = sine_wavs.half()
    # sine_merge = self.l_tanh(self.l_linear(sine_wavs))
    sine_merge = self.l_tanh(self.l_linear(x))
    return sine_merge, None, None  # noise, uv
```
This part takes only a few ms in PyTorch.
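The intent, as described above, is that the sine waves are precomputed in PyTorch outside the graph and fed in as x, so the exported graph keeps only the cheap l_linear + l_tanh path.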
Cropped ONNX model: https://drive.google.com/file/d/1ucjIDLpJfOMFIWVY8NKav6fa05KF4icd/view?usp=drive_link
Thank you, I'll pass the info on
Description
Under the same conditions, my model's inference with TensorRT is several times slower than with PyTorch.
Environment
TensorRT Version: 10.3.0 (trtexec reports [TensorRT v100300])
NVIDIA GPU: A30 & 4090
NVIDIA Driver Version: 535.104.05
CUDA Version: release 12.4, V12.4.131
CUDNN Version: **
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
https://drive.google.com/file/d/1V3wZFEyO6s3szE6tPhofa-bkY0Lqwu8M/view?usp=drive_link
Steps To Reproduce
PyTorch, with the same input/output sizes plus pre- and post-processing, takes only 300 ms.
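For what it's worth, a minimal timing sketch along these lines (assuming a generic `model` and `inputs`, not the exact script behind the 300 ms number) accounts for CUDA's asynchronous kernel launches:

```python
import time
import torch

def time_pytorch(model, inputs, warmup=10, iters=50):
    """Average latency in ms; `model` and `inputs` are placeholders."""
    with torch.no_grad():
        for _ in range(warmup):      # absorb one-time costs (autotuning, allocator growth)
            model(*inputs)
        torch.cuda.synchronize()     # launches are async: sync before starting the clock
        start = time.perf_counter()
        for _ in range(iters):
            model(*inputs)
        torch.cuda.synchronize()     # wait for all kernels to finish
    return (time.perf_counter() - start) / iters * 1e3
```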