lix19937 / tensorrt-insight

Deep insights into TensorRT, including but not limited to QAT, PTQ, plugins, triton_inference, and CUDA

bug list #29

Open lix19937 opened 4 months ago

lix19937 commented 4 months ago
Title Notes
Driveworks5.0-camera_replay
Driveworks5.0-cgf
Driveworks5.0-make user demo
Engine file by trtexec different from user's cpp project
Floating-point compute capacity does not match Orin-X's datasheet
GPU Compute Time in trtexec's output
How is the performance of einsum operator
How to debug foreign node
How to do concat op quantization in qat
How to further optimize batched gridsample2d
How to get FLOP
How to improve ptq accuracy for model with plugin
How to improve the point cloud registration accuracy of CUDA ICP
How to obtain DLA resource usage information
Issue of sample_cgf_dwchannel
Multi-stream sync (stream capture) in enqueue
Myelin 700 error in trtexec inference
Myelin foreign node takes too much time
Only 11 ARM cores and 4MB L3 cache observed in Orin DevKit
Orin DevKit cannot mount a USB drive
Orin DevKit cannot reboot
Orin DevKit reports "Illegal instruction" with gdb
Orin DevKit trtexec build plan error
Some problems converting a dynamic graph (torch) to a TRT plan
Some problems of QAT sample
Some tests about cuda memory selection
Technical exchange of transformer-based algorithm porting with TensorRT
Technical exchange on cudaGraph
TensorRT Sparse Convolution Support
The forward inference of dcnv2 implemented with CUDA takes too long
Time consumption of QAT
TRT inference with dynamic batch, H, and W
trtexec convert onnx error
trtexec int8 time is no less than fp32
Using multiple streams does not achieve concurrent execution
Why is the data unchanged after reformat
Why does the same fusion method take nearly twice as long between qat and ptq
Why does the QAT2PTQ tool not work