Open lix19937 opened 4 months ago
Title | Notes |
---|---|
DriveWorks 5.0-camera_replay | |
DriveWorks 5.0-cgf | |
DriveWorks 5.0-make user demo | |
Engine file by trtexec different from user's cpp project | |
Floating-point compute capacity does not match Orin-X's datasheet | |
GPU Compute Time in trtexec's output | |
How is the performance of einsum operator | |
How to debug foreign node | |
How to do concat op quantization in qat | |
How to further optimize batched gridsample2d | |
How to get FLOP | |
How to improve ptq accuracy for model with plugin | |
How to improve the point cloud registration accuracy of CUDA ICP | |
How to obtain DLA resource usage information | |
Issue of sample_cgf_dwchannel | |
Multi-stream sync (stream capture) in enqueue | |
Myelin 700 error in trtexec inference | |
Myelin foreign node takes much time | |
Only 11 Arm cores and 4MB L3 cache observed on Orin DevKit | |
Orin DevKit cannot mount USB drive | |
Orin DevKit cannot reboot | |
Orin DevKit reports Illegal instruction with gdb | |
Orin DevKit trtexec build plan error | |
Some problems converting a dynamic graph (torch) to a TRT plan | |
Some problems of QAT sample | |
Some tests about cuda memory selection | |
Technical exchange of transformer-based algorithm porting with TensorRT | |
Technical exchange on cudaGraph | |
TensorRT Sparse Convolution Support | |
The forward inference of dcnv2 implemented with CUDA takes too long | |
Time consumption of QAT | |
TRT dynamic batch h w infer | |
trtexec convert onnx error | |
trtexec int8 time is no less than fp32 | |
Using multiple streams does not achieve concurrent execution | |
Why data is not changed after reformat | |
Why does the same fusion method take nearly twice as long in QAT as in PTQ | |
Why the QAT2PTQ tool does not work |