Closed ram-cherukuri closed 1 year ago
This is great!
I am learning to program the DLA with cuDLA in standalone mode, using the CUDA sample cuDLAStandaloneMode.
The loadable binary was created with TensorRT using this command: trtexec --deploy=/usr/src/tensorrt/data/resnet50/ResNet50_N2.prototxt --model=/usr/src/tensorrt/data/resnet50/ResNet50_fp32.caffemodel --output=fc1000 --useDLACore=0 --int8 --memPoolSize=dlaSRAM:1 --inputIOFormats=int8:chw --outputIOFormats=int8:chw --saveEngine=./resnet_50_int8_chw.bin --buildOnly --safe
The inputs and outputs of the loadable binary are int8, while those of the original model are fp32. The sample mentioned above contains no code showing how to pre-process the fp32 input to int8 or how to post-process the int8 output back to fp32.
So, could you post a sample that demonstrates how to prepare fp32 input for an int8 DLA model?
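For context, here is a minimal sketch of the kind of conversion I am asking about. It assumes symmetric per-tensor quantization with a single scale factor and a linear (chw) layout. The helper names quantizeToInt8 and dequantizeToFloat are hypothetical and are not part of cuDLA or the sample, and the scale value has to match whatever the engine was built with, which I do not know how to obtain here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a cuDLA API): fp32 -> int8 with one per-tensor scale.
// Assumption: q = clamp(round(x / scale), -128, 127).
std::vector<int8_t> quantizeToInt8(const std::vector<float>& src, float scale) {
    std::vector<int8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        // nearbyint rounds to nearest, ties to even, in the default rounding mode.
        float q = std::nearbyint(src[i] / scale);
        q = std::min(127.0f, std::max(-128.0f, q));  // saturate to the int8 range
        dst[i] = static_cast<int8_t>(q);
    }
    return dst;
}

// Hypothetical helper (not a cuDLA API): int8 -> fp32, x = q * scale.
std::vector<float> dequantizeToFloat(const std::vector<int8_t>& src, float scale) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = static_cast<float>(src[i]) * scale;
    }
    return dst;
}
```

The idea would be to call quantizeToInt8 on the pre-processed fp32 image before copying it into the input tensor buffer, and dequantizeToFloat on the fc1000 output after the task completes, each with the scale of the corresponding tensor.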
Please use this as a forum to tell us what types of samples would be most useful to you for leveraging DLA effectively in your application development. We will try our best to address the requests.