How to use sparse weights for inference acceleration

aojunzz / NM-sparsity


How to use sparse weights for inference acceleration #15

Closed SummerParadise-0922 closed 2 years ago

SummerParadise-0922 commented 2 years ago

Thank you for your excellent work! I used your proposed SR-STE to train my model with only a negligible performance drop. Now I am trying to measure the actual acceleration on an RTX 30-series GPU, but I haven't found a convenient way to do so. Could you provide code examples for inference?

aojunzz commented 2 years ago

Thanks for your question.

To get inference acceleration, you need to deploy your model with a recent version of TensorRT (TRT). We have not tried deploying our models on TRT ourselves. For details, you can refer to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#trtexec-flags.
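
As a rough illustration of the TRT route mentioned above, here is a minimal Python sketch. It is not code from this repository. It assumes the trained model has already been exported to ONNX (the file names `model.onnx` and `model.plan` are hypothetical), that the weights follow the 2:4 pattern that Ampere sparse Tensor Cores accelerate, and that groups of four are taken along the last axis of each weight row; the layout TensorRT actually expects for a given layer should be checked against its documentation. The helper names `is_n_m_sparse` and `trtexec_command` are made up for this example. The `--sparsity=enable` flag only allows TensorRT to consider sparse kernels; it may still pick dense ones if they are faster.

```python
import shlex


def is_n_m_sparse(row, n=2, m=4):
    """Return True if every group of m consecutive values in `row`
    contains at most n non-zero entries (e.g. 2:4 sparsity)."""
    if len(row) % m != 0:
        return False
    return all(
        sum(1 for v in row[i:i + m] if v != 0) <= n
        for i in range(0, len(row), m)
    )


def trtexec_command(onnx_path, engine_path):
    """Build a trtexec invocation that lets TensorRT use sparse kernels.

    Assumption: the model was exported to ONNX beforehand; the paths
    passed in are placeholders chosen by the caller.
    """
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--sparsity=enable",  # allow sparse Tensor Core tactics
        "--fp16",             # also allow FP16 precision
    ]


if __name__ == "__main__":
    # Toy weight row: two groups of four, each with two non-zeros.
    weights = [0.5, 0.0, 0.0, -1.2, 0.0, 0.3, 0.1, 0.0]
    print("2:4 pattern holds:", is_n_m_sparse(weights))

    # Print the command instead of running it, since trtexec may not be installed.
    print(shlex.join(trtexec_command("model.onnx", "model.plan")))
```

Running the same command once with `--sparsity=disable` and comparing the latency that trtexec reports is one way to see whether the sparse kernels help on a given GPU.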