NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License
8.16k stars 1.35k forks

[Question] For an apex sparsity model, when I export a TRT engine with the flag sparsity=enable or force, only some layers pick the sparse implementation. #1780

Closed Bobo-y closed 4 months ago

Bobo-y commented 4 months ago

image

All of the layers above were trained with a sparsity mask, but after converting to a TRT engine, only 3 layers run with sparsity.

image

Bobo-y commented 4 months ago

The reason is roughly as follows: although the weights of the layers above are sparse, TRT found that the non-sparse implementation performed better than the sparse one for those layers. So even when a layer's weights are sparse, TRT will not necessarily pick the sparse implementation for it.
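Before attributing a dense layer to TRT's choice, it can help to confirm that the exported weights really are sparse in the expected pattern. Below is a minimal sketch of such a check, not something taken from apex or TensorRT. It assumes the 2:4 structured pattern (at most 2 nonzeros in every group of 4 consecutive values) and assumes the groups run along the last axis of a weight flattened to 2D. The function name `is_2_to_4_sparse` and the sample values are made up for illustration.

```python
def is_2_to_4_sparse(rows, group=4, max_nonzero=2):
    """Return True if every `group` consecutive values in each row
    contain at most `max_nonzero` nonzero entries.

    Assumption: `rows` is a 2D weight given as an iterable of rows
    (e.g. a nested list, or tensor.tolist()), with groups taken
    along the last axis.
    """
    for row in rows:
        row = list(row)
        # Assumption: a row whose length is not a multiple of the
        # group size is treated as not satisfying the pattern.
        if len(row) % group != 0:
            return False
        for start in range(0, len(row), group):
            nonzeros = sum(1 for v in row[start:start + group] if v != 0)
            if nonzeros > max_nonzero:
                return False
    return True


# Hypothetical weights: `pruned` follows the 2:4 pattern, `dense` does not.
pruned = [
    [0.5, 0.0, 0.0, -1.2, 0.0, 0.3, 0.7, 0.0],
    [0.0, 0.0, 2.0, 1.0, 0.1, 0.0, 0.0, 0.4],
]
dense = [[0.5, 0.1, 0.0, -1.2, 0.0, 0.3, 0.7, 0.0]]

print(is_2_to_4_sparse(pruned))  # True
print(is_2_to_4_sparse(dense))   # False
```

If a layer passes this kind of check and still runs dense, that is consistent with the explanation above: TRT compared the implementations and kept the non-sparse one.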