[Question] For an apex sparsity model, when I export a TRT engine with the flag sparsity=enable or force, only some layers pick the sparse implementation.

NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

BSD 3-Clause "New" or "Revised" License

[Question] For an apex sparsity model, when I export a TRT engine with the flag sparsity=enable or force, only some layers pick the sparse implementation. #1780

Closed Bobo-y closed 4 months ago

Bobo-y commented 4 months ago

All of the layers above were trained with a sparsity mask, but when I convert the model to a TRT engine, only 3 layers run with the sparse implementation.

Bobo-y commented 4 months ago

The reason is roughly as follows: although the weights of the layers above are sparse, TRT found that for most of them the implementation without sparsity is better than the sparse one. So even if a layer's weights are sparse, TRT will not necessarily use a sparse implementation for it.
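Before concluding that TRT simply preferred the dense kernels, it can help to rule out the other possibility: that a layer's weights do not actually follow the 2:4 pattern (at most 2 nonzeros in every 4 consecutive values) that apex ASP produces by default and that TRT's sparse kernels require. Below is a minimal sketch of such a check. The helper name `is_two_four_sparse` is hypothetical, and it assumes the weight has already been laid out as a list of rows with the input-channel (reduction) dimension last.

```python
def is_two_four_sparse(rows, group=4, max_nonzero=2):
    """Return True if every group of `group` consecutive values in each row
    has at most `max_nonzero` nonzero entries (the 2:4 pattern by default).

    Hypothetical helper: assumes `rows` is a 2D list whose last dimension
    is the input-channel (reduction) dimension of the layer.
    """
    for row in rows:
        if len(row) % group != 0:
            return False  # row length must split evenly into groups
        for start in range(0, len(row), group):
            chunk = row[start:start + group]
            if sum(1 for v in chunk if v != 0) > max_nonzero:
                return False
    return True


# Toy weights: `pruned` follows 2:4, `dense` has 3 nonzeros in its first group.
pruned = [[0.5, 0.0, 0.0, -1.2, 0.0, 0.3, 0.7, 0.0]]
dense = [[0.5, 0.1, 0.0, -1.2, 0.0, 0.3, 0.7, 0.0]]
print(is_two_four_sparse(pruned))  # True
print(is_two_four_sparse(dense))   # False
```

For a linear layer in PyTorch, `weight.tolist()` should give rows in this layout. Convolution weights would first need to be rearranged so that the input-channel dimension comes last. If every layer passes this check and TRT still picks dense kernels for some of them, that is consistent with the explanation above.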