NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

[Question] For an apex sparsity model, when I export a TRT engine with the flag sparsity=enable or force, only some layers pick the sparse implementation. #1780

Closed Bobo-y closed 8 months ago

Bobo-y commented 8 months ago

image

The layers above were all trained with a sparsity mask, but after converting to a TRT engine, only 3 layers run with sparsity.

image

Bobo-y commented 8 months ago

The reason is roughly as follows: although the weights of the layers above are sparse, TRT found that the non-sparse implementation performed better than the sparse one for most of them. So a layer with sparse weights is not guaranteed to get a sparse implementation; TRT only picks one when it is the better choice for that layer.

laurenlong commented 3 months ago

Can I ask what settings are needed to see this output? I ran ./trtexec --onnx=model_weights_sparse.onnx --saveEngine=model_weights_sparse.trt --sparsity=enable --fp16 >result-model_weights_sparse-fp16.txt 2>&1 but I don't see similar output. Thank you!

Bobo-y commented 3 months ago

Maybe try adding --verbose to the trtexec command.
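A rough sketch of how that could look when scripted from Python: it reuses the trtexec flags from the command above, adds --verbose, and keeps only the log lines that mention sparsity. The helper names (build_trtexec_command, sparsity_lines, build_and_report) are made up for illustration, and the filter assumes the relevant log lines contain the word "sparsity", which may not hold for every TensorRT version.

```python
import subprocess
from typing import List


def build_trtexec_command(
    onnx_path: str, engine_path: str, trtexec: str = "./trtexec"
) -> List[str]:
    """Return the trtexec invocation from this thread with --verbose appended."""
    return [
        trtexec,
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--sparsity=enable",
        "--fp16",
        "--verbose",
    ]


def sparsity_lines(log_text: str) -> List[str]:
    """Keep only log lines that mention sparsity (case-insensitive).

    Assumption: the per-layer sparsity report uses the word "sparsity".
    """
    return [line for line in log_text.splitlines() if "sparsity" in line.lower()]


def build_and_report(onnx_path: str, engine_path: str) -> List[str]:
    """Run trtexec, merge stderr into stdout (like 2>&1), and filter the log."""
    result = subprocess.run(
        build_trtexec_command(onnx_path, engine_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return sparsity_lines(result.stdout)
```

Calling build_and_report("model_weights_sparse.onnx", "model_weights_sparse.trt") needs a local trtexec binary. If the filtered list comes back empty, reading the full verbose log is the safer fallback.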

laurenlong commented 3 months ago

Thank you.