NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

The storage format of the compressed matrix in module 'apex.contrib.sparsity' #1706

Open Shan2L opened 11 months ago

Shan2L commented 11 months ago

Hi, I have some questions about the ASP module. The documentation and the related paper on N:M sparsity say that the matrices are compressed and that the metadata is 2-bit. But after calling ASP.prune_trained_model(model, optimizer), I saved my model's weights and found that the matrices were not compressed and the metadata is bool-typed.

Where are the compressed matrices, and where is the 2-bit metadata? Or do I need another step to compress my matrices?

Thanks a lot.

Bobo-y commented 4 months ago

@Shan2L For a torch or ONNX model, the weights are not compressed; they stay dense, with the pruned entries set to zero. Compression only happens when you use trtexec to build a TensorRT engine with the flag --sparsity=enable.
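To make the difference concrete, here is a minimal pure-Python sketch of what 2:4 compression means conceptually: a dense row plus a bool mask (roughly what is visible in the saved PyTorch weights) becomes half as many values plus one 2-bit position index per kept value. This is not apex or TensorRT code. The function names (`make_mask_2_4`, `compress_2_4`, `pack_indices`, `decompress_2_4`) are hypothetical, and the way indices are packed into an integer is an assumption for illustration only; the real metadata layout used on the GPU is not reproduced here.

```python
# Conceptual sketch of 2:4 structured-sparsity compression.
# All names are hypothetical; the real hardware metadata layout is different.

def make_mask_2_4(row):
    """Bool mask keeping the two largest-magnitude values in each group of four."""
    assert len(row) % 4 == 0, "row length must be a multiple of 4"
    mask = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Positions of the two largest magnitudes within this group.
        keep = set(sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2])
        mask.extend(i in keep for i in range(4))
    return mask

def compress_2_4(row, mask):
    """Return (values, indices): kept values and their positions (0..3) per group."""
    values, indices = [], []
    for start in range(0, len(row), 4):
        keep = [i for i in range(4) if mask[start + i]]
        assert len(keep) == 2, "each group of four must keep exactly two values"
        for i in keep:
            values.append(row[start + i])
            indices.append(i)  # 0..3 fits in 2 bits
    return values, indices

def pack_indices(indices):
    """Pack 2-bit indices into one integer, lowest bits first (illustrative layout)."""
    packed = 0
    for k, idx in enumerate(indices):
        packed |= idx << (2 * k)
    return packed

def decompress_2_4(values, indices, length):
    """Rebuild the dense, zero-filled row from the compressed form."""
    dense = [0.0] * length
    for k, (v, idx) in enumerate(zip(values, indices)):
        group_start = (k // 2) * 4  # two kept values per group of four
        dense[group_start + idx] = v
    return dense

row = [0.5, -2.0, 0.1, 3.0, 1.0, 0.0, -4.0, 0.2]
mask = make_mask_2_4(row)                      # bool mask, one entry per weight
values, indices = compress_2_4(row, mask)      # half the values + 2-bit positions
packed = pack_indices(indices)
restored = decompress_2_4(values, indices, len(row))
```

In this sketch the dense row and the bool mask correspond to what the question describes finding in the saved weights, while `values` and `indices` correspond to the compressed form that only appears once an engine is built with sparsity enabled.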