Open chhzh123 opened 1 year ago
We have just released our implementation, and you can view an example of how to use it here: https://github.com/spcl/sten/blob/f2a5aa05b510f3910e15343af1560c0f938b94a2/tests/test_nmg.py#L6-L30
It seems that your code is based on CPU. I tried timing line 30 and benchmarking it against the dense version, output = model(input), and it seems a lot slower. Also, how can I run it on the GPU? By calling .cuda() on the model and the input?
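For anyone repeating this comparison, here is a minimal timing sketch. It uses only the standard library, so it works for both the sparse and the dense call. `time_call` and its parameters are hypothetical names, not part of this repository. The optional `sync` hook is there because CUDA kernels launch asynchronously, so a GPU measurement would need something like `torch.cuda.synchronize` passed in.

```python
import time
from statistics import median


def time_call(fn, warmup=3, repeats=10, sync=None):
    """Return the median wall-clock time (seconds) of calling fn().

    fn     -- zero-argument callable, e.g. lambda: model(input)
    sync   -- optional zero-argument callable run before reading the clock;
              on GPU this would be torch.cuda.synchronize (an assumption
              about the setup, not something this helper imports)
    """
    # Warm-up runs absorb one-off costs such as lazy initialisation.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # make sure queued asynchronous work is included
        samples.append(time.perf_counter() - start)
    return median(samples)


if __name__ == "__main__":
    # Stand-in workload; replace with the sparse and dense forward passes.
    elapsed = time_call(lambda: sum(i * i for i in range(10_000)))
    print(f"median: {elapsed * 1e3:.3f} ms")
```

Timing both versions with the same helper, warm-up count, and input keeps the comparison fair. The median is less sensitive to outliers than the mean.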
Hi, thanks for open-sourcing this great work, which is very helpful for sparse deep learning workloads. I noticed there is an 𝑛:𝑚:𝑔 sparsity layout in your paper, but I could not find the GroupedNMSparsifier class in this repository. Could you kindly point me to that implementation?

You also mentioned "CPU implementations for 𝑛:𝑚:𝑔 sparsity were compiled with GCC 8.4", but it seems this repository only contains the Python code. Will you release the kernel implementation later?
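To make clear what layout I mean, here is a rough sketch of plain n:m magnitude pruning, where the n largest-magnitude values are kept in every block of m consecutive values. This is only my own illustration in pure Python: `nm_prune` is a hypothetical name, and it does not model the grouping dimension g from the paper or reflect how GroupedNMSparsifier is implemented.

```python
def nm_prune(row, n, m):
    """Zero all but the n largest-magnitude entries in each block of m values.

    Illustrative only: plain n:m sparsity, without the grouping (g) dimension.
    """
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")

    out = []
    for start in range(0, len(row), m):
        block = row[start:start + m]
        # Indices of the n entries with the largest absolute value.
        order = sorted(range(m), key=lambda i: abs(block[i]), reverse=True)
        keep = set(order[:n])
        out.extend(v if i in keep else 0.0 for i, v in enumerate(block))
    return out


if __name__ == "__main__":
    weights = [0.1, -2.0, 0.3, 1.5, 4.0, 0.0, -0.5, 0.2]
    print(nm_prune(weights, n=2, m=4))
```

With n=2 and m=4, each block of four weights ends up with at most two nonzeros.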