[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
[BUG] numBlocks in the Y dimension is larger than needed for FetchOnDemand_no_fusion #323
Open
yokosyun opened 3 months ago
Is there an existing issue for this?
Current Behavior
fetch_on_demand_gemm_no_fusion is launched with the wrong numBlocks in the Y dimension, so unnecessary blocks are executed.
Currently, cur_nnz is divided by 16 (BLOCK_SIZE).
Expected Behavior
cur_nnz must be divided by 16 (BLOCK_SIZE) * 4 (N_LOOP) = 64 to get the correct numBlocks in the Y dimension.
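To make the difference concrete, here is a minimal host-side sketch of the two grid-size calculations. It is not taken from the TorchSparse source: the helper names (`ceil_div`, `num_blocks_y_current`, `num_blocks_y_expected`) are hypothetical, the constants are the values quoted in this issue, and rounding up with a ceiling division is an assumption about how the launch code handles a partial last block.

```cpp
#include <cassert>

// Values quoted in the issue text; not verified against the kernel source.
constexpr int BLOCK_SIZE = 16;  // block size used in the divisor today
constexpr int N_LOOP = 4;       // extra factor the issue says is missing

// Ceiling division for positive integers (assumed rounding behavior).
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical helper: Y-dimension block count as described under
// "Current Behavior" (cur_nnz divided by BLOCK_SIZE only).
constexpr int num_blocks_y_current(int cur_nnz) {
  return ceil_div(cur_nnz, BLOCK_SIZE);
}

// Hypothetical helper: Y-dimension block count as described under
// "Expected Behavior" (cur_nnz divided by BLOCK_SIZE * N_LOOP).
constexpr int num_blocks_y_expected(int cur_nnz) {
  return ceil_div(cur_nnz, BLOCK_SIZE * N_LOOP);
}

// Compile-time check for cur_nnz = 1000: 63 blocks today versus 16 expected.
static_assert(num_blocks_y_current(1000) == 63, "ceil(1000 / 16)");
static_assert(num_blocks_y_expected(1000) == 16, "ceil(1000 / 64)");
```

Under these assumptions, the current divisor launches roughly N_LOOP times as many blocks in Y as the expected one (63 versus 16 for cur_nnz = 1000).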
Environment
Anything else?
Is it not possible to submit a bugfix PR for this?