[REF] Use alternative approach where kernel is repeated C_in * N times

f-dangel / unfoldNd

(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch

MIT License

84 stars, 6 forks

[REF] Use alternative approach where kernel is repeated C_in * N times #14

Closed: f-dangel closed this 3 years ago

f-dangel commented 3 years ago

This is an alternative approach to #12 that uses more groups, because the originally proposed optimization (using a single group) degraded performance.

- [x] Merge, then re-run the benchmark
- [x] After inspecting performance, decide which approach is best: 1, C_in, or C_in * N groups

f-dangel commented 3 years ago

Run time and memory are similar for C_in and C_in * N groups. I will stick with the approach that uses C_in groups for now, because (i) its one-hot kernel is N times smaller, and (ii) the code avoids an extra reshape.
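
For readers unfamiliar with the idea, here is a minimal sketch of what "unfold via a one-hot kernel with C_in groups" can look like in the 2d case. It is an illustration under assumptions, not the code from this PR: the function name `unfold_via_grouped_conv` is hypothetical, and padding, stride, and dilation are left at their defaults. The result is compared against `torch.nn.functional.unfold`.

```python
import torch
import torch.nn.functional as F


def unfold_via_grouped_conv(x, kernel_size):
    """Hypothetical sketch: 2d unfold (im2col) as a grouped convolution.

    Assumes no padding, stride 1, and dilation 1.
    """
    batch, c_in, _, _ = x.shape
    k_h, k_w = kernel_size
    k_numel = k_h * k_w

    # One-hot kernel for a single channel: output channel i selects
    # kernel position i (row-major over the kernel window).
    one_hot = torch.eye(k_numel, dtype=x.dtype, device=x.device)
    one_hot = one_hot.reshape(k_numel, 1, k_h, k_w)

    # Repeat the kernel once per input channel. With groups=c_in, each
    # copy only sees its own channel.
    weight = one_hot.repeat(c_in, 1, 1, 1)  # (c_in * k_numel, 1, k_h, k_w)

    out = F.conv2d(x, weight, groups=c_in)  # (batch, c_in * k_numel, H_out, W_out)
    return out.reshape(batch, c_in * k_numel, -1)


# Compare with PyTorch's built-in unfold on a small random input.
x = torch.randn(2, 3, 5, 6)
ours = unfold_via_grouped_conv(x, (2, 3))
reference = F.unfold(x, (2, 3))
matches = torch.allclose(ours, reference)
```

In this sketch the kernel has `c_in * k_numel` output channels with a single input channel each, so its size grows with the number of groups the kernel is repeated for. That is the trade-off weighed above when choosing between 1, C_in, and C_in * N groups.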