
(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch
MIT License

#+author: Felix Dangel

#+title: unfoldNd: N-dimensional unfold in PyTorch

[[https://coveralls.io/repos/github/f-dangel/unfoldNd/badge.svg?branch=main]] [[https://img.shields.io/badge/python-3.8+-blue.svg]]

This package uses a numerical trick to perform the operations of [[https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.unfold][ ~torch.nn.functional.unfold~ ]] and [[https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html][ ~torch.nn.Unfold~ ]], also known as ~im2col~. It extends them to higher-dimensional inputs that are currently not supported.

From the [[https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html][PyTorch docs]]:

#+begin_quote
Currently, only 4-D input tensors (batched image-like tensors) are supported.
#+end_quote

~unfoldNd~ implements the operation for 3d and 5d inputs and shows good [[https://f-dangel.github.io/unfoldNd-benchmark/][performance]].
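For orientation, this is what the built-in im2col does on the 4d inputs PyTorch supports natively (a minimal sketch using only ~torch~; the shapes are illustrative):

#+begin_src python
import torch

# Batched image-like input: N=2 samples, C=3 channels, 8x8 pixels
x = torch.randn(2, 3, 8, 8)

# im2col with a 2x2 kernel: each output column stacks the
# C * 2 * 2 = 12 input elements one kernel placement overlaps with
cols = torch.nn.functional.unfold(x, kernel_size=2)

# 7 * 7 = 49 placements for an 8x8 image (stride 1, no padding)
print(cols.shape)  # torch.Size([2, 12, 49])
#+end_src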


** Installation

#+begin_src sh
pip install --user unfoldNd
#+end_src

This package offers the following main functionality:

** Additional functionality (exotic)

It turns out that the multi-dimensional generalization of [[https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.unfold][ ~torch.nn.functional.unfold~ ]] can also be used to generalize [[https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.fold][ ~torch.nn.functional.fold~ ]], exposed through

Keep in mind that, while tested, this feature is not benchmarked. Reasonable performance can be expected nonetheless, as it relies on the (benchmarked) N-dimensional unfold and [[https://pytorch.org/docs/stable/generated/torch.scatter_add.html?highlight=scatter_add#torch.scatter_add][ ~torch.scatter_add~ ]].
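The fold/unfold relationship can be sketched in plain ~torch~ for the 4d case PyTorch supports natively (illustrative shapes): fold (col2im) scatter-adds every column back to its location, so it inverts unfold exactly for non-overlapping patches, and up to overlap counts otherwise.

#+begin_src python
import torch

x = torch.randn(1, 2, 4, 4)

# Non-overlapping 2x2 patches: stride equals the kernel size,
# so scatter-adding the columns back restores the input exactly
cols = torch.nn.functional.unfold(x, kernel_size=2, stride=2)
y = torch.nn.functional.fold(cols, output_size=(4, 4), kernel_size=2, stride=2)
assert torch.allclose(x, y)

# With overlaps (stride 1), fold sums all contributions; dividing
# by the fold of an all-ones tensor recovers the input
cols = torch.nn.functional.unfold(x, kernel_size=2)
y = torch.nn.functional.fold(cols, (4, 4), kernel_size=2)
counts = torch.nn.functional.fold(torch.ones_like(cols), (4, 4), kernel_size=2)
assert torch.allclose(x, y / counts)
#+end_src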


As with input unfolding for convolutions, the same concept can be applied to the input of a transpose convolution. PyTorch offers no comparable functionality, as this operation is very exotic.

The following example explains input unfolding for transpose convolutions by demonstrating the connection to transpose convolution as matrix multiplication.
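The connection can be sketched in plain ~torch~ for the 4d case (illustrative shapes; stride 1, no padding): a transpose convolution multiplies by the transposed matrix view of the kernel, then scatters the columns back with fold (col2im).

#+begin_src python
import torch

N, C_in, C_out, K, H, W = 2, 3, 4, 3, 8, 8
w = torch.randn(C_out, C_in, K, K)
y = torch.randn(N, C_out, H - K + 1, W - K + 1)  # shaped like a conv2d output

# Matrix view of the kernel: (C_out, C_in * K * K)
W_mat = w.reshape(C_out, -1)

# Multiply by the transposed kernel matrix, then scatter-add the
# resulting columns back to image shape with fold (col2im)
cols = W_mat.t() @ y.reshape(N, C_out, -1)
x1 = torch.nn.functional.fold(cols, (H, W), kernel_size=K)

# PyTorch's built-in transpose convolution (stride 1, no padding)
x2 = torch.nn.functional.conv_transpose2d(y, w)
assert torch.allclose(x1, x2, atol=1e-5)
#+end_src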

This functionality is exposed through

TL;DR: If you are willing to sacrifice a bit of RAM, you can get decent speedups with =unfoldNd= over =torch.nn.Unfold= in both the =forward= and =backward= operations.


There is a continuous benchmark comparing the forward pass (and forward-backward pass) run time and peak memory [[https://f-dangel.github.io/unfoldNd-benchmark/][here]]. The settings are:

** Hardware details

The machine running the benchmark has 32GB of RAM and the following components:

** Results

Convolutions can be expressed as a matrix-matrix multiplication between two objects: a matrix view of the kernel and the unfolded input. The latter results from stacking all elements of the input that overlap with the kernel in one convolution step into a matrix. This perspective is sometimes helpful because it allows treating convolutions similarly to linear layers.
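This view can be checked in plain ~torch~ for the 4d case (illustrative shapes; stride 1, no padding):

#+begin_src python
import torch

N, C_in, C_out, K = 2, 3, 4, 3
x = torch.randn(N, C_in, 8, 8)
w = torch.randn(C_out, C_in, K, K)

# Unfolded input: one column per kernel placement
cols = torch.nn.functional.unfold(x, kernel_size=K)  # (N, C_in*K*K, 36)

# Matrix view of the kernel times the unfolded input, reshaped
# back to the spatial output dimensions
out = (w.reshape(C_out, -1) @ cols).reshape(N, C_out, 6, 6)

assert torch.allclose(out, torch.nn.functional.conv2d(x, w), atol=1e-5)
#+end_src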

** The trick

Extracting the input elements that overlap with the kernel can be done with a one-hot kernel of the same dimension and a group convolution.
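A minimal sketch of the trick in plain ~torch~ for the 4d case (illustrative shapes):

#+begin_src python
import torch

N, C, K = 2, 3, 2
x = torch.randn(N, C, 8, 8)

# One-hot kernels: filter p extracts the input element at kernel
# position p. Repeating the K*K filters C times and using groups=C
# applies the same extraction to every input channel independently.
weight = torch.eye(K * K).reshape(K * K, 1, K, K).repeat(C, 1, 1, 1)
patches = torch.nn.functional.conv2d(x, weight, groups=C)

# Flattening the spatial output dimensions reproduces im2col
cols = patches.reshape(N, C * K * K, -1)
assert torch.allclose(cols, torch.nn.functional.unfold(x, kernel_size=K))
#+end_src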

** Applications

This is an incomplete list of settings where the unfolded input may be useful:

Encountered a problem? Open an issue [[https://github.com/f-dangel/unfoldNd/issues][here]].