NVlabs / timeloop

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
https://timeloop.csail.mit.edu/
BSD 3-Clause "New" or "Revised" License

Deconvolution layer #67

Closed Diksha-Moolchandani closed 1 year ago

Diksha-Moolchandani commented 3 years ago

Hi,

Can a deconvolution / transposed-convolution layer be modeled in Timeloop? Is there an example of specifying this problem in YAML?

Deconvolution is essentially convolution with upsampling: the input feature map is upsampled by inserting zeros between the original values, and the upsampled map is then convolved with a filter.

So, my plan is to use the upsampled feature map as the input and treat the layer as a normal convolution. Then I can either use the construct_workloads.py script from timeloop-accelergy-exercises or write the YAML file by hand.

Is this the right approach?
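A minimal Python sketch of that plan (helper names are mine, not from Timeloop), showing the zero-insertion upsampling followed by a plain "valid" convolution. Note that to reproduce a transposed convolution exactly you would also need a zero border of FH-1 / FW-1 around the upsampled map and a flipped kernel; this only illustrates the zero-insertion step:

```python
def upsample_with_zeros(x, stride):
    """Insert (stride - 1) zeros between neighboring elements of a 2-D list."""
    ih, iw = len(x), len(x[0])
    y = [[0] * ((iw - 1) * stride + 1) for _ in range((ih - 1) * stride + 1)]
    for h in range(ih):
        for w in range(iw):
            y[h * stride][w * stride] = x[h][w]
    return y


def conv2d(x, k):
    """Plain 'valid' 2-D convolution (stride 1, no padding)."""
    ih, iw = len(x), len(x[0])
    fh, fw = len(k), len(k[0])
    y = [[0] * (iw - fw + 1) for _ in range(ih - fh + 1)]
    for h in range(ih - fh + 1):
        for w in range(iw - fw + 1):
            for s in range(fh):
                for r in range(fw):
                    y[h][w] += x[h + s][w + r] * k[s][r]
    return y
```

With this, `conv2d(upsample_with_zeros(inp, stride), ker)` is the "upsample, then convolve normally" workload you would hand to Timeloop.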

vmiheer commented 3 years ago

As you describe it, the upsampling is done by just inserting zeros, and those zeros get multiplied with the kernel only to produce zeros. A more efficient approach is to not upscale before the convolution at all.

http://d2l.ai/chapter_computer-vision/transposed-conv.html shows MXNet/Python code for deconvolution. Written out in C/C++, the equivalent loop nest would be:

for (w = 0; w < IW; ++w)
    for (h = 0; h < IH; ++h)
        for (r = 0; r < FW; ++r)
            for (s = 0; s < FH; ++s)
                Y[h+s][w+r] += I[h][w] * W[s][r];
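The same loop nest as a runnable Python sketch (function name is mine): each input element scatters a scaled copy of the kernel into the output, so a 2x2 input and 2x2 kernel yield a 3x3 output.

```python
def transposed_conv2d(inp, ker):
    """Direct transposed convolution, mirroring the C loop nest above."""
    ih, iw = len(inp), len(inp[0])
    fh, fw = len(ker), len(ker[0])
    # Output is (IH + FH - 1) x (IW + FW - 1), with no upsampled zeros ever touched.
    out = [[0] * (iw + fw - 1) for _ in range(ih + fh - 1)]
    for w in range(iw):
        for h in range(ih):
            for r in range(fw):
                for s in range(fh):
                    out[h + s][w + r] += inp[h][w] * ker[s][r]
    return out
```

For the 2x2 example used on that d2l page, `transposed_conv2d([[0, 1], [2, 3]], [[0, 1], [2, 3]])` produces `[[0, 0, 1], [0, 4, 6], [4, 12, 9]]`.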

This can be expressed in Timeloop as follows:

timeloop-model.yaml

```yaml
architecture:
  name: Chip
  subtree:
  - local:
    - attributes:
        depth: 128
        read_bandwidth: 4
        width: 16
        word-bits: 16
        write_bandwidth: 4
      class: regfile
      name: RegisterFile
    - attributes:
        datawidth: 16
      class: intmac
      name: MACC
    name: PE0
  version: 0.3
mapping:
- factors: H=2 W=2 R=2 S=2
  permutation: R S H W
  target: RegisterFile
  type: temporal
problem:
  instance:
    H: 2
    R: 2
    S: 2
    W: 2
  shape:
    data-spaces:
    - name: Weights
      projection:
      - - - R
      - - - S
    - name: Inputs
      projection:
      - - - H
      - - - W
    - name: Outputs
      projection:
      - - - H
        - - S
      - - - W
        - - R
      read-write: true
    dimensions:
    - R
    - S
    - H
    - W
    name: CNN-Layer
```

Running this example indeed reports MACCs = 16, and a utilized capacity of 4 for Inputs and Weights and 9 for Outputs, as expected and as shown in the diagram on that website.
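Those numbers follow directly from the problem instance above; a quick arithmetic check:

```python
# Problem instance from the YAML above: H = W = R = S = 2.
H = W = R = S = 2
maccs = H * W * R * S                 # one MACC per loop-nest iteration
inputs = H * W                        # utilized input capacity
weights = R * S                       # utilized weight capacity
outputs = (H + S - 1) * (W + R - 1)   # output tile is (H+S-1) x (W+R-1)
```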