NVlabs / timeloop

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
https://timeloop.csail.mit.edu/
BSD 3-Clause "New" or "Revised" License

Deconvolution layer #67

Closed Diksha-Moolchandani closed 1 year ago

Diksha-Moolchandani commented 3 years ago

Hi,

Can a deconvolution / transposed-convolution layer be modeled in Timeloop? Is there an example of specifying this problem in YAML?

Deconvolution is essentially convolution with upsampling: the input feature map is upsampled by inserting zeros between the original values, and the upsampled map is then convolved with a filter.

So, my plan is to use the upsampled feature map as the input and treat the layer as a normal convolution. Then I can either use the construct_workloads.py script from timeloop-accelergy-exercises or write the YAML file by hand.

Is this the right approach?
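A minimal Python sketch of that plan (helper names are mine, not from Timeloop), showing the zero-insertion upsampling followed by a plain "valid" convolution. Note that to reproduce a transposed convolution exactly you would also need a zero border of FH-1 / FW-1 around the upsampled map and a flipped kernel; this only illustrates the zero-insertion step:

```python
def upsample_with_zeros(x, stride):
    """Insert (stride - 1) zeros between neighboring elements of a 2-D list."""
    ih, iw = len(x), len(x[0])
    y = [[0] * ((iw - 1) * stride + 1) for _ in range((ih - 1) * stride + 1)]
    for h in range(ih):
        for w in range(iw):
            y[h * stride][w * stride] = x[h][w]
    return y


def conv2d(x, k):
    """Plain 'valid' 2-D convolution (stride 1, no padding)."""
    ih, iw = len(x), len(x[0])
    fh, fw = len(k), len(k[0])
    y = [[0] * (iw - fw + 1) for _ in range(ih - fh + 1)]
    for h in range(ih - fh + 1):
        for w in range(iw - fw + 1):
            for s in range(fh):
                for r in range(fw):
                    y[h][w] += x[h + s][w + r] * k[s][r]
    return y
```

With this, `conv2d(upsample_with_zeros(inp, stride), ker)` is the "upsample, then convolve normally" workload you would hand to Timeloop.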

vmiheer commented 3 years ago

As you describe it, the upsampling is done by just inserting zeros, and those zeros get multiplied with the kernel only to produce zeros. A more efficient approach is to not upscale before the convolution at all.

http://d2l.ai/chapter_computer-vision/transposed-conv.html shows MXNet/Python code for deconvolution. Written out in C/C++, the equivalent loop nest would be:

for (w = 0; w < IW; ++w)
    for (h = 0; h < IH; ++h)
        for (r = 0; r < FW; ++r)
            for (s = 0; s < FH; ++s)
                Y[h+s][w+r] += I[h][w] * W[s][r];
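The same loop nest as a runnable Python sketch (function name is mine): each input element scatters a scaled copy of the kernel into the output, so a 2x2 input and 2x2 kernel yield a 3x3 output.

```python
def transposed_conv2d(inp, ker):
    """Direct transposed convolution, mirroring the C loop nest above."""
    ih, iw = len(inp), len(inp[0])
    fh, fw = len(ker), len(ker[0])
    # Output is (IH + FH - 1) x (IW + FW - 1), with no upsampled zeros ever touched.
    out = [[0] * (iw + fw - 1) for _ in range(ih + fh - 1)]
    for w in range(iw):
        for h in range(ih):
            for r in range(fw):
                for s in range(fh):
                    out[h + s][w + r] += inp[h][w] * ker[s][r]
    return out
```

For the 2x2 example used on that d2l page, `transposed_conv2d([[0, 1], [2, 3]], [[0, 1], [2, 3]])` produces `[[0, 0, 1], [0, 4, 6], [4, 12, 9]]`.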

This can be expressed in Timeloop as follows:

timeloop-model.yaml

```yaml
architecture:
  name: Chip
  subtree:
  - local:
    - attributes:
        depth: 128
        read_bandwidth: 4
        width: 16
        word-bits: 16
        write_bandwidth: 4
      class: regfile
      name: RegisterFile
    - attributes:
        datawidth: 16
      class: intmac
      name: MACC
    name: PE0
  version: 0.3
mapping:
- factors: H=2 W=2 R=2 S=2
  permutation: R S H W
  target: RegisterFile
  type: temporal
problem:
  instance:
    H: 2
    R: 2
    S: 2
    W: 2
  shape:
    data-spaces:
    - name: Weights
      projection:
      - - - R
      - - - S
    - name: Inputs
      projection:
      - - - H
      - - - W
    - name: Outputs
      projection:
      - - - H
        - - S
      - - - W
        - - R
      read-write: true
    dimensions:
    - R
    - S
    - H
    - W
    name: CNN-Layer
```

Running this example indeed reports MACCs = 16, and a utilized capacity of 4 for Inputs and Weights and 9 for Outputs, as expected and as shown in the diagram on that website.
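Those numbers follow directly from the problem instance above; a quick arithmetic check:

```python
# Problem instance from the YAML above: H = W = R = S = 2.
H = W = R = S = 2
maccs = H * W * R * S                 # one MACC per loop-nest iteration
inputs = H * W                        # utilized input capacity
weights = R * S                       # utilized weight capacity
outputs = (H + S - 1) * (W + R - 1)   # output tile is (H+S-1) x (W+R-1)
```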