NVlabs / timeloop

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.
https://timeloop.csail.mit.edu/
BSD 3-Clause "New" or "Revised" License

Further optimization of arch.yaml based on the optimal mapping in timeloop-mapper.stats.txt #245

Closed xpww closed 4 months ago

xpww commented 4 months ago

Hello! I was recently studying the output files in timeloop-accelergy-exercises/workspace/exercises/01_accelergy_timeloop_2020_ispass/timeloop/06-mapper-convlayer-eyeriss, and I found that timeloop-mapper.stats.txt in ref-output reports the following optimal mapping for the scratchpads (spads) in each PE:

ifmap_spad [ Inputs:8 (8) ]
weights_spad [ Weights:32 (32) ]
--------------------------------
| for C in [0:8)

psum_spad [ Outputs:4 (4) ]
--------------------------
| for M in [0:4)
| << Compute >>

For example, eyeriss_like.yaml specifies the storage capacity of ifmap_spad as follows:

       name: ifmap_spad
       class: smartbuffer_RF
       attributes:
         depth: 12
         width: 16
         datawidth: 8
         read_bandwidth: 2
         write_bandwidth: 2

Does this mean that, given ifmap_spad [ Inputs:8 (8) ], I only need to reduce the depth of ifmap_spad from 12 to 8 and the width from 16 to 8? That is, would removing the redundant registers reduce the accelerator's resource usage? Thanks!
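
The arithmetic behind this question can be sketched as below. It rests on two assumptions that are not stated in the post: that a buffer holds depth × (width / datawidth) words, and that Inputs:8 is the number of input words the mapping keeps in ifmap_spad. The function name spad_capacity_words is hypothetical and not part of Timeloop or Accelergy.

```python
# Rough capacity check for ifmap_spad, under stated assumptions.
# ASSUMPTION: capacity in words = depth * (width // datawidth).
# spad_capacity_words is a hypothetical helper, not a Timeloop API.

def spad_capacity_words(depth: int, width: int, datawidth: int) -> int:
    """Return how many data words fit in a buffer of the given shape."""
    words_per_row = width // datawidth  # words stored in one row
    return depth * words_per_row


# Attribute values quoted from eyeriss_like.yaml in the post.
current = {"depth": 12, "width": 16, "datawidth": 8}
# Values proposed in the question (depth 12 -> 8, width 16 -> 8).
proposed = {"depth": 8, "width": 8, "datawidth": 8}

# ASSUMPTION: "ifmap_spad [ Inputs:8 (8) ]" means 8 input words are resident.
tile_words = 8

for label, cfg in (("current", current), ("proposed", proposed)):
    capacity = spad_capacity_words(**cfg)
    print(f"{label}: capacity={capacity} words, fits tile={capacity >= tile_words}")
```

Under these assumptions the current configuration holds 24 words and the proposed one holds exactly 8, so the quoted tile would still fit. This only checks the single mapping shown above; it says nothing about other layers or mappings that might need a larger tile.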