fastmachinelearning / hls4ml-tutorial

Tutorial notebooks for hls4ml
http://fastmachinelearning.org/hls4ml-tutorial/

Resource vs Latency strategy #40

Open vandenBergArthur opened 1 year ago

vandenBergArthur commented 1 year ago

Hi all, I am wondering what the difference is between the Resource and the Latency optimization strategies, because I think my grasp of these concepts is wrong.

Initially, I thought that the Latency strategy is used to obtain the lowest possible latency at the expense of resource usage. To fully parallelize the model, you need ReuseFactor = 1, and the number of parameters in a single layer can't exceed the Vivado unroll limit of 4096 (right?). But is it also possible to use a ReuseFactor (RF) greater than 1 with the Latency strategy? What would the result be?

On the other hand, I would expect that selecting the Resource strategy means using the fewest resources at the expense of latency.
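To make the reuse-factor arithmetic concrete, here is a rough sketch of how I understand it: a dense layer with n_in x n_out weights needs roughly n_in * n_out / RF multipliers, each reused RF times. The layer shapes below are made up for illustration (they are not the tutorial model), and applying the 4096 limit to the multiplier count in exactly this way is an assumption, not something verified against the tool.

```python
# Rough model of how ReuseFactor trades multipliers for latency.
# UNROLL_LIMIT is the value quoted in the question; how the tool
# applies it is an assumption here.
UNROLL_LIMIT = 4096


def multipliers_needed(n_in: int, n_out: int, reuse_factor: int) -> int:
    """Approximate number of parallel multipliers for a dense layer."""
    if reuse_factor < 1:
        raise ValueError("reuse_factor must be >= 1")
    total_mults = n_in * n_out
    # Integer ceiling division: each multiplier is reused reuse_factor times.
    return (total_mults + reuse_factor - 1) // reuse_factor


def fits_unroll_limit(n_in: int, n_out: int, reuse_factor: int,
                      limit: int = UNROLL_LIMIT) -> bool:
    """True if the unrolled multiplier count stays within the assumed limit."""
    return multipliers_needed(n_in, n_out, reuse_factor) <= limit


# Hypothetical layer shapes, chosen only to show both outcomes.
example_layers = {"small_dense": (16, 64), "large_dense": (128, 64)}

for name, (n_in, n_out) in example_layers.items():
    for rf in (1, 64):
        n_mult = multipliers_needed(n_in, n_out, rf)
        ok = fits_unroll_limit(n_in, n_out, rf)
        print(f"{name}: RF={rf} -> {n_mult} multipliers, within limit: {ok}")
```

Under this model, raising RF shrinks the multiplier count in either strategy, so the strategy choice alone does not determine how much hardware a layer needs.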

Here is the problem: in tutorial 7, a model is deployed on a PYNQ-Z2 board using the VivadoAccelerator backend. The strategy is not explicitly set, so by default it uses Latency, and RF is set to 64. As a test, I changed the strategy to Resource using the following code:

for layer in ['fc1', 'fc2', 'fc3', 'output']:
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 64
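For a like-for-like comparison, it may help to build both configurations from the same starting point so that the two builds differ only in Strategy. The sketch below uses a hand-written dictionary as a stand-in for the per-layer config; in a real run it would come from hls4ml (for example from `hls4ml.utils.config_from_keras_model(model, granularity='name')`). The helper name `with_strategy` is hypothetical and not part of hls4ml.

```python
from copy import deepcopy

LAYERS = ["fc1", "fc2", "fc3", "output"]


def with_strategy(config: dict, layers: list, strategy: str,
                  reuse_factor: int) -> dict:
    """Return a copy of config with Strategy and ReuseFactor set per layer.

    Hypothetical helper: it only edits the dictionary and does not
    call hls4ml itself.
    """
    new_config = deepcopy(config)  # leave the caller's config untouched
    for layer in layers:
        if layer not in new_config["LayerName"]:
            raise KeyError(f"layer {layer!r} not found in config['LayerName']")
        new_config["LayerName"][layer]["Strategy"] = strategy
        new_config["LayerName"][layer]["ReuseFactor"] = reuse_factor
    return new_config


# Stand-in for the real config dictionary (assumed structure).
config = {
    "Model": {"ReuseFactor": 64},
    "LayerName": {name: {} for name in LAYERS},
}

latency_config = with_strategy(config, LAYERS, "Latency", 64)
resource_config = with_strategy(config, LAYERS, "Resource", 64)

for name in LAYERS:
    print(name, latency_config["LayerName"][name],
          resource_config["LayerName"][name])
```

Setting Strategy explicitly in both configurations avoids relying on whatever the default happens to be, so any difference in the utilization reports can be attributed to the strategy alone.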

To my surprise, this uses MORE resources than the original build (which used the Latency strategy):

================================================================
== Utilization Estimates
================================================================
* Summary: 
+-----------------+---------+-------+--------+-------+-----+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  | URAM|
+-----------------+---------+-------+--------+-------+-----+
|DSP              |        -|      -|       -|      -|    -|
|Expression       |        -|      -|      40|   5483|    -|
|FIFO             |        -|      -|       -|      -|    -|
|Instance         |       16|     21|   17842|  41920|    -|
|Memory           |        -|      -|       -|      -|    -|
|Multiplexer      |        -|      -|       -|    128|    -|
|Register         |        0|      -|    2485|    352|    -|
+-----------------+---------+-------+--------+-------+-----+
|Total            |       16|     21|   20367|  47883|    0|
+-----------------+---------+-------+--------+-------+-----+
|Available        |      280|    220|  106400|  53200|    0|
+-----------------+---------+-------+--------+-------+-----+
|Utilization (%)  |        5|      9|      19|     90|    0|
+-----------------+---------+-------+--------+-------+-----+

Could someone please shed some light on this? Are my intuitions about these concepts wrong?