Hi all,
I am wondering what the difference is between the Resource and the Latency optimization strategies, because I think my grasp of these concepts is wrong.
Initially, I thought that the Latency strategy is used to obtain the lowest latency possible, at the expense of higher resource usage. To fully parallelize the model, you need ReuseFactor = 1, and the number of parameters in a single layer can't exceed the Vivado unroll limit of 4096 (right?). But is it also possible to use RF > 1 with the Latency strategy? What would be the result?
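To make that mental model concrete, here is a rough sketch of the arithmetic as I understand it. It assumes the multiplier count of a dense layer is roughly n_in * n_out / RF, and the layer shapes are my guess at the tutorial's MLP, not numbers read from a build report. `LAYERS`, `layer_estimate`, and `estimate` are names made up for this illustration.

```python
import math

# Assumed (n_in, n_out) shapes for the dense layers; a guess at the
# tutorial model, not values taken from a synthesis report.
LAYERS = {"fc1": (16, 64), "fc2": (64, 32), "fc3": (32, 32), "output": (32, 5)}
UNROLL_LIMIT = 4096  # the Vivado unroll limit mentioned above


def layer_estimate(n_in, n_out, reuse_factor):
    """Back-of-the-envelope numbers for one dense layer."""
    n_mult = n_in * n_out  # multiplications needed per inference
    return {
        "multiplications": n_mult,
        # Could the layer be fully unrolled (RF = 1) within the limit?
        "fits_fully_unrolled": n_mult <= UNROLL_LIMIT,
        # Assumption: each multiplier is reused `reuse_factor` times.
        "multipliers": math.ceil(n_mult / reuse_factor),
    }


def estimate(reuse_factor):
    """Apply the same estimate to every assumed layer."""
    return {
        name: layer_estimate(n_in, n_out, reuse_factor)
        for name, (n_in, n_out) in LAYERS.items()
    }


for rf in (1, 64):
    for name, info in estimate(rf).items():
        print(f"RF={rf:>2} {name:<6} {info}")
```

This only counts multipliers. It says nothing about the LUTs, registers, or BRAM that either strategy adds around them, which may be exactly where my intuition breaks down.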
On the other hand, I would think that selecting the Resource strategy implies using the fewest resources, at the expense of higher latency.
But here is the problem: in tutorial 7, a model is deployed on a PYNQ-Z2 board using the VivadoAccelerator backend. The strategy is not explicitly set, so it defaults to Latency, and RF is set to 64. To test this, I changed the strategy to Resource using the following code:
for layer in ['fc1', 'fc2', 'fc3', 'output']:
    config['LayerName'][layer]['Strategy'] = 'Resource'
    config['LayerName'][layer]['ReuseFactor'] = 64
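For anyone who wants to compare the two variants side by side, here is a self-contained sketch of the same override. The `base_config` dict is only a stand-in with the nested layout my snippet relies on (a top-level 'LayerName' dict keyed by layer name), not the real config object from the tutorial, and `with_strategy` is a hypothetical helper.

```python
import copy

LAYERS = ['fc1', 'fc2', 'fc3', 'output']


def with_strategy(config, strategy, reuse_factor=64, layers=LAYERS):
    """Return a copy of `config` with Strategy and ReuseFactor set per layer."""
    new_config = copy.deepcopy(config)  # leave the original config untouched
    for layer in layers:
        layer_cfg = new_config['LayerName'].setdefault(layer, {})
        layer_cfg['Strategy'] = strategy
        layer_cfg['ReuseFactor'] = reuse_factor
    return new_config


# Stand-in for the real config (assumption: per-layer granularity).
base_config = {
    'Model': {'ReuseFactor': 64, 'Strategy': 'Latency'},
    'LayerName': {name: {} for name in LAYERS},
}

latency_config = with_strategy(base_config, 'Latency')
resource_config = with_strategy(base_config, 'Resource')

print(resource_config['LayerName']['fc1'])
```

Building once from each config keeps everything except the strategy identical, so any difference in the utilization reports should come from the strategy alone.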
To my surprise, the Resource build uses MORE resources than the original build (which used the Latency strategy).
Could someone please shed some light on this? Is my intuition about these concepts wrong?