tensil-ai / tensil

Open source machine learning accelerators
https://www.tensil.ai

Fix two cycle read/write bug #50

Closed tdb-alcorn closed 2 years ago

tdb-alcorn commented 2 years ago

This change adds transparent queues to all the control queues in the accumulator modules to unblock control flow. When a control queue needs to address more than one subordinate before proceeding, we use a multi enqueue to handle this transaction. This works well when all the subordinates are engaged at the same stage in the data pipeline, but when one subordinate operates at a later stage, the earlier subordinates can end up waiting for it unnecessarily. By adding tiny transparent queues to all the inputs to a multi enqueue, we cut this dependency because the later subordinate's control flow can simply buffer up. In the future, when we add a multi enqueue that addresses subordinates that might be separated by up to N cycles in the pipeline, we should add a transparent queue of N elements to each control input.
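
To make the stall concrete, here is a toy cycle-level model in Python. It is only a sketch of the reasoning above, not the Chisel code in this change. The function name `cycles_to_finish` and its parameters are hypothetical. It assumes the early subordinate A is always ready, the later subordinate B can take the control word for a command only `stage_gap` cycles after A took it, and a queue slot freed by a pop can be refilled in the same cycle.

```python
from collections import deque


def cycles_to_finish(num_commands, stage_gap, queue_depth):
    """Cycles until subordinate B has accepted every command.

    queue_depth == 0 models a bare multi enqueue: the control source cannot
    offer the next command until both A and B have taken the current one.
    queue_depth > 0 models a small queue in front of B's control input.
    """
    a_accept = []        # a_accept[k] = cycle in which A accepted command k
    b_queue = deque()    # control words buffered in front of B
    next_cmd = 0         # next command the control source will offer
    a_done = False       # bare case only: A has already taken next_cmd
    b_accepted = 0
    cycle = 0
    while b_accepted < num_commands:
        if queue_depth > 0:
            # B pops the head once A's data for that command has arrived.
            if b_queue and cycle >= a_accept[b_queue[0]] + stage_gap:
                b_queue.popleft()
                b_accepted += 1
            # The multi enqueue fires as soon as B's queue has room.
            if next_cmd < num_commands and len(b_queue) < queue_depth:
                a_accept.append(cycle)
                b_queue.append(next_cmd)
                next_cmd += 1
        else:
            # A takes the command on first offer.
            if not a_done:
                a_accept.append(cycle)
                a_done = True
            # B takes it stage_gap cycles later; only then does the source advance.
            if a_done and cycle >= a_accept[next_cmd] + stage_gap:
                b_accepted += 1
                next_cmd += 1
                a_done = False
        cycle += 1
    return cycle


print(cycles_to_finish(8, stage_gap=1, queue_depth=0))  # 16: two cycles per command
print(cycles_to_finish(8, stage_gap=1, queue_depth=1))  # 9: one command per cycle
```

Under these assumptions, a bare multi enqueue with a one-cycle gap spends two cycles per command, while a one-element queue lets A accept a new command every cycle. The same model with `stage_gap=N` and `queue_depth=N` also reaches one command per cycle, in line with the N-element suggestion above.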

This change also adds a helpful command to the readme and tidies up a few messy comments.

petrohi commented 2 years ago

On FPGA:

| | ResNet20 CIFAR ONNX | YOLOv4 Tiny @416 ONNX | ResNet50 ImageNet ONNX |
|---|---|---|---|
| A. ZCU104 baseline (ms) | 13.099 | 294.071 | 514.279 |
| B. A and external AXI4S Width Converter instead of Transmission | 11.168 | 262.804 | 473.956 |
| C. B and two-cycle read/write fix | 9.661 | 214.892 | 415.953 |
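
For a quick read of the relative gains, the short Python sketch below computes the percentage latency reduction of configuration C against B and against A. The numbers are copied from the table above; the helper name `reduction_pct` is just for illustration.

```python
# Latencies in ms, copied from the table above.
latency_ms = {
    "ResNet20 CIFAR": {"A": 13.099, "B": 11.168, "C": 9.661},
    "YOLOv4 Tiny @416": {"A": 294.071, "B": 262.804, "C": 214.892},
    "ResNet50 ImageNet": {"A": 514.279, "B": 473.956, "C": 415.953},
}


def reduction_pct(before, after):
    """Percentage by which latency dropped going from `before` to `after`."""
    return 100.0 * (before - after) / before


for model, t in latency_ms.items():
    vs_b = reduction_pct(t["B"], t["C"])  # effect of the two-cycle fix alone
    vs_a = reduction_pct(t["A"], t["C"])  # cumulative effect vs. baseline
    print(f"{model}: C vs B {vs_b:.1f}%, C vs A {vs_a:.1f}%")
```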