bu-icsg / dana

Dynamically Allocated Neural Network Accelerator for the RISC-V Rocket Microprocessor in Chisel

At run-time, selectively ignore "small" operations #36

Open seldridge opened 7 years ago

seldridge commented 7 years ago

This relates to some of my early thoughts on the subject, specifically around gradient descent (if a derivative is going to evaluate to zero, ignore it), but both feedforward and learning transactions could potentially benefit from selectively skipping operations. According to Brandon Reagen and the Minerva work at ISCA 2016, power could be improved by at most 2x, though that figure includes both static pruning and run-time pruning, and it is unclear how much each contributes. This would be an interesting avenue and is generally low cost to implement.
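
To make the idea concrete, here is a minimal Chisel sketch of what operand-based skipping could look like in a single multiply-accumulate element. This is not DANA's actual PE datapath; the module name, port names, and the programmable threshold input are hypothetical. The point is only that the run-time decision amounts to a small comparator plus an enable on the accumulator register.

```scala
import chisel3._
import chisel3.util._

// Hypothetical multiply-accumulate element that skips "small" operations
// at run time: when the incoming operand's magnitude falls below a
// programmable threshold, the multiply is not performed and the
// accumulator holds its value.
class SkipSmallMac(width: Int) extends Module {
  val io = IO(new Bundle {
    val in     = Flipped(Valid(SInt(width.W))) // incoming activation or delta
    val weight = Input(SInt(width.W))
    val thresh = Input(UInt(width.W))          // run-time cutoff for "small"
    val acc    = Output(SInt((2 * width).W))
  })

  val accReg = RegInit(0.S((2 * width).W))

  // Treat |in| < thresh as negligible and gate the accumulator update.
  val skip = io.in.bits.abs.asUInt < io.thresh

  when(io.in.valid && !skip) {
    accReg := accReg + (io.in.bits * io.weight)
  }

  io.acc := accReg
}
```

Setting the threshold to zero recovers the existing behavior, so the feature could default to off; the same comparison could also gate the derivative path during learning transactions.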

In broad strokes, without dramatic modifications to DANA: