def FIR_Filter(args: Array[String]) {
  val input = StreamIn[Int](target.In)
  val output = StreamOut[Int](target.Out)
  val weights = DRAM[Int](32)
  val width = ArgIn[Int]
  val P = 16 (1,1,32)
  // Initialize width with the first console argument
  setArg(width, min(32, args(0).to[Int]))
  // Transfer weights from the host to accelerator
  sendArray(weights, loadData[Int]("weights.csv"))
  Accel(*) {
    val wts = RegFile[Int](32)
    val ins = RegFile[Int](32)
    val sum = Reg[Int]
    // Load weights from DRAM into local registers
    wts load weights(0::width)
    // Stream continuously
    Stream(*) {
      // Shift in the most recent input
      ins <<= input
      // Create a reduce-accumulate tree with P inputs
      Reduce(sum)(0 until width par P){i =>
        // Multiply corresponding weight and input
        wts(i) * ins(i)
      }{(a,b) => a + b }
      // Assign the result of computing the average
      // to the output stream
      output := sum / width
    }
  }
}
To make this actually correct, the body needs a Pipe wrapped just inside the Stream(*), because otherwise the Reduce and output := pipes run all the time, unconditionally. The alternative is stream-aware memory structures for this kind of pattern, such as shift acknowledgements and register staleness.
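The behavior the note is asking for can be sketched as a plain-Scala software model. This is not Spatial code and does not reproduce hardware scheduling; it only illustrates the intended semantics under the assumption that each input sample triggers exactly one shift, one reduction recomputed from zero, and one output, in that order. The names FirModel, ShiftReg, and run are hypothetical and exist only for this illustration.

```scala
object FirModel {
  // Software stand-in for the 32-entry shift register `ins` (hypothetical helper).
  final class ShiftReg(size: Int) {
    private val regs = Array.fill(size)(0)

    // Move every entry one slot toward higher indices; the newest sample lands at index 0.
    def shiftIn(x: Int): Unit = {
      for (i <- size - 1 until 0 by -1) regs(i) = regs(i - 1)
      regs(0) = x
    }

    def apply(i: Int): Int = regs(i)
  }

  // Assumed semantics: for each sample, shift, then reduce, then emit, exactly once.
  def run(weights: Seq[Int], width: Int, samples: Seq[Int]): Seq[Int] = {
    require(width > 0 && width <= 32 && width <= weights.length)
    val ins = new ShiftReg(32)
    samples.toVector.map { x =>
      ins.shiftIn(x)                                       // ins <<= input
      val sum = (0 until width).map(i => weights(i) * ins(i)).sum // Reduce(sum)(...)
      sum / width                                          // output := sum / width
    }
  }
}
```

For example, FirModel.run(Seq(1, 1, 1, 1), 4, Seq(4, 8, 12, 16)) yields 1, 3, 6, 10: one output per input, each computed from the window as it stood right after that input was shifted in. A free-running Reduce or output stage would have no such one-to-one pairing, which is the problem the note describes.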