Average Multiple Frames

It would be nice to average the features from multiple frames. In the most basic case, this would entail:

1. Store the outputs from the last N model queries in a matrix (each row is a frame, and the columns are the class outputs).
   1.1 It is assumed the model outputs the class probabilities.
   1.2 This should be implemented as a "ring buffer" where we overwrite the oldest row.
2. Compute the average output by taking the average of each column.
   2.1 This should be an efficient operation for a DSP library.
3. Use the average output to do taxonomic predictions.
   3.1 The prediction code should be agnostic to whether it is receiving a direct output from a model or an averaged vector.

Some pseudo code for the different steps:


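1.

var buffer_pointers : [UnsafeMutablePointer<Double>] = []

// Assumes the MLMultiArray stores Doubles.
let features = classifications[task].featureValue.multiArrayValue!
let num_features = features.count
let buffer_pointer = buffer_pointers[task]

// Overwrite the oldest row with the newest output.
buffer_pointer.advanced(by: self.buffer_write_index * num_features).assign(from: UnsafePointer(OpaquePointer(features.dataPointer)), count: num_features)

// Once every task has been written, advance the write index and wrap around.
self.buffer_write_index = (self.buffer_write_index + 1) % buffer_size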

2.

let num_features = classifications[task].featureValue.multiArrayValue!.count
let buffer_pointer = buffer_pointers[task]

// Rows are frames, so consecutive values of one class are num_features apart.
let stride : vDSP_Stride = num_features
let avg = UnsafeMutablePointer<Double>.allocate(capacity: num_features) // must be deallocated after use
let length : vDSP_Length = UInt(buffer_size)

// Mean of each column (class) across all buffered frames.
for i in 0..<num_features {
    vDSP_meanvD(buffer_pointer.advanced(by: i), stride, avg.advanced(by: i), length)
}

3.

self.drawVisionRequestResults2(avg)
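The three steps can also be put together in one small, self-contained type. The following is only a sketch under some assumptions: `FrameAverager`, `capacity` and `numClasses` are hypothetical names, the outputs are plain `[Double]` arrays rather than `MLMultiArray`s, and the column mean is a plain loop that `vDSP_meanvD` could replace on Apple platforms. Unlike the pseudo code above, which always averages `buffer_size` rows, it divides only by the number of frames written so far.

```swift
/// Hypothetical helper: a ring buffer of model outputs with a column-wise mean.
struct FrameAverager {
    let capacity: Int               // N, the number of frames to keep
    let numClasses: Int             // length of one model output
    private var storage: [Double]   // row-major: capacity rows x numClasses columns
    private var writeIndex = 0      // row that the next frame overwrites
    private(set) var filled = 0     // rows that hold real data

    init(capacity: Int, numClasses: Int) {
        precondition(capacity > 0 && numClasses > 0)
        self.capacity = capacity
        self.numClasses = numClasses
        self.storage = [Double](repeating: 0, count: capacity * numClasses)
    }

    /// Step 1: overwrite the oldest row with the newest output.
    mutating func add(_ output: [Double]) {
        precondition(output.count == numClasses)
        let start = writeIndex * numClasses
        storage.replaceSubrange(start..<(start + numClasses), with: output)
        writeIndex = (writeIndex + 1) % capacity
        filled = min(filled + 1, capacity)
    }

    /// Step 2: mean of each column over the rows written so far (nil while empty).
    func average() -> [Double]? {
        guard filled > 0 else { return nil }
        var avg = [Double](repeating: 0, count: numClasses)
        for row in 0..<filled {
            for col in 0..<numClasses {
                avg[col] += storage[row * numClasses + col]
            }
        }
        return avg.map { $0 / Double(filled) }
    }
}

// Usage: with capacity 2, the third frame overwrites the first.
var averager = FrameAverager(capacity: 2, numClasses: 2)
averager.add([0.25, 0.75])
averager.add([0.5, 0.5])
averager.add([0.75, 0.25])
// Step 3: the averaged vector has the same shape as a single model output,
// so prediction code can consume either one unchanged.
print(averager.average() ?? [])
```

Because `average()` returns a vector of the same length as a single output, this would keep the prediction code agnostic to where its input comes from (3.1).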



More sophisticated methods can be conceived in which we keep several ring buffers, each responsible for a different time length (~1 sec, ~3 sec, ~5 sec), and compare the results between them.
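One possible shape for the multi-window idea is sketched below. It is an assumption-laden simplification, not a proposal for the final design: instead of separate ring buffers it keeps a single history sized for the longest window and averages the most recent frames for each window, and it measures windows in frames (the ~1, ~3 and ~5 seconds multiplied by whatever the frame rate turns out to be). `MultiWindowAverager` and its methods are hypothetical names.

```swift
/// Hypothetical helper: averages over several window lengths, measured in frames.
struct MultiWindowAverager {
    let windows: [Int]                    // window lengths in frames, shortest first
    private var history: [[Double]] = []  // most recent outputs, oldest first

    init(windows: [Int]) {
        precondition(!windows.isEmpty && windows.allSatisfy { $0 > 0 })
        self.windows = windows.sorted()
    }

    /// Keeps only as many frames as the longest window needs.
    /// Assumes every output has the same length.
    mutating func add(_ output: [Double]) {
        history.append(output)
        if history.count > windows.last! { history.removeFirst() }
    }

    /// Mean of the most recent `frames` outputs (fewer if the history is shorter).
    func average(last frames: Int) -> [Double]? {
        let recent = history.suffix(frames)
        guard let first = recent.first else { return nil }
        var sum = [Double](repeating: 0, count: first.count)
        for output in recent {
            for (i, value) in output.enumerated() { sum[i] += value }
        }
        return sum.map { $0 / Double(recent.count) }
    }

    /// Index of the highest-scoring class for each window, shortest window first.
    func topClassPerWindow() -> [Int] {
        return windows.compactMap { window in
            average(last: window).flatMap { avg in
                avg.indices.max(by: { avg[$0] < avg[$1] })
            }
        }
    }
}

// Usage: the short window follows the newest frame, the long one lags behind.
var multiWindow = MultiWindowAverager(windows: [1, 3])
multiWindow.add([1.0, 0.0])
multiWindow.add([1.0, 0.0])
multiWindow.add([0.25, 0.75])
print(multiWindow.topClassPerWindow())
```

Disagreement between the windows, as in the usage example, could be one signal that the subject in front of the camera has changed.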

Perhaps an app should be able to request that these buffers be cleared, for example when a user makes rapid movements (perhaps signifying they are done recognizing something or are trying to recognize something different).
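A minimal sketch of how clearing might look, assuming the app can obtain some angular-velocity sample for the device (for instance from the gyroscope). `ClearableFrameBuffer`, `isRapidMovement` and the 2.0 rad/s threshold are all hypothetical and would need tuning against real usage.

```swift
/// Hypothetical buffer of recent model outputs that the app can clear on request.
struct ClearableFrameBuffer {
    let capacity: Int
    private var frames: [[Double]] = []   // oldest first

    init(capacity: Int) {
        precondition(capacity > 0)
        self.capacity = capacity
    }

    var count: Int { frames.count }

    /// Drops the oldest frame once the buffer is full.
    mutating func add(_ output: [Double]) {
        if frames.count == capacity { frames.removeFirst() }
        frames.append(output)
    }

    /// Forgets all buffered frames so stale outputs do not affect new averages.
    mutating func clear() {
        frames.removeAll(keepingCapacity: true)
    }
}

/// Hypothetical motion check: magnitude of an angular-velocity sample (rad/s)
/// compared against an assumed threshold.
func isRapidMovement(x: Double, y: Double, z: Double, threshold: Double = 2.0) -> Bool {
    return (x * x + y * y + z * z).squareRoot() > threshold
}

// Usage: clear the buffer when the device is swung quickly.
var frameBuffer = ClearableFrameBuffer(capacity: 10)
frameBuffer.add([0.9, 0.1])
if isRapidMovement(x: 0.1, y: 3.0, z: 0.2) {
    frameBuffer.clear()
}
```

Keeping the motion check outside the buffer means the app decides when to clear, and the buffer itself stays unaware of sensors.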

inaturalist / SeekReactNative

Average Multiple Frames #615