nbertagnolli closed this issue 6 years ago
Let's look into what the speedup would actually be. I just ran a test: generate 1 billion random doubles, then average them. On my machine, MATLAB, nifty, and NumPy took 12, 13, and 14 seconds to generate the numbers and 0.6, 1.3, and 1.1 seconds to average them, respectively (the average in nifty was just a dumb loop). So the loop is about 2x slower than MATLAB at averaging, but that step is small next to the time spent generating the data. There's clearly room for a speedup, but on an absolute scale it might not be big. Also, if we did go with Accelerate, what would we do on Linux? Would it stay unoptimized, or is it worth finding a more optimized implementation?
Just for kicks, I wondered how using cblas would work. It doesn't directly support summing a vector, but it has a dot product, so computing the sum as the dot product of the vector with a vector of ones seemed worth trying. Surprisingly, this approach takes only a little longer than the dumb loop (I guessed it would be way longer).
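For reference, here is a rough sketch in C of the two approaches compared above: the plain loop and the dot product with a vector of ones. The names `mean_loop`, `mean_dot`, and `dot` are made up for illustration and are not from nifty. `dot` is a hand-written stand-in so the snippet compiles without a BLAS install. With CBLAS linked, the equivalent call would be `cblas_ddot(n, x, 1, ones, 1)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cblas_ddot with unit strides, so this sketch
   builds without BLAS. With CBLAS: cblas_ddot(n, x, 1, y, 1). */
static double dot(size_t n, const double *x, const double *y) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i] * y[i];
    }
    return acc;
}

/* Baseline: the "dumb loop" that accumulates a running sum. */
double mean_loop(const double *x, size_t n) {
    assert(n > 0); /* avoid dividing by zero */
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    return sum / (double)n;
}

/* Sum expressed as a dot product with a caller-supplied vector of ones. */
double mean_dot(const double *x, const double *ones, size_t n) {
    assert(n > 0);
    return dot(n, x, ones) / (double)n;
}
```

Note that the dot-product version needs a second length-`n` buffer of ones and does a multiply per element, which may be part of why it does not beat the loop.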
Migrated issue to Pivotal Tracker.
Apple's Accelerate framework has a mean calculation for vectors (`vDSP_meanv`). We should use it for a potential speedup. https://developer.apple.com/reference/accelerate/1449980-vdsp_meanv?language=objc
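A minimal sketch of what this could look like in C, assuming we pick the implementation at compile time and fall back to a plain loop where Accelerate isn't available (e.g. Linux). The wrapper name `mean_of` is hypothetical. `vDSP_meanv` is the single-precision routine. For doubles the counterpart is `vDSP_meanvD`, which is what the sketch uses. On macOS this needs to be linked with `-framework Accelerate`.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __APPLE__
#include <Accelerate/Accelerate.h>
#endif

/* Hypothetical wrapper: mean of n doubles stored contiguously in x. */
double mean_of(const double *x, size_t n) {
    assert(n > 0); /* avoid dividing by zero */
#ifdef __APPLE__
    /* Accelerate path: stride 1, result written through the pointer. */
    double result = 0.0;
    vDSP_meanvD(x, 1, &result, (vDSP_Length)n);
    return result;
#else
    /* Portable fallback: unoptimized accumulation loop. */
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    return sum / (double)n;
#endif
}
```

Keeping the platform check inside one function means callers don't change if the fallback is later swapped for something faster.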