... see #2406 in the sklearn FactorAnalysis code
@agramfort I can take a stab at this, if you don't beat me to it today ;-)
no chance I'll do it before next week. Maybe you can leave it to someone more junior :)
fine with me, I wasn't craving this task btw ;-)
any volunteer?
@mainakjas @rgoj ?
haha, I was just waiting to be pinged on this thread, I knew that was coming ;)
I'd like to focus on the time-frequency beamforming for now, but I could actually use this as a bit of a break next week (to keep contributing regularly, but take a break from the hardcore stuff). So yeah, I can do it around Tuesday or Wednesday next week, unless somebody gets round to it earlier. If nobody does by then, I'll post here when I start. So, yes, I volunteer :)
no rush
thanks for volunteering @rgoj
Interesting. For n=1e6 it is about 5x faster on my laptop :).
Is it the same speed for shorter arrays? At one point I considered using it but I thought I found it to be slower. Maybe I'm crazy though.
For me it's also faster for short arrays. E.g., for n=10 it is 1.7 us vs 9 us.
I've tested it when refactoring the sklearn FactorAnalysis code; it's ridiculously fast.
Hmmm... the downside is that if `X` is `F_CONTIGUOUS`, it actually isn't faster, because `ravel()` will reorder the array. Also, using `X.flat` is about 5x slower than using `X.ravel()` for me. To make it faster all the time, we would have to use

```python
np.dot(X.ravel(order='F' if np.isfortran(X) else 'C'), X.ravel(order='F' if np.isfortran(X) else 'C'))
```

of course this could be written in two lines ;)
or put it in a compute_norm function or something in utils
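A rough sketch of what such a helper could look like (the name `squared_norm` and where it would live are just placeholders, nothing decided here):

```python
import numpy as np


def squared_norm(X):
    """Squared Euclidean (Frobenius) norm of an array, without a copy.

    Raveling in the array's own memory order means neither C- nor
    Fortran-contiguous inputs get copied by ravel().
    """
    flat = X.ravel(order='F' if np.isfortran(X) else 'C')
    return np.dot(flat, flat)
```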
To clarify:
```python
In [50]: a = np.random.randn(100, 100)
In [51]: X = a
In [52]: %timeit np.sum(X ** 2)
10000 loops, best of 3: 27.7 us per loop
In [53]: %timeit np.dot(X.flat, X.flat)
100000 loops, best of 3: 13.3 us per loop
In [54]: %timeit np.dot(X.ravel(), X.ravel())
100000 loops, best of 3: 5.6 us per loop
In [55]: %timeit np.dot(X.ravel(order='F' if np.isfortran(X) else 'C'), X.ravel(order='F' if np.isfortran(X) else 'C'))
100000 loops, best of 3: 7.65 us per loop
In [56]: X = a.T  # fortran order
In [57]: %timeit np.sum(X ** 2)
10000 loops, best of 3: 27.8 us per loop
In [58]: %timeit np.dot(X.flat, X.flat)
10000 loops, best of 3: 92.7 us per loop
In [59]: %timeit np.dot(X.ravel(), X.ravel())
10000 loops, best of 3: 26.4 us per loop
In [60]: %timeit np.dot(X.ravel(order='F' if np.isfortran(X) else 'C'), X.ravel(order='F' if np.isfortran(X) else 'C'))
100000 loops, best of 3: 7.74 us per loop
```
Sounds like the last variant should be put in a compute-Euclidean-norm function then... that we could use everywhere in one line.
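For example, with the hypothetical `squared_norm` helper sketched above, call sites would shrink to one line each:

```python
# instead of np.sum(X ** 2), which allocates a temporary array for X ** 2:
total = squared_norm(X)

# the same trick covers np.mean(X ** 2):
mse = squared_norm(X) / X.size
```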
+1 for @dengemann's wrapping idea; that way, if we discover other "quirks" we can compensate for them as well
can you also bench against `linalg.norm(a, 'fro') ** 2`?
@agramfort it is terribly slow:
```python
In [68]: X = a
In [69]: %timeit linalg.norm(X, 'fro') ** 2
10000 loops, best of 3: 81.5 us per loop
In [70]: X = a.T
In [71]: %timeit linalg.norm(X, 'fro') ** 2
10000 loops, best of 3: 93.1 us per loop
```
Thanks for checking
this sounds great! Sure, I could have a stab. I'm not sure of my schedule, though, so let me know @rgoj if you start, so that we're not working on the same stuff at the same time :)
@agramfort suggested this as a non-senior task ;-)
ok, I'm on it. h-o-o-o-o-o-o-l-d on ...
`sum(X ** 2)` is equivalent to `np.dot(X.ravel(), X.ravel())`, which avoids a copy.
same trick works for `mean(X ** 2)`
would be nice to go over the full code base to spot these low-hanging fruits.
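A minimal sanity check of that equivalence (assuming a C-contiguous array, so `ravel()` returns a view rather than a copy):

```python
import numpy as np

X = np.random.randn(100, 100)

# sum of squares without materializing the temporary X ** 2
assert np.allclose(np.sum(X ** 2), np.dot(X.ravel(), X.ravel()))

# same trick for the mean: divide the dot product by the element count
assert np.allclose(np.mean(X ** 2), np.dot(X.ravel(), X.ravel()) / X.size)
```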