jeanluct / braidlab

Matlab package for analyzing data using braids
GNU General Public License v3.0
23 stars 9 forks source link

Speed up loop by using array of coords #71

Closed jeanluct closed 9 years ago

jeanluct commented 9 years ago

This arose from issue #68.

After revisiting some old tests, I think multithreaded support is premature. The reason, which is clear after running loop_speedtest.m in devel/tests/loop_speedtest (simple MEX compilation required), is that the huge slowdown right now is not limited by the C/C++ function but by Matlab. Here's the output:

#!matlab

>> loop_speedtest
Number of loops = 531441

action of braid on loops (vectors)...  (0.113561 seconds)

converting to array of loops...  (5.603251 seconds)

action of braid on array of loops (classes)...  (6.167722 seconds)

verify equality...  (1.500239 seconds)

Slowdown = 103.653305

The difference is that the first case is just a set of loops represented by a plain array (no loop class). The C function acts on this directly.

Then this is converted to loop objects, which takes a lot of time. The action on loops takes even more time.

So we need to solve this problem first. There's a few options:

I'd like to try the last option first. Let's see if it's feasible...

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-08 23:49:30+00:00

Ok, some progress being made. As of 1672c2d, the loop.coords can be a matrix. We can access a single row (a single loop) using index notation:

#!matlab

ll = loop ([ -1 1 -2 0; 1 -2 3 4])

ll = 

   (( -1  1 -2  0 )) 
   ((  1 -2  3  4 ))

>> ll(2)

ans = 

   (( 1 -2  3  4 ))

This required overloading () via subsref and subsasgn.

So far so good. I guess the idea would be to completely forbid the creation of arrays of loops? Now they can't be created with l(2,3)=loop since the overloaded () rejects this. If I rewrite the zeros functions that will take out another way of creating them. The only thing left will be the awkward concatentation:

#!matlab

>> l = loop;
>> l2 = [l;l]

l2 = 

   (( 0 -1 ))

   (( 0 -1 ))

That's kind of annoying but if we make it such that the loop methods don't work on such things, then there will be no reason to do it.

Should streamline printing of loops as well.

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 10:53:33+00:00

If more convincing is needed that this was worth doing, here's the updated output of the speed test above:

#!matlab

>> loop_speedtest
Number of loops = 531441

Action of braid on loops (vectors)...  (0.110454 seconds)

Converting to array of loops...  (0.000700 seconds)

Action of braid on array of loops (classes)...  (0.164560 seconds)

Verify equality...  (0.036592 seconds)

Slowdown = 1.496188

Also importantly, Matlab no longer starts swapping like crazy when lots of loops are involved (i.e., millions). Possibly that was happening because of the overhead of having individual loop objects? Goes to show how inefficient Matlab is in its OOP facilities...

(This test is with 90f7919, which hasn't been pushed yet.)

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 10:58:35+00:00

Still a few issues remaining. In test_loop_alloc I have the line l4(N) = loop(n) to allocate N loops, but it doesn't work unless I do this:

#!matlab

l4 = loop(n); % Have to do this first.  Is this a bug?
l4(N) = loop(n);

Not sure why we need to create the loop first. I think I know how to change this, but I'm not sure yet I want to...

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 11:45:48+00:00

Disallow action of braid on nonscalar loop array. See issue #71.

→ <>

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 11:45:48+00:00

Get rid of zeros functions in loop. Discourage use of arrays of loops. See issue #71.

→ <>

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 13:39:12+00:00

The problem raised above is that:

#!matlab

>> l(2) = loop(3)

l = 

    coords: [2x4 double]

returns a struct with a coords field, not a loop!

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 15:05:17+00:00

loop subscripting allows assignment and returns proper object. See issue #71.

→ <<cset 77209a975727>>

jeanluct commented 9 years ago

From Marko Budisic on 2014-10-09 15:07:44+00:00

I'm coding my minlength to support loop coordinates as a matrix of NLoops x NCoordinates. This is not a problem, but C++ would prefer to have to deal with NCoordinates x NLoops calculation, as Matlab stores columns contiguously.

This might be too large of a thing to deal with right now, so I'm working with NLoops x NCoordinates. If anything changes, please let me know.

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-09 15:23:36+00:00

Sounds good. I definitely want to leave it as Nloops x Ncoords. It would be a real pain to change.

jeanluct commented 9 years ago

From Marko Budisic on 2014-10-09 15:41:57+00:00

OK, it's done. A bit cludgy - before passing the coords matrix to minlength, loop.m transposes it. This doesn't seem to be a huge hit on performance, but it could be a huge hit on memory. E.g., 200,000 loops with 2000 strands caused an Out of Memory error on my end. I'd prefer not to have to re-code minlength on the C++ level to allow for matrix-based coordinates if at all possible, but if Michael runs into Out of Memory issues here, let me know, and I'll do it.

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-10 01:37:57+00:00

That's a lot of big loops... we should be ok.

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-10 22:52:26+00:00

Ok, as of 3afc012 I'm pretty happy with the new loop subscript assignment. In particular, Michael's code modified_duffing_tutorial runs ok. From now on defects will probably be treated as new bugs.

jeanluct commented 9 years ago

From Jean-Luc Thiffeault on 2014-10-16 16:57:21+00:00

reducing is broken. See test in devel.

jeanluct commented 9 years ago

From Marko Budisic on 2014-10-21 17:57:12+00:00

Forming loop arrays by concatenation is now disallowed. See: rev 946ad03

jeanluct commented 9 years ago

From Marko Budisic on 2014-10-22 20:50:20+00:00

Vertcat of loops is now implemented. Horizontal concatenation and general cat are still not allowed. See rev 020bf9b

jeanluct commented 9 years ago

I think this can be closed.