Closed chrisdembia closed 5 years ago
Hm, an interesting problem. Unsure how exactly to get that trade-off. Any thoughts on making a few special cases not depend on the slow pow() method? For instance, allow l1 or l2 penalties to not use pow()? Otherwise use pow()?
Yes I'd be fine with that. I guess mostly I just don't want a performance penalty for l2.
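For illustration, the special-casing discussed above could look roughly like the sketch below. The names absPow and controlCost are hypothetical and are not taken from the branch, and the integer exponent type is an assumption (the thread does not say what type m_exponent has).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Moco code base): |x|^exponent with
// fast paths so that l1 and l2 penalties never call pow().
inline double absPow(double x, int exponent) {
    switch (exponent) {
    case 1: return std::abs(x);
    case 2: return x * x;  // equals |x|^2, so abs() is unnecessary
    default: return std::pow(std::abs(x), exponent);  // generic fallback
    }
}

// Weighted sum of |control|^exponent over all controls.
// Assumes one weight per control.
inline double controlCost(const std::vector<double>& weights,
                          const std::vector<double>& controls,
                          int exponent) {
    assert(weights.size() == controls.size());
    double cost = 0.0;
    for (std::size_t i = 0; i < controls.size(); ++i) {
        cost += weights[i] * absPow(controls[i], exponent);
    }
    return cost;
}
```

Whether the branch on the exponent is cheaper than pow() in practice would still need to be measured on the real problem.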
@chrisdembia Should I just work directly with the branch in progress?
Yes, that'd be great.
Tested some cases to try to narrow down where the time lost might have gone. Used the same test as you did (commenting out second test in exampleMocoTrack).
Using MocoControlGoal_divide_by_displacement branch:
m_weights[iweight] * pow(abs(controls[icontrol]), m_exponent);
291 +/- 10.49 seconds
m_weights[iweight] * pow(abs(controls[icontrol]), 2);
289.4 +/- 6.62 seconds
m_weights[iweight] * pow(controls[icontrol], 2);
292 +/- 12.49 seconds
m_weights[iweight] * controls[icontrol] * controls[icontrol];
274.4 +/- 2.61 seconds
Using master branch:
270.8 +/- 9.73 seconds
This seems to indicate that special-casing an exponent of exactly 2 with an explicit multiplication might speed things up, but I definitely didn't see quite the performance change that you did (<10% in my case, whereas yours was closer to 25%).
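For anyone repeating this comparison, here is a minimal sketch of how repeated wall-clock timings could be summarized and compared. The thread does not say how the "+/-" figures were computed or how many runs were used; the sample standard deviation below is an assumption, and summarize and relativeChange are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct TimingSummary {
    double mean;
    double stddev;  // sample standard deviation (n - 1 denominator)
};

// Summarize repeated runtimes (in seconds) of the same test.
inline TimingSummary summarize(const std::vector<double>& seconds) {
    assert(seconds.size() >= 2);  // n - 1 denominator needs two samples
    double sum = 0.0;
    for (double s : seconds) sum += s;
    const double mean = sum / seconds.size();
    double squares = 0.0;
    for (double s : seconds) squares += (s - mean) * (s - mean);
    return {mean, std::sqrt(squares / (seconds.size() - 1))};
}

// Fractional change of candidate relative to baseline (0.25 means 25% slower).
inline double relativeChange(double baseline, double candidate) {
    return (candidate - baseline) / baseline;
}
```

With the means reported above, relativeChange(270.8, 291.0) is roughly 0.075, which matches the "<10%" observation.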
That's quite thorough. Hmm, weird. Maybe I was doing something wrong.
Did some testing on a more recent processor (but still Windows 10, using Visual Studio 2017). Seems to show even smaller differences in branches:
master branch:
165.4 +/- 1.52 sec
MocoControlGoal_divide_by_displacement branch:
168.2 +/- 4.76 sec
MocoControlGoal_divide_by_displacement branch with explicit multiply (instead of pow):
167.4 +/- 2.70 sec
On a separate note, one reason for different testing times between computers (aside from CPU resources) could be small differences in how the problem is solved. For instance, the first computer above converged on iteration 259, while the second computer converged after 389 iterations.
Darn. Well, I guess this is a good thing?
Interesting. More iterations but shorter runtime.
I found the issue. On my Mac, pow(x, 2) gives a different result than x * x, causing the optimizer to take more iterations for some problems, and fewer in others. This is likely a platform-dependent issue.
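A quick way to check whether a given platform shows this discrepancy is sketched below. The function names are hypothetical, and the outcome depends on the compiler, optimization flags, and math library: some compilers rewrite pow(x, 2) into x * x when optimizing, in which case no mismatch would be observed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Counts inputs for which pow(x, 2) and x * x are not exactly equal
// as doubles on the current platform.
inline std::size_t countSquareMismatches(const std::vector<double>& xs) {
    std::size_t mismatches = 0;
    for (double x : xs) {
        assert(std::isfinite(x));  // this sketch only considers finite inputs
        const double viaPow = std::pow(x, 2.0);
        const double viaMul = x * x;
        if (viaPow != viaMul) ++mismatches;
    }
    return mismatches;
}

// Largest relative difference between the two ways of squaring.
inline double maxRelativeSquareDifference(const std::vector<double>& xs) {
    double worst = 0.0;
    for (double x : xs) {
        const double viaMul = x * x;
        if (viaMul == 0.0) continue;  // skip zero to avoid dividing by zero
        const double rel = std::abs(std::pow(x, 2.0) - viaMul) / viaMul;
        if (rel > worst) worst = rel;
    }
    return worst;
}
```

Even a difference in the last bit can send an iterative optimizer down a different path, which would be consistent with the changed iteration counts described above.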
Wow that's annoying. Sounds like x * x should be a separate branched case then, given how often it's used?
Yes, it is. Yes, I made that change on the branch.
The branch MocoControlGoal_divide_by_displacement allows users to set the exponent on the controls, and also allows users to divide by the model's displacement over the phase.
These changes are causing the first problem in exampleMocoTrack to increase its runtime from 89 seconds to 120 seconds. This is too large of an increase!
I suspect the issue is using pow() and allowing the exponent to be set at runtime.
We should find a way to have a generic exponent without the worse performance.
@carmichaelong, are you interested?