Closed wilsonkl closed 5 years ago
Hi Chris, Thanks for the comments! Let's talk profiling and efficiency first, and then I'll make changes once we know where we're going.
I don't have profiling broken down by individual changes, but I ran a stack of benchmarks before and after all the changes. In every experiment I saw improved accuracy and faster runtimes. Here are some plots.
Accuracy: I ran three solvers on a large Gnm random graph and plotted the cumulative histogram. Here's before the patch:
And after:
Note that these are two separate instances of Gnm, but the scale is big enough that I think the difference is negligible. (Sorry for the y-axis ticks. They should run from 0 to 0.25 in the right-hand plots. Oops. Also, sorry for the change in line colors. And sorry that the plots aren't particularly color-blind accessible, if that affects anyone reading. They weren't meant to be paper-ready figs!)
Runtime: Again, the experiment is on Gnm random graphs, across a range of problem sizes. This time each dot is an average of 10 runs. I looked manually for outlier times at the larger scales (I instinctively distrust averages), but found that the times were pretty stable.
Before patch:
After patch:
Basically, the patch is a ~2x speedup on the problems I looked at.
Is this evidence enough, or would you like to see the element-wise vs edge-wise weights broken out separately from the convergence criteria changes? I bet we could get some more speed by opening up the Eigen optimizations, but that might hurt accuracy.
Edge vs Element Weights I see the potential performance benefit from element weights, but I do like the edge weights from a modeling perspective. They correspond to a standard robustified non-linear least squares problem, whereas the element weights aren't the solution to any cleanly-modeled NLLS system. More importantly, edge weights model the fact that the 3*m residuals are partially correlated is at least one very specific way: if an edge has a very high residual in one tangent space direction, then the whole edge is untrustworthy. We can disregard/downweight all the tangent space directions for that edge. A situation where an edge was deemed trustworthy in one direction but not in another seems odd to me.
-Kyle
Woah -- that was way more evaluation than I was expecting! Much appreciated. This is neat that we get such a large speedup. Is this largely due to better convergence via edge-wise weighting? That's pretty neat if so!
Things looks good here otherwise, happy to merge once we get the small suggestions fixed.
Great! I had the plots sitting around already. I had been benchmarking as I set my environment up.
The speedup is either due to the termination criteria (maybe fewer iterations?) or the edge-wise weights or both. If it's the weights then I guess the mechanism would be a better conditioned problem, or maybe a less non-linear problem? (Those aren't quite the same thing.)
Bugfixes to Chatterjee and Govindu's robust rotation solver to match the paper.
Also, a little refactoring:
Change-Id: I139144e7f0d043891d328b3651d05a8c5bd25c5a