mihaibujanca closed this issue 7 years ago.
My bad, the issue there was that I had forgotten to assign the parameters. Now there are multiple iterations but the cost is still not changing. Attached log
See the last comment here: #87
There's an open bug when using solely graph energies. Use the hack there for now, I should have it fixed this week.
Oh, I missed that one!
Still a little unsure of how to use this, as I need to have everything in the same domain.
This is my current code:
local G = Graph("DataG", 7,
"v", {N}, 8,
"n0", {D}, 9,
"n1", {D}, 10,
"n2", {D}, 11,
"n3", {D}, 12,
"n4", {D}, 13,
"n5", {D}, 14,
"n6", {D}, 15,
"n7", {D}, 16)
weightedTranslation = 0
nodes = {0,1,2,3,4,5,6,7}
for _,i in ipairs(nodes) do
weightedTranslation = weightedTranslation + Weights(G.v)(i) * TranslationDeform(G["n"..i])
end
local cost = LiveVertices(G.v) - CanonicalVertices(G.v) + weightedTranslation
local zeroIm = ComputedImage("zero",{N}, 0.0)
Energy(zeroIm(0) * cost)
Which results in "residual contains image reads from multiple domains". Not exactly sure how to correlate the sumOfParams in the other issue with what I have.
The hack cost should be a separate residual term that shouldn't use graphs at all.
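For concreteness, a minimal sketch of that split (untested), reusing the names from the snippet above and assuming TranslationDeform is an unknown over {D}; the dummy residual then never touches the graph:

-- real term: graph-indexed reads only
local cost = LiveVertices(G.v) - CanonicalVertices(G.v) + weightedTranslation
local zeroIm = ComputedImage("zero", {D}, 0.0)
Energy(cost,                               -- the actual graph energy
       zeroIm(0) * TranslationDeform(0))   -- hack term: zero-weighted, no graph reads at all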
Oh alright, now it seems to be working (or at least getting to a low cost; I still need to test if the values are correct). However, lots of iterations result in 0 cost change and "breaking at iteration x", usually at < 20 iterations. Is there any reason that might often happen?
"breaking at iteration x" happens if the linear system converges quickly. 0 cost change and reverting is a natural part of highly nonlinear solves when using Levenberg-Marquardt. If the first several nonlinear iterations all revert, you could set the initial trust_region_radius to a lower value to skip that part of the solve.
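For reference, a sketch of lowering that radius from the host side; this assumes the Opt_SetSolverParameter entry point and the parameter name "trust_region_radius" from the current Opt.h, so check both against your checkout:

double initialRadius = 1e-4;  // smaller than the default, so the first LM steps are conservative
Opt_SetSolverParameter(optState, plan, "trust_region_radius", (void*)&initialRadius);
Opt_ProblemSolve(optState, plan, problemParams);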
Covered by #91
Not sure if I should reopen or open a different issue for this since what happens now may or may not be related - I am now trying to test my program on a proper dataset (it worked correctly on handmade data).
For some reason the cost drops quickly but then stays at a reasonably large value (cost starts at 18376534 and drops to 190 and stays there).
I understand, of course this would be very hard to debug remotely but any pointers would be appreciated. Worth noting that on the same data, Ceres was converging with a cost of 1e-8, and I tried to reproduce the cost function as well as I could.
Here's the first iteration. Subsequent iterations stay at the same cost.
//////////// ITERATION0 (Opt(LM)) ///////////////
zeta=-0.000101857571280561388, breaking at iteration: 4
cost=18376534.000000
model_cost=48635.292969
model_cost_change=18327898.000000
zeta=-5.81109043196192943e-06, breaking at iteration: 24
cost=48635.277344
model_cost=234.928070
model_cost_change=48400.347656
cost=234.928040
model_cost=193.840088
model_cost_change=41.087952
cost=193.840103
model_cost=191.666962
model_cost_change=2.173141
cost=191.666870
model_cost=190.530045
model_cost_change=1.136826
zeta=-0.00317946262657642365, breaking at iteration: 50
cost=190.530014
model_cost=190.435852
model_cost_change=0.094162
zeta=-0.00204528076574206352, breaking at iteration: 40
cost=190.435913
model_cost=190.403366
model_cost_change=0.032547
zeta=-0.00184173777233809233, breaking at iteration: 170
cost=190.403275
model_cost=190.334518
model_cost_change=0.068756
cost=190.334457
model_cost=190.279221
model_cost_change=0.055237
zeta=-0.0014018454821780324, breaking at iteration: 100
cost=190.279312
model_cost=190.268097
model_cost_change=0.011215
zeta=-0.000233855316764675081, breaking at iteration: 60
cost=190.268112
model_cost=190.264236
model_cost_change=0.003876
cost=190.264175
model_cost=190.254593
model_cost_change=0.009583
zeta=-0.0033028090838342905, breaking at iteration: 50
cost=190.254654
model_cost=190.252869
model_cost_change=0.001785
cost=190.252716
model_cost=190.251801
model_cost_change=0.000916
zeta=-0.00542288646101951599, breaking at iteration: 90
cost=190.251755
model_cost=190.249207
model_cost_change=0.002548
zeta=-0.0132675953209400177, breaking at iteration: 92
cost=190.249298
model_cost=190.249191
model_cost_change=0.000107
Oh, worth mentioning the Ceres version was using SPARSE_SCHUR; this is obviously using LM.
First thing to try is using double precision instead of single precision to see if it is a precision issue. I don't know if your example or data is sensitive, but I also don't mind fiddling with things to see what the issue is.
Warning: if your GPU is before the Pascal generation (10XXs) Opt must use slower software implementations of double-precision floating point atomics so you may see a significant slowdown.
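For anyone following along, precision is chosen when the Opt state is created. A minimal sketch, assuming the Opt_InitializationParameters struct and field names from the current Opt.h (field names may differ across versions):

#include "Opt.h"
#include <string.h>

Opt_InitializationParameters initParams;
memset(&initParams, 0, sizeof(initParams));
initParams.doublePrecision = 1;  // compile the solver kernels with doubles
Opt_State* optState = Opt_NewState(initParams);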
I am already using double precision so that wouldn't be the problem, but that's good to know - I only have a GeForce 960M. Single precision should be good enough for what I need in theory, so I might switch to that later.
Is there any obvious reason why using GN would have NaN cost but LM would work?
GN is not guaranteed to even locally converge (and often doesn't if the Jacobian is ill-conditioned). LM is.
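For context: a Gauss-Newton step solves J^T J d = -J^T r, which can fail or blow up to NaN when J^T J is near-singular, while Levenberg-Marquardt solves the damped system (J^T J + lambda*I) d = -J^T r and adapts lambda, rejecting steps that increase the cost, so the iteration degrades gracefully toward gradient descent instead of diverging.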
Three sanity checks you can do: use the result of your Ceres implementation as the starting values of your Opt implementation, do the same in the other direction, or try Ceres' LM implementation.
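For the Ceres side of that last check, switching to LM while keeping the same linear solver is just the standard options (SPARSE_SCHUR is a linear-solver choice and LM a trust-region strategy, so the two combine; problem is your existing ceres::Problem):

ceres::Solver::Options options;
options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;  // LM, mirroring Opt's solver
options.linear_solver_type = ceres::SPARSE_SCHUR;                 // same linear solver as before
options.minimizer_progress_to_stdout = true;                      // print per-iteration costs
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);
std::cout << summary.FullReport() << "\n";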
I'll need to look into this in more detail. I did the test with LM on Ceres and it indeed stays at a cost of the same order of magnitude (and indeed not too far off; Ceres' final cost is 165, Opt's is 189). It seems like Ceres starts at a much higher cost for some reason, but converges after 3 iterations to 165. Opt gets to 191 after 4 iterations and then takes really small steps until reaching a value around 189.
Hmm, sounds likely that there is some difference in the energy function. Costs should at least match initially (and probably at the end too).
Of note, the hack energy is no longer needed on the latest version of master (see #91).
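If it helps when comparing against Ceres: the initial cost can be read directly on the Opt side. A sketch assuming the Opt_ProblemInit and Opt_ProblemCurrentCost entry points from the C API (verify the names, and that the cost is valid before the first step, against your Opt.h):

Opt_ProblemInit(optState, plan, problemParams);      // set up the solve without stepping
double c0 = Opt_ProblemCurrentCost(optState, plan);  // cost at the starting values
printf("initial Opt cost: %f\n", c0);                // should match Ceres' iteration-0 cost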
Yay! Can confirm it's working without the hack
This is probably some mistake of mine coming from not understanding everything well enough yet, but the solver seems to just compute the cost once and then finish. Apologies if this is very specific to my issue and perhaps too open-ended.
I might be missing something very basic in my C code (most likely), but here's the output from Opt:
Also a gist with verbosityLevel set to 2.
Here's my CombinedSolver.h.
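One hedged guess worth ruling out before digging into the gist: in the bundled example CombinedSolvers the iteration counts are plain solver parameters, and a zero there would make the solver evaluate the cost once and return. A sketch, with parameter names as I remember them from the examples (verify against your CombinedSolver.h and Opt.h):

unsigned int nIterations = 10;   // outer nonlinear (LM) iterations
unsigned int lIterations = 100;  // inner linear (PCG) iterations per outer step
Opt_SetSolverParameter(optState, plan, "nIterations", (void*)&nIterations);
Opt_SetSolverParameter(optState, plan, "lIterations", (void*)&lIterations);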