2.4.9_backport : Optimizer doesn't work well

Hi, all.

I am running OpenDTAM 2.4.9_backport, but the optimizer doesn't work well. What should I do to fix this problem?

I'd like to get high-quality results like those in #38 (I don't know which branch was used to record the video in #38, but that isn't relevant here). In that video, the initial images (a and d) look similar to my results, so I don't think the initialization is the problem. Is it a problem with the optimization parameters, or a bug?

As I understand it, this is the optimization part, right?

bool doneOptimizing; int Acount=0; int QDcount=0;
do{
    cout<<"Theta: "<< optimizer.getTheta()<<endl;
    cout<<"Acount " << Acount << ", QDcount " << QDcount << endl;
    if(Acount==0) gpause();
    a.download(ret);//GPU(a)->CPU(ret)
    pfShow("A function", ret, 0, cv::Vec2d(0, layers));
    for (int i = 0; i < 10; i++) {
        d=denoiser(a,optimizer.epsilon,optimizer.getTheta());//operator() in DepthmapDenoiseWeightedHuber.cpp
        QDcount++;
        //denoiser._qx.download(ret);
        //pfShow("Q function:x direction", ret, 0, cv::Vec2d(-1, 1));
        //denoiser._qy.download(ret);
        //pfShow("Q function:y direction", ret, 0, cv::Vec2d(-1, 1));
        d.download(ret);
        pfShow("D function", ret, 0, cv::Vec2d(0, layers));
    }
    doneOptimizing=optimizer.optimizeA(d,a);
    Acount++;
}while(!doneOptimizing);
cout << "done optimizing!" << endl;

Here are the parameters I used (the default settings).

imagesPerCV = 2
layers = 32
thetaStart = 20.0;
thetaMin = 1.0;
thetaStep = .97;
epsilon = .1;
lambda = .01;

These are my results after optimization; before optimization I got the same (a, d) images. I downloaded OpenDTAM 2.4.9_backport and did not change the code (for debugging I added some cout statements, but those shouldn't change any variables).

screenshot from 2015-10-23 23 06 49

Probably a parameter issue. Parameters don't affect the start state much, since it is just the cost volume minimum. Try setting thetaStep to .99 to slow down the optimization. I forget which way it goes, but I think lambda is the coupling parameter; try increasing or decreasing it until d becomes smooth. You can then tweak until you have the desired tradeoff between smoothness and close fit. Epsilon is less useful, but it changes how much the result looks like an L1 or L2 norm, and should probably be in the 0.01-10 range.
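The remark that epsilon moves the result between L1-like and L2-like behaviour can be illustrated with the standard Huber norm. This is a minimal sketch, not OpenDTAM's code: the function name `huberNorm` is hypothetical, and it assumes epsilon plays the usual Huber-threshold role (quadratic below epsilon, linear above).

```cpp
#include <cassert>
#include <cmath>

// Standard Huber norm (hypothetical helper, not taken from OpenDTAM):
//   |x| <= epsilon : x^2 / (2 * epsilon)   (quadratic, L2-like)
//   |x| >  epsilon : |x| - epsilon / 2     (linear, L1-like)
// The two branches meet at |x| == epsilon, where both give epsilon / 2.
double huberNorm(double x, double epsilon) {
    assert(epsilon > 0.0);  // the threshold must be positive
    const double ax = std::fabs(x);
    return ax <= epsilon ? (x * x) / (2.0 * epsilon)
                         : ax - epsilon / 2.0;
}
```

With epsilon = 0.1, a gradient of 1.0 costs 0.95, which is close to |x| (L1-like). With epsilon = 10, the same gradient costs 0.05, i.e. x²/(2ε) (L2-like).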

I tried to use the following parameters.

thetaStart =  20.0;
thetaMin   =   0.01;
thetaStep  =   0.99;
epsilon    =   0.01;
lambda     =   0.000001f;

But the result is the same: the "D function" and "A function" windows always look the same. So the initial depth map after updating the cost volume and the depth map after optimization look the same.

I also tried the following code, which calculates the difference in "a" between the current loop iteration and the previous one.

Mat lastTimeMat = ret.clone();
bool doneOptimizing; int Acount=0; int QDcount=0;
do{
    cout<<"Theta: "<< optimizer.getTheta()<<endl;
    cout<<"Acount " << Acount << ", QDcount " << QDcount << endl;
    a.download(ret);//GPU(a)->CPU(ret)
    checkMat(ret, lastTimeMat);
    lastTimeMat = ret.clone();
    pfShow("A function", ret, 0, cv::Vec2d(0, layers));
    for (int i = 0; i < 10; i++) {
        d=denoiser(a,optimizer.epsilon,optimizer.getTheta());//operator() in DepthmapDenoiseWeightedHuber.cpp
        QDcount++;
        //denoiser._qx.download(ret);
        //pfShow("Q function:x direction", ret, 0, cv::Vec2d(-1, 1));
        //denoiser._qy.download(ret);
        //pfShow("Q function:y direction", ret, 0, cv::Vec2d(-1, 1));
        d.download(ret);
        pfShow("D function", ret, 0, cv::Vec2d(0, layers));
    }
    doneOptimizing=optimizer.optimizeA(d,a);
    Acount++;        
}while(!doneOptimizing);
cout << "done optimizing!" << endl;
(...)
// Counts pixels that stayed (almost) the same between the current and the previous "a".
// Needs <cmath> for std::fabs.
void checkMat(Mat& now, Mat& last){
    int NumOfSame = 0, NumOfDiff = 0;
    double SumOfDiffVal = 0.0;
    for(int y = 0 ; y < now.rows ; ++y){
        for(int x = 0 ; x < now.cols ; ++x){
            // std::fabs avoids accidentally picking the integer abs() overload
            const float diff = std::fabs(now.at<float>(y,x) - last.at<float>(y,x));
            if(diff <= 1.0e-5){
                ++NumOfSame;//almost same between current "a" and last "a"
            }else{
                ++NumOfDiff;
            }
            SumOfDiffVal += diff;
        }
    }
    printf("same = %d, diff = %d, diffVal = %f\n", NumOfSame, NumOfDiff, SumOfDiffVal);
}

And the results are as follows.

/****/
Thread Start: Thread Requested: Graphics : 0xbc0c60139962564036352:0xbc0c60

text_file_name = ../../../Trajectory_30_seconds/scene_000.txt
Opening: ../../../Trajectory_30_seconds/scene_000.png
(...)
text_file_name = ../../../Trajectory_30_seconds/scene_049.txt
Opening: ../../../Trajectory_30_seconds/scene_049.png
update cost (imageNum 0)
update cost (imageNum 1)
Theta: 20
Acount 0, QDcount 0
same = 11768, diff = 295432, diffVal = 3679041.000000
Theta: 19.8
Acount 1, QDcount 10
same = 130881, diff = 176319, diffVal = 24.262122
Theta: 19.602
Acount 2, QDcount 20
same = 305195, diff = 2005, diffVal = 0.241870
Theta: 19.406
Acount 3, QDcount 30
same = 305249, diff = 1951, diffVal = 0.240276
Theta: 19.2119
Acount 4, QDcount 40
same = 305302, diff = 1898, diffVal = 0.236795
Theta: 19.0198
Acount 5, QDcount 50
same = 305356, diff = 1844, diffVal = 0.234758
Theta: 18.8296
Acount 6, QDcount 60
same = 305418, diff = 1782, diffVal = 0.233090
Theta: 18.6413
Acount 7, QDcount 70
same = 305460, diff = 1740, diffVal = 0.231016
Theta: 18.4549
Acount 8, QDcount 80
same = 305513, diff = 1687, diffVal = 0.227877
Theta: 18.2703
Acount 9, QDcount 90
same = 305559, diff = 1641, diffVal = 0.225760
Theta: 18.0876
Acount 10, QDcount 100
same = 305618, diff = 1582, diffVal = 0.224109
(...)
Theta: 0.01044
Acount 752, QDcount 7520
same = 307200, diff = 0, diffVal = 0.000016
Theta: 0.0103356
Acount 753, QDcount 7530
same = 307200, diff = 0, diffVal = 0.000000
Theta: 0.0102322
Acount 754, QDcount 7540
same = 307200, diff = 0, diffVal = 0.000428
Theta: 0.0101299
Acount 755, QDcount 7550
same = 307200, diff = 0, diffVal = 0.000000
Theta: 0.0100286
Acount 756, QDcount 7560
same = 307200, diff = 0, diffVal = 0.000023
Theta: 0.0099283
Acount 757, QDcount 7570
same = 307200, diff = 0, diffVal = 0.000004
done optimizing!
Paused: Space (in GUI window) to continue
/****/

[A function after optimization] a function_screenshot_27 10 2015

[D function after optimization] d function_screenshot_27 10 2015

That is, the optimizer runs, but its effect is _VERY_ weak because the value of "a" barely changes from one loop iteration to the next.

I think this may be a program issue, not a parameter issue. I'm not sure which part has the bug, but I suspect it is in the optimizeA() function, especially in minimizeACaller().

Do you have any idea how to solve this issue?

P.S. Here is my GPU info.

[CUDA Device 0]
name: GeForce GTX 670
majorVersion: 3
minorVersion: 0
multiProcessorCount: 7
sharedMemPerBlock: 49152
freeMemory: 1688039424
totalMemory: 2145927168
isCompatible: 1
supports(FEATURE_SET_COMPUTE_10): 1
supports(FEATURE_SET_COMPUTE_11): 1
supports(FEATURE_SET_COMPUTE_12): 1
supports(FEATURE_SET_COMPUTE_13): 1
supports(FEATURE_SET_COMPUTE_20): 1
supports(FEATURE_SET_COMPUTE_21): 1
supports(FEATURE_SET_COMPUTE_30): 1
supports(FEATURE_SET_COMPUTE_35): 0

That's really odd. With a lambda that low, I would expect a to follow d around very closely, and d to become totally smooth. I wonder if your g weighting is messed up somehow. Let me have a quick look at the code...

I have a test for messed-up g values: comment out everything but "_gx=1; _gy=1;" in the following code block in DepthmapDenoiseWeightedHuber.cpp:

if(!visibleLightImage.empty())
    cacheGValues();
if(!cachedG){
//         _gx.setTo(1,cvStream);
    _gx=1;
//         _gy.setTo(1,cvStream);
    _gy=1;
}

That should cause the d function to flatten out. If not, something is really wrong.

Sorry, when I looked at the a and d images carefully, the upper-left part had changed a little. These are the results.

[Parameters]

int layers=32;
int imagesPerCV=2;
thetaStart =  20.0;
thetaMin   =   0.01;
thetaStep  =   0.99;
epsilon    =   0.01;
lambda     =   0.000001;

I used this code block.

Mat lastTimeMat = ret.clone();
bool doneOptimizing; int Acount=0; int QDcount=0;
do{
    cout<<"Theta: "<< optimizer.getTheta()<<endl;
    cout<<"Acount " << Acount << ", QDcount " << QDcount << endl;
    a.download(ret);//GPU(a)->CPU(ret)
    checkMat(ret, lastTimeMat);
    lastTimeMat = ret.clone();
    pfShow("A function", ret, 0, cv::Vec2d(0, layers));
    for (int i = 0; i < 10; i++) {
        d=denoiser(a,optimizer.epsilon,optimizer.getTheta());//operator() in DepthmapDenoiseWeightedHuber.cpp
        QDcount++;
        d.download(ret);
        pfShow("D function", ret, 0, cv::Vec2d(0, layers));
    }
    doneOptimizing=optimizer.optimizeA(d,a);
    Acount++;
}while(!doneOptimizing);
cout << "done optimizing!" << endl;
/*** in DepthmapDenoiseWeightedHuber.cpp ***/
if(!visibleLightImage.empty())
    cacheGValues();
if(!cachedG){
    //_gx.setTo(1,cvStream);
    _gx=1;
    // _gy.setTo(1,cvStream);
    _gy=1;
}

[BeforeOptimization]a a function_screenshot_28 10 2015

[BeforeOptimization]d d function_screenshot_28 10 2015

[AfterOptimization]a aftera function_screenshot_28 10 2015

[AfterOptimization]d afterd function_screenshot_28 10 2015

[Console]

text_file_name = ../../../Trajectory_30_seconds/scene_000.txt
Opening: ../../../Trajectory_30_seconds/scene_000.png
(...)
text_file_name = ../../../Trajectory_30_seconds/scene_049.txt
Opening: ../../../Trajectory_30_seconds/scene_049.png
update cost (imageNum 0)
update cost (imageNum 1)
Theta: 20
Acount 0, QDcount 0
same = 11768, diff = 295432, diffVal = 3679041.000000
Theta: 19.8
Acount 1, QDcount 10
same = 130881, diff = 176319, diffVal = 24.262122
Theta: 19.602
Acount 2, QDcount 20
same = 305195, diff = 2005, diffVal = 0.241870
Theta: 19.406
Acount 3, QDcount 30
same = 305249, diff = 1951, diffVal = 0.240276
Theta: 19.2119
Acount 4, QDcount 40
same = 305302, diff = 1898, diffVal = 0.236795
Theta: 19.0198
Acount 5, QDcount 50
same = 305356, diff = 1844, diffVal = 0.234758
Theta: 18.8296
Acount 6, QDcount 60
same = 305418, diff = 1782, diffVal = 0.233090
Theta: 18.6413
Acount 7, QDcount 70
same = 305460, diff = 1740, diffVal = 0.231016
Theta: 18.4549
Acount 8, QDcount 80
same = 305513, diff = 1687, diffVal = 0.227877
Theta: 18.2703
Acount 9, QDcount 90
same = 305559, diff = 1641, diffVal = 0.225760
Theta: 18.0876
Acount 10, QDcount 100
same = 305618, diff = 1582, diffVal = 0.224109
(...)
Theta: 0.0101299
Acount 755, QDcount 7550
same = 307200, diff = 0, diffVal = 0.000000
Theta: 0.0100286
Acount 756, QDcount 7560
same = 307200, diff = 0, diffVal = 0.000023
Theta: 0.0099283
Acount 757, QDcount 7570
same = 307200, diff = 0, diffVal = 0.000004
done optimizing!
Paused: Space (in GUI window) to continue

But I think the optimizer is very weak, because when I tried the CPU version the optimizer worked and I got a high-quality result (though the quality doesn't reach that of the video you sent me; see #38). I'd like to get high-quality results like the video using a GPU version (such as 2.4.9_backport or experimental).

Sorry for the long comments. I have put my questions at the end.

I have thought about this issue, and I believe it is caused by a mistake on my side. I'll explain my reasoning.

For debugging I used deliberately strange parameters, expecting to get a completely wrong result image. But I got almost the same results. Please look at them.

[Parameters]

int layers=32;
int imagesPerCV=2;
thetaStart =  100.0;
thetaMin   =   0.01;
thetaStep  =   0.97;
epsilon    =   100.0;
lambda     =   12345.0;

And I used the same code block as in my last comment.

And these are the results.

[Before Optimization]a a function_screenshot_28 10 2015

[Before Optimization]d d function_screenshot_28 10 2015

[After Optimization]a aftera function_screenshot_28 10 2015

[After Optimization]d afterd function_screenshot_28 10 2015

[Console output]

text_file_name = ../../../Trajectory_30_seconds/scene_000.txt
Opening: ../../../Trajectory_30_seconds/scene_000.png
(...)
text_file_name = ../../../Trajectory_30_seconds/scene_049.txt
Opening: ../../../Trajectory_30_seconds/scene_049.png
update cost (imageNum 0)
update cost (imageNum 1)
Theta: 100
Acount 0, QDcount 0
same = 11768, diff = 295432, diffVal = 3679041.000000
Theta: 97
Acount 1, QDcount 10
same = 17044, diff = 290156, diffVal = 63646.108175
Theta: 94.09
Acount 2, QDcount 20
same = 307200, diff = 0, diffVal = 0.000079
Theta: 91.2673
Acount 3, QDcount 30
same = 307200, diff = 0, diffVal = 0.000059
Theta: 88.5293
Acount 4, QDcount 40
same = 307200, diff = 0, diffVal = 0.000057
Theta: 85.8734
Acount 5, QDcount 50
same = 307199, diff = 1, diffVal = 0.000081
Theta: 83.2972
Acount 6, QDcount 60
same = 307199, diff = 1, diffVal = 0.000083
Theta: 80.7983
Acount 7, QDcount 70
same = 307081, diff = 119, diffVal = 0.232744
Theta: 78.3743
Acount 8, QDcount 80
same = 307199, diff = 1, diffVal = 0.000091
Theta: 76.0231
Acount 9, QDcount 90
same = 307199, diff = 1, diffVal = 0.000113
Theta: 73.7424
Acount 10, QDcount 100
same = 307200, diff = 0, diffVal = 0.000069
Theta: 71.5302
(...)
Acount 300, QDcount 3000
same = 193652, diff = 113548, diffVal = 8.810418
Theta: 0.0104303
Acount 301, QDcount 3010
same = 192156, diff = 115044, diffVal = 9.047585
Theta: 0.0101174
Acount 302, QDcount 3020
same = 190717, diff = 116483, diffVal = 9.298555
Theta: 0.00981385
Acount 303, QDcount 3030
same = 188954, diff = 118246, diffVal = 9.573640
done optimizing!
Paused: Space (in GUI window) to continue

As the console output shows, the parameters definitely changed (look at the initial theta value and the parameters I set). But the output images look the same.
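The Theta values printed in these logs are consistent with a simple geometric decay, which also explains how many outer iterations each parameter set produces. The following is a minimal sketch under that assumption (theta is multiplied by thetaStep once per optimizeA() call until it drops below thetaMin); `countThetaSteps` is a hypothetical helper and not part of OpenDTAM.

```cpp
#include <cassert>

// Hypothetical helper: counts how many multiplications by thetaStep are
// needed before theta falls below thetaMin, starting from thetaStart.
// Assumes a plain geometric schedule theta_k = thetaStart * thetaStep^k,
// which is what the "Theta:" lines in the logs above suggest.
int countThetaSteps(double thetaStart, double thetaMin, double thetaStep) {
    assert(thetaStart > 0.0 && thetaMin > 0.0);
    assert(thetaStep > 0.0 && thetaStep < 1.0);  // otherwise theta never shrinks
    int steps = 0;
    for (double theta = thetaStart; theta >= thetaMin; theta *= thetaStep) {
        ++steps;
    }
    return steps;
}
```

For thetaStart = 20, thetaMin = 0.01, thetaStep = 0.99 this returns 757, and for thetaStart = 100, thetaMin = 0.01, thetaStep = 0.97 it returns 303, matching the last Acount values in the two logs. With the default settings (20, 1.0, 0.97) it returns 99.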

I thought about this issue and came up with three possibilities.

1. I made a stupid mistake in how I use 2.4.9_backport, or I made a mistake when I compiled OpenCV.
2. A parameter issue.
3. A program issue (that is, the code has a bug).

I don't believe [2], because of the experiment with the terribly wrong parameters described above. I also don't believe [3], because many people have used this code (2.4.9_backport) and nobody has reported an issue like this one. So I believe [1]: this issue comes from my own mistake.

So I would like to confirm the following.

1. I changed the parameters in "Optimizer::setDefaultParams();". Is that correct?
2. Are you using C++11 features?
3. As I read the code, the "A function" and "D function" windows should change little by little during the loop. But currently they always show the same image during optimization; only after optimization finishes (after "done optimizing" is printed) and I press space do they show the optimized results. Why are the outputs shown only after I press space? Is my understanding wrong? (I think this is related, so I'm asking here.)

I solved this issue!

This issue is caused by this line in DepthmapDenoiseWeightedHuber.cpp

#if __CUDA_ARCH__>300

I am using a CUDA device with compute capability 3.0 (for which `__CUDA_ARCH__` is 300), so this condition is false and none of the guarded statements were executed. So the QD update was not executed, and the optimizer became weak.

The solution for users with a CUDA device of compute capability 3.0 is to comment out all "#if __CUDA_ARCH__>300" statements and their corresponding "#endif" statements.

In another issue (#22), you said

Yes, At the moment it requires cards with compute capability 3.0 or above because the kernels use warp shuffle intrinsics and bindless textures.

So compute capability 3.0 should be OK. But I think

#if __CUDA_ARCH__>300

means that a device with compute capability 3.0 does not satisfy this condition.

I'm not sure which is right, your code or your comment (#22). I trust your code, because my results now look good.

P.S. The problem from my last comment (why are the outputs shown only after pressing space?) was solved at the same time. The windows now show d and a in real time. These are the results after the fix.

[Parameters]

int layers=32;
int imagesPerCV=2;
thetaStart = 20.0;
thetaMin = 1.0;
thetaStep = 0.97;
epsilon = 0.1;
lambda = 0.01;

[After optimization]a after a function_screenshot_28 10 2015

[After optimization]d afterd function_screenshot_28 10 2015

I know this is old, but you're correct. It should be >= not >, since the code no longer uses funnel shift. I may edit this when I get time.
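The difference between the two guards can be sketched on the host side. In this minimal example, `DEMO_ARCH` is a hypothetical stand-in for `__CUDA_ARCH__` (which nvcc defines only during device compilation, as 300 for compute capability 3.0), and the function names are illustrative rather than taken from OpenDTAM.

```cpp
#include <cassert>

// Hypothetical stand-in for __CUDA_ARCH__ on a compute capability 3.0 device.
#define DEMO_ARCH 300

// Mirrors the original guard (#if __CUDA_ARCH__>300):
// the guarded branch is compiled out when the value is exactly 300.
constexpr bool strictGuardEnabled() {
#if DEMO_ARCH > 300
    return true;
#else
    return false;
#endif
}

// Mirrors the corrected guard (>=): the guarded branch is kept for 300.
constexpr bool inclusiveGuardEnabled() {
#if DEMO_ARCH >= 300
    return true;
#else
    return false;
#endif
}

// Checked at compile time, just like the real preprocessor guard.
static_assert(!strictGuardEnabled(), "'>' excludes compute capability 3.0");
static_assert(inclusiveGuardEnabled(), "'>=' includes compute capability 3.0");
```

Because the guard is resolved by the preprocessor, the kernel body silently disappears instead of failing at run time, which would explain why the optimizer loop ran without changing "a".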

anuranbaka / OpenDTAM

2.4.9_backport : Optimizer doesn't work well #43