yutongwangBIT / VOOM


Is this correct? Shouldn't it be x1, y1, x2, y2 = box_? #5

Closed deepConnectionism closed 3 months ago

deepConnectionism commented 4 months ago

https://github.com/yutongwangBIT/VOOM/blob/c8a9b1bf7bf6f71530d103e63646452286dfcfb3/PythonScripts/generate_detection_files.py#L52C9-L53C59

Doesn't YOLO output the box as the top-left and bottom-right corners? Is this line wrong?

https://github.com/yutongwangBIT/VOOM/blob/c8a9b1bf7bf6f71530d103e63646452286dfcfb3/PythonScripts/generate_detection_files.py#L22

If the image resolution is not YOLO's resolution, should this instead read:

mask_box = cv2.resize(mask_box, image_size)
mask_box = (mask_box * 255).astype(np.uint8)
yutongwangBIT commented 4 months ago

Yes, the symbol order in my documentation may differ from YOLO's, but I've ensured that the usage in the C++ implementation is consistent and correct.

Regarding the image resolution, yes, resize is ok.

deepConnectionism commented 4 months ago

"I've ensured that the usage in the C++ implementation is consistent and correct."

Can you point out where this is implemented? If I want to generate new data and use your code, should I write x1, y1, x2, y2 = box_, or keep it as you did?

deepConnectionism commented 4 months ago

And when I generated the bounding boxes and masks with YOLOv8 on my own data, running the code resulted in an error.

Here's what I got using KITTI data and YOLOv8:

image

This is the error I get when visualizing:

mgproc/src/drawing.cpp:1953: error: (-215:Assertion failed) axes.width >= 0 && axes.height >= 0 && thickness <= MAX_THICKNESS && 0 <= shift && shift <= XY_SHIFT in function 'ellipse'

And sometimes it does draw, but the ellipse is obviously not in the right place:

image

yutongwangBIT commented 4 months ago

Our algorithm's testing has primarily been conducted in indoor environments, and it hasn't been specifically tailored for dynamic outdoor scenes. If the algorithm were to be applied outdoors, an additional step of object tracking would be necessary to ensure robustness. This is likely why you've observed inaccuracies in the ellipsoid estimation.

Additionally, the ellipsoid estimation failures you mentioned, which lead to the error `mgproc/src/drawing.cpp:1953: error: (-215:Assertion failed) axes.width >= 0 && axes.height >= 0 && thickness <= MAX_THICKNESS && 0 <= shift && shift <= XY_SHIFT in function 'ellipse'`, occur because the estimation process sometimes produces invalid (e.g., NaN) or negative values. These inputs are not compatible with the parameters expected by the `ellipse` function, which requires non-negative axis dimensions and valid values for the other parameters. Adjusting the code to filter or correct these values before the drawing operations could mitigate such issues.
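To illustrate that suggestion, here is a minimal, hypothetical sketch (not code from the repository) of a guard one could put in front of the drawing call: it skips any detection whose projected ellipse has NaN, infinite, or negative parameters, so the assertion inside OpenCV's ellipse function cannot fire.

#include <opencv2/imgproc.hpp>
#include <cmath>

// Hypothetical helper (not part of VOOM): only draw the projected ellipse
// when its parameters are finite and non-negative; otherwise skip it.
static void DrawEllipseSafe(cv::Mat& img, float cx, float cy,
                            float ax, float ay, double angle,
                            const cv::Scalar& color, int thickness = 2)
{
    // Reject NaN/inf and negative axis lengths coming from a failed
    // ellipsoid estimation; they would violate cv::ellipse's assertion.
    if (!std::isfinite(cx) || !std::isfinite(cy) ||
        !std::isfinite(ax) || !std::isfinite(ay) ||
        !std::isfinite(angle) || ax < 0.f || ay < 0.f)
        return;

    cv::ellipse(img, cv::Point(cvRound(cx), cvRound(cy)),
                cv::Size(cvRound(ax), cvRound(ay)),
                angle, 0.0, 360.0, color, thickness);
}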

yutongwangBIT commented 4 months ago

I also suggest you change the following for outdoor experiments:

https://github.com/yutongwangBIT/VOOM/blob/c8a9b1bf7bf6f71530d103e63646452286dfcfb3/src/Object.cc#L112

from:

if(observed_kfs.size() > 2 && observed_kfs.size() % 1 == 0 && observed_kfs.size()<30){
    //OptimizeReconstruction(false);
    OptimizeReconstructionQuat(true);
    flag_optimized = true;
}
}
//if(!flag_optimized && bboxes_.size()>4){
//    OptimizeReconstruction(false);
//}

to:

if(observed_kfs.size() > 2 && observed_kfs.size() % 1 == 0 && observed_kfs.size()<30){
    OptimizeReconstructionQuat(true);
}

if(observed_kfs.size()>=30)
    flag_optimized = true;
}

if(!flag_optimized && bboxes_.size()>4){
    OptimizeReconstruction(false);
}

And add the following at https://github.com/yutongwangBIT/VOOM/blob/c8a9b1bf7bf6f71530d103e63646452286dfcfb3/src/Tracking.cc#L842:

else{
    Matrix34d Rt = cvToEigenMatrix<double, float, 3, 4>(mCurrentFrame.mTcw);
    Matrix34d P;
    P = K_ * Rt;

    for(auto [node_id, attribute] : mCurrentFrame.graph->attributes){
        auto bb_det = attribute.bbox;
        if(attribute.obj){
            auto proj = attribute.obj->GetEllipsoid().project(P);
            auto bb_proj = proj.ComputeBbox();
            double iou = bboxes_iou(bb_proj, bb_det);
            auto c3d = attribute.obj->GetEllipsoid().GetCenter();
            float z = Rt.row(2).dot(c3d.homogeneous());
            if(iou>0.01 && abs(z-current_depth_data_per_det_[node_id].first)<3.0){
                auto c = proj.GetCenter();
                auto axes = proj.GetAxes();
                double angle = proj.GetAngle();
                attribute.obj->AddDetection(attribute.label, bb_det, attribute.ell, attribute.confidence, Rt, mCurrentFrame.mnId, kf);
            }
        }
    }
}

REASON: As observed in indoor experiments, due to the low speed it is not necessary to use detections on all frames. In outdoor experiments, however, relying solely on keyframe optimization is insufficient.

deepConnectionism commented 4 months ago

Thanks for your advice, I will try it.