alanlukezic / csr-dcf

Discriminative Correlation Filter with Channel and Spatial Reliability

Failure Flag #8

Open ahkarami opened 6 years ago

ahkarami commented 6 years ago

Dear @alanlukezic, thank you for your nice work. I have a question about your code: how can one tell that the tracker has failed to track the target? In some tracking algorithms this can be detected, for example by using the L1 distance between the visual features of the target and those of the tracker output. More specifically, something like this:

if l1_distance_between_target_bounding_box_and_tracker_output > threshold:
    print('The tracker failed to track the target')

I would really appreciate it if your answer also covered the C++ implementation (i.e. both the MATLAB and the C++ implementations).

alanlukezic commented 6 years ago

Hi, detecting tracker failure is part of the evaluation process, since in real-world problems the ground truth is not known in advance (except for the initialization frame). If your question refers to the demo_csr script, it is only used for running the tracker and visualizing the results on existing sequences. I hope this answers your question.

ahkarami commented 6 years ago

@alanlukezic, thank you for your response. In fact, my question refers to the OpenCV-contrib implementation of the tracker in real-world scenarios. In such scenarios, as you mentioned, we don't have the ground truth; however, I am looking for a way for the CSRT tracker to report when it fails. For example, as a heuristic: when the learned weights change significantly, can we conclude that the appearance model of the target has changed and that the tracker has probably lost the target? What is your opinion? Is there any way to address this task?

serycjon commented 6 years ago

You can try using the peak-to-sidelobe ratio (PSR) to detect failure, as in Bolme, David S., et al. "Visual object tracking using adaptive correlation filters." Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010.
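For reference, the ratio from that paper can be written as below. Here g_max is the peak value of the correlation response, and mu_sl and sigma_sl are the mean and standard deviation of the sidelobe, i.e. the response with a small window around the peak left out (11 × 11 in the paper).

```latex
\mathrm{PSR} = \frac{g_{\max} - \mu_{sl}}{\sigma_{sl}}
```

A sharp, isolated peak gives a high PSR; a flat or multi-peaked response gives a low one.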

ahkarami commented 6 years ago

@serycjon, Thank you for your nice opinion. We will implement & test it.

nkhdiscovery commented 6 years ago

@serycjon, I wanted to implement what you suggested, but I ran into something weird (I am working on the trackerCSRT implementation in opencv_contrib). max_loc should be the point at which the final response has its maximum value, yet it seems to always be (0,0) or (49,0)! Things get even weirder when the subpixel accuracy is calculated: the resulting Point2f value is sometimes negative, and it moves max_loc by a floating-point value smaller than 1, e.g. (0.109760, -0.098645).

As the algorithm works fine and I want to calculate the peak-to-sidelobe ratio on the final response, there must be something I have misunderstood. I expected the maximum to lie at a point somewhere around the center of the final response, something like Figure 1 of your paper.

Would you please help us check whether the almost-always (0,0) or (49,0) location is right, or whether I am making a mistake? This value is taken directly from the OpenCV implementation, line 433, the variable max_loc. Here is the code: https://github.com/opencv/opencv_contrib/blob/master/modules/tracking/src/trackerCSRT.cpp#L433

alanlukezic commented 6 years ago

The Gaussian function is defined so that its peak is located in the top-left corner, which simplifies the displacement calculation. The correlation response has to be shifted if you want to calculate the PSR score (for more information about shifting, see the MATLAB function fftshift).
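A minimal sketch of what the shift means, in plain C++ with no OpenCV dependency. Both helper names (signed_displacement, fftshift2d) are hypothetical and not part of opencv_contrib. The sketch assumes a row-major response and the usual circular-correlation convention, in which peak indices past the midpoint stand for negative displacements. Under that assumption a peak at (0,0) would mean no motion, and a peak at x = 49 in a 50-wide response would mean a shift of -1 pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: map a wrapped peak index to a signed displacement.
// Indices past the midpoint are treated as negative shifts (assumed convention).
int signed_displacement(int index, int size) {
    assert(size > 0 && index >= 0 && index < size);
    return (index > size / 2) ? index - size : index;
}

// Hypothetical helper: circularly shift a row-major rows x cols map by
// (rows / 2, cols / 2), the same rearrangement MATLAB's fftshift applies
// to a matrix. A peak stored at (0,0) ends up at (rows / 2, cols / 2).
std::vector<float> fftshift2d(const std::vector<float>& src, int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));
    std::vector<float> dst(src.size());
    for (int r = 0; r < rows; ++r) {
        const int rr = (r + rows / 2) % rows;  // destination row
        for (int c = 0; c < cols; ++c) {
            const int cc = (c + cols / 2) % cols;  // destination column
            dst[static_cast<std::size_t>(rr) * cols + cc] =
                src[static_cast<std::size_t>(r) * cols + c];
        }
    }
    return dst;
}
```

Applying fftshift2d to the response moves a (0,0) peak to the center cell, which is the layout a windowed PSR computation around the peak typically expects.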

Survial53 commented 6 years ago

Hello guys, we solved this problem in the following manner. One of the OpenCV trackers (the MOSSE tracker) has a failure detection mechanism (described in the original paper, Visual Object Tracking using Adaptive Correlation Filters, section 3.5, Failure Detection and PSR). However, the OpenCV MOSSE tracker differs slightly from the paper: the paper excludes an 11 × 11 window around the peak, while the OpenCV MOSSE implementation does not. We did it as follows (modified code from the OpenCV CSR-DCF tracker):

cv::Point2f TrackerCSRDCF::estimate_new_position(const cv::Mat &image)
{
   cv::Mat resp = calculate_response(image, csr_filter);
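   double maxVal;
   cv::Point max_loc;
   cv::minMaxLoc(resp, NULL, &maxVal, NULL, &max_loc);
   // PSR over the whole response (no window around the peak is excluded);
   // mPSR is an added member that updateImpl reads afterwards
   cv::Scalar mean, stddev;
   cv::meanStdDev(resp, mean, stddev);
   // the small constant avoids division by zero on a flat response
   mPSR = (float)((maxVal - mean[0]) / (stddev[0] + 0.00001f));
   // rest of the function unchanged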
   ...
}
bool TrackerCSRDCF::updateImpl(const cv::Mat &image_, cv::Rect2d &boundingBox)
{
   // treat gray image as color image
   cv::Mat image;
   if (image_.channels() == 1) 
   {
      std::vector<cv::Mat> channels(3);
      channels[0] = channels[1] = channels[2] = image_;
      merge(channels, image);
   }
   else 
   {
      image = image_;
   }

   object_center = estimate_new_position(image);
   // low PSR: keep the center of the passed-in bounding box and report failure
   if (mPSR < trackerParams.mPSRThreshold) 
   {
      object_center.x = (float)(boundingBox.x + boundingBox.width / 2.0);
      object_center.y = (float)(boundingBox.y + boundingBox.height / 2.0);
      return false;
   }
   // rest of the function unchanged
   ...
}
Params::Params()
{
   // other params
   ...
   mPSRThreshold = 8.0f;
}

We found that the PSR value for CSR-DCF lies in the range 4.0 to 20.0. The approximate value we chose for mPSRThreshold is 8.0f. Failure detection works pretty well as it is now. You may be able to improve the failure detection quality by applying a circular shift and excluding a window around the peak, but we have not tried that.
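Excluding a window around the peak could look roughly like the sketch below. It is plain C++ without OpenCV; psr_excluding_peak and circular_distance are hypothetical names, the response is assumed to be a row-major float array, and half_window = 5 is chosen to give the 11 × 11 exclusion window from the MOSSE paper. Distances are measured circularly in place of an explicit shift, so a peak in a corner is handled the same way as one in the center. This has not been tested on the tracker, and the 8.0 threshold would likely need re-tuning once the window is excluded.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Circular distance between two indices on an axis of length `size`.
static int circular_distance(int a, int b, int size) {
    const int d = std::abs(a - b);
    return d < size - d ? d : size - d;
}

// Hypothetical helper: PSR of a row-major rows x cols response map.
// The sidelobe is every cell outside a (2 * half_window + 1)^2 window
// centered on the peak, with wrap-around at the borders.
float psr_excluding_peak(const std::vector<float>& resp, int rows, int cols,
                         int half_window = 5) {
    assert(rows > 0 && cols > 0 && half_window >= 0);
    assert(resp.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));

    // Locate the peak (first maximum in row-major order).
    std::size_t peak = 0;
    for (std::size_t i = 1; i < resp.size(); ++i) {
        if (resp[i] > resp[peak]) peak = i;
    }
    const int peak_row = static_cast<int>(peak / static_cast<std::size_t>(cols));
    const int peak_col = static_cast<int>(peak % static_cast<std::size_t>(cols));

    // Accumulate sidelobe statistics, skipping the window around the peak.
    double sum = 0.0, sum_sq = 0.0;
    std::size_t count = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const bool in_window =
                circular_distance(r, peak_row, rows) <= half_window &&
                circular_distance(c, peak_col, cols) <= half_window;
            if (in_window) continue;
            const double v = resp[static_cast<std::size_t>(r) * cols + c];
            sum += v;
            sum_sq += v * v;
            ++count;
        }
    }
    if (count == 0) return 0.0f;  // the window covers the whole map

    const double mean = sum / static_cast<double>(count);
    double variance = sum_sq / static_cast<double>(count) - mean * mean;
    if (variance < 0.0) variance = 0.0;  // guard against rounding error
    // The small constant avoids division by zero on a flat sidelobe.
    return static_cast<float>((resp[peak] - mean) / (std::sqrt(variance) + 1e-5));
}
```

With OpenCV available, a similar effect could be obtained by passing a mask as the fourth argument of cv::meanStdDev, with the cells around the peak set to zero.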

SmitSheth commented 6 years ago

@Survial53, in the implementation of the MOSSE tracker they also pass a mask to the meanStdDev function:

meanStdDev(correlation_mat,mean,stddev,PSR_mask); //Compute matrix mean and std

Why haven't you used it?