Why is dim_x set to 7 in KalmanFilter

abewley / sort

Simple, online, and realtime tracking of multiple objects in a video sequence.

GNU General Public License v3.0


Why is dim_x set to 7 in KalmanFilter #91

Open litingfeng opened 4 years ago

litingfeng commented 4 years ago

Hi,

Thanks for your great work.

I was trying to understand the parameter here in this line: self.kf = KalmanFilter(dim_x=7, dim_z=4)

According to the documentation, dim_x means

Number of state variables for the Kalman filter.

dim_z means

Number of measurement inputs.

From my understanding, dim_z=4 because there are 4 measurements: x, y, s, r.

But why is dim_x=7?

Thanks,

abewley commented 4 years ago

The latent state of the Kalman filter also estimates the velocity of the object. This can be thought of as a delta applied at each time-step to update the position (x, y) and scale (s). The aspect ratio (r) is assumed to be constant. See section 3.2 of the SORT paper.
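As a rough illustration of the description above, the sketch below builds a 7-dimensional constant-velocity model with NumPy. The state ordering [u, v, s, r, du, dv, ds] follows the SORT paper; the variable names and the sample numbers are assumptions made for this example and are not taken from the repository code.

```python
import numpy as np

# Assumed state layout: [u, v, s, r, du, dv, ds]
# u, v = box centre, s = scale (area), r = aspect ratio,
# du, dv, ds = per-frame deltas. There is no delta for r.
F = np.eye(7)
F[0, 4] = 1.0  # u <- u + du
F[1, 5] = 1.0  # v <- v + dv
F[2, 6] = 1.0  # s <- s + ds

# The detector only observes [u, v, s, r], hence dim_z = 4.
H = np.zeros((4, 7))
H[:, :4] = np.eye(4)

# Made-up state, used only to show one prediction step.
x = np.array([100.0, 50.0, 2000.0, 0.5, 3.0, -1.0, 40.0])
x_pred = F @ x       # position and scale move by their deltas, r is unchanged
z_pred = H @ x_pred  # the part of the state comparable to a measurement
```

Four observed quantities plus three velocities give the seven state variables; r contributes only one entry because it has no velocity term.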

nihalsangeeth commented 2 years ago

@abewley If I may ask, is there a motivation behind the constant aspect ratio assumption other than reducing complexity? Great work, btw.

Thanks

wwdok commented 2 years ago

The SORT paper defines the 7 state variables as x = [u, v, s, r, u̇, v̇, ṡ]^T.

@nihalsangeeth, I think the reason is not simply to reduce complexity; in reality, the aspect ratio of an object should not grow linearly. Both the SORT paper and the code assume a linear constant-velocity motion model. Imagine a car driving down the road towards you: its u, v, and s can change linearly, but its aspect ratio cannot (or does so only very slowly). By the way, if the ratio did change linearly and you wanted to account for it, the state would gain an eighth variable (the rate of change of r), and the state transition matrix would become:

        # State: [u, v, s, r, du, dv, ds, dr], so F must be 8x8 (dim_x would be 8)
        self.kf.F = np.array([
            [1, 0, 0, 0, 1, 0, 0, 0],
            [0, 1, 0, 0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0, 0, 1, 0],
            [0, 0, 0, 1, 0, 0, 0, 1],
            [0, 0, 0, 0, 1, 0, 0, 0],
            [0, 0, 0, 0, 0, 1, 0, 0],
            [0, 0, 0, 0, 0, 0, 1, 0],
            [0, 0, 0, 0, 0, 0, 0, 1]])

The fourth row of this matrix encodes r = r + dr * Δt, with Δt equal to one time step.
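To see what that extra term would do in practice, here is a small sketch of the hypothetical 8-state variant. The state layout [u, v, s, r, du, dv, ds, dr], the variable names, and the numbers are assumptions for illustration only; SORT itself keeps 7 states.

```python
import numpy as np

# Hypothetical 8-dimensional state: [u, v, s, r, du, dv, ds, dr]
n_pos = 4
F8 = np.eye(2 * n_pos)
F8[:n_pos, n_pos:] = np.eye(n_pos)  # each of u, v, s, r gains its own delta per step

# Made-up starting state with a small aspect-ratio velocity dr = 0.01.
x8 = np.array([100.0, 50.0, 2000.0, 0.5, 3.0, -1.0, 40.0, 0.01])

# Predict three frames ahead with no measurement updates in between.
for _ in range(3):
    x8 = F8 @ x8

r_after = x8[3]  # r has accumulated dr once per predicted frame
```

With no measurements to correct it, any nonzero dr keeps accumulating in r on every predicted frame, which is the kind of drift the constant aspect ratio assumption avoids.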