IDF1, IDR and IDP greater than 100% after post-processing tracks using re-id methods

JakeCowton commented 4 years ago

I have a bunch of objects that I am tracking using the Deep SORT algorithm, which produces decent results when evaluated using this library. However, as in all multi-object tracking applications, one ground truth identity (oid) usually ends up being assigned multiple identities (hid) in testing. I am using re-identification methods to try to correct some of the mistakes made by Deep SORT.

Fundamentally, in an ideal scenario, if oid 1 is assigned hids 6, 8, 32 and 35 over the time it is tracked, the re-identification system would recognise that hids 8, 32 and 35 are the same as 6 and subsequently set them all to 6, as that was the first hid assigned. So an input that looks like this

frame  id        x       y   width  height  conf
    1   6  1466.00  578.00  195.00  390.00     1
    2   8  1475.00    2.00  222.00  375.00     1
    3  32   700.48  580.39  276.07  453.66     1
    4  35   770.00    5.00  378.00   99.00     1

would be modified to look like this

frame  id        x       y   width  height  conf
    1   6  1466.00  578.00  195.00  390.00     1
    2   6  1475.00    2.00  222.00  375.00     1
    3   6   700.48  580.39  276.07  453.66     1
    4   6   770.00    5.00  378.00   99.00     1

So the only thing that is modified is the hid, everything else is left the same.
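A minimal sketch of the relabelling step described above, using only the Python standard library. The `relabel` helper and the `merge_map` dictionary are hypothetical names introduced for illustration; the rows are the four example rows from the table, and how the re-identification system actually decides which hids belong together is not shown.

```python
def relabel(rows, merge_map):
    """Return a copy of rows with each hid replaced by its merged hid.

    Rows whose hid is not in merge_map are copied unchanged.
    """
    return [{**row, "id": merge_map.get(row["id"], row["id"])} for row in rows]


# The four example rows from the table above.
rows = [
    {"frame": 1, "id": 6, "x": 1466.00, "y": 578.00, "width": 195.00, "height": 390.00, "conf": 1},
    {"frame": 2, "id": 8, "x": 1475.00, "y": 2.00, "width": 222.00, "height": 375.00, "conf": 1},
    {"frame": 3, "id": 32, "x": 700.48, "y": 580.39, "width": 276.07, "height": 453.66, "conf": 1},
    {"frame": 4, "id": 35, "x": 770.00, "y": 5.00, "width": 378.00, "height": 99.00, "conf": 1},
]

# Assumed output of the re-identification step: later hids map to the first one.
merge_map = {8: 6, 32: 6, 35: 6}

relabelled = relabel(rows, merge_map)
print([row["id"] for row in relabelled])
```

Only the `id` field changes; the frame, box and confidence values are carried over untouched, and the original `rows` list is not mutated.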

When I process the original, I get the results I would expect. However, when I process the modified version, my IDF1, IDR and IDP are all above 100%.

I've traced this problem back to idfn(df, id_global_assignment) in metrics.py, which returns a negative number (which I reckon is the problem, but I'm not sure). However, I can't figure out what in the input I'm providing could be causing this.
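To see why a negative count would push the scores past 100%, here is a small sketch using the standard definitions of the identity measures: IDP = IDTP / (IDTP + IDFP), IDR = IDTP / (IDTP + IDFN) and IDF1 = 2 IDTP / (2 IDTP + IDFP + IDFN). The `id_scores` function and the counts passed to it are made up for illustration; this is not the code in metrics.py.

```python
def id_scores(idtp, idfp, idfn):
    """Compute (IDP, IDR, IDF1) from identity-level counts."""
    idp = idtp / (idtp + idfp)
    idr = idtp / (idtp + idfn)
    idf1 = 2 * idtp / (2 * idtp + idfp + idfn)
    return idp, idr, idf1


# Hypothetical counts: with non-negative inputs every score stays within [0, 1].
valid = id_scores(idtp=9, idfp=1, idfn=0)

# Hypothetical counts with a negative IDFN: the denominators shrink below
# IDTP, so IDR and IDF1 exceed 1 (i.e. more than 100%).
broken = id_scores(idtp=10, idfp=0, idfn=-1)

print(valid)
print(broken)
```

Since all three counts are meant to be non-negative, any score above 100% suggests that one of them went negative somewhere upstream.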

cheind commented 4 years ago

Thanks for your report. Would you mind creating a small unit-test case, i.e. stripping your data down as far as you can while still reproducing the issue? Then share the files so that I can have a look.

Best, Christoph

JakeCowton commented 4 years ago

Here's the closest thing to a minimum implementation I could get.

https://drive.google.com/drive/folders/1V7KXfeXWrcIkk2RD0Lxy_E5FKv81NnGL?usp=sharing

ground_truth.csv contains the ground truth tracks, original.csv contains the tracks before re-identification processing, and reconfigured.csv contains the tracks after changing some IDs using re-identification.

The original tracks produce:

     IDF1   IDP    IDR   Rcll  Prcn GT MT PT ML FP FN IDs  FM  MOTA  MOTP
acc 99.7% 99.4% 100.0% 100.0% 99.4%  9  9  0  0  1  0   0   0 99.4% 0.140

and the reconfigured tracks produce:

      IDF1    IDP    IDR   Rcll  Prcn GT MT PT ML FP FN IDs  FM  MOTA  MOTP
acc 100.3% 129.3% 100.6% 100.0% 99.4%  9  9  0  0  1  0   0   0 99.4% 0.309

cheind commented 4 years ago

Were IDF1, IDP and IDR much larger for the original dataset? If not, could this relate to a numerical problem?

JakeCowton commented 4 years ago

The full datasets get the following results

      IDF1    IDP    IDR  Rcll  Prcn GT MT PT ML FP  FN IDs  FM  MOTA  MOTP
acc  91.1%  88.3%  90.8% 99.1% 99.8% 73 70  2  1 53 225 910  88 95.0% 0.211 # Original
acc 101.2% 155.3% 100.9% 99.1% 99.8% 73 70  2  1 53 225 203  84 98.0% 0.691 # Reconfigured

cheind commented 4 years ago

OK, thanks. Did you run against the master or the develop branch (which has some fixes)?

JakeCowton commented 4 years ago

I've been working on 1.1.3, which is what's on PyPI. I've just run it on the develop branch and get exactly the same results.

cheind commented 4 years ago

thanks, I hope I can have a look at it in the coming days/weeks.

JakeCowton commented 4 years ago

I've updated the data files to much smaller versions (just two frames) that demonstrate the same problem.

       IDF1   IDP    IDR   Rcll  Prcn GT MT PT ML FP FN IDs  FM  MOTA  MOTP IDt IDa IDm
acc  94.7%  90.0% 100.0% 100.0% 90.0%  9  9  0  0  1  0   0   0 88.9% 0.135   0   0   0
acc 105.3% 142.9% 111.1% 100.0% 90.0%  9  9  0  0  1  0   0   0 88.9% 0.135   2   0   2

JakeCowton commented 4 years ago

I think it might be down to there being duplicate IDs in single frames. EDIT: It was.
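For anyone hitting the same thing, a quick sanity check before evaluating is to look for any hid that appears more than once in the same frame. The sketch below uses only the standard library; `duplicate_ids_per_frame` is a hypothetical helper, and the `(frame, hid)` pairs are made-up data rather than the files linked above. It shows how merging two hids that co-occur in a frame creates exactly this kind of duplicate.

```python
from collections import Counter


def duplicate_ids_per_frame(detections):
    """Return the sorted (frame, hid) pairs that occur more than once."""
    counts = Counter(detections)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up tracker output: hids 6 and 8 are both present in frames 1 and 2.
detections = [(1, 6), (1, 8), (2, 6), (2, 8)]

# Assumed re-identification result that (wrongly) merges hid 8 into hid 6.
merge_map = {8: 6}
merged = [(frame, merge_map.get(hid, hid)) for frame, hid in detections]

print(duplicate_ids_per_frame(detections))  # no duplicates before merging
print(duplicate_ids_per_frame(merged))      # hid 6 now appears twice per frame
```

If the second list is non-empty, the re-identification step has merged two tracks that exist at the same time, so the merged file should be fixed before it is evaluated.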

cheind / py-motmetrics

IDF1, IDR and IDP greater than 100% after post-processing tracks using re-id methods #60