cheind / py-motmetrics

Benchmark multiple object trackers (MOT) in Python
MIT License

Detector perf #110

Open Sentient07 opened 4 years ago

Sentient07 commented 4 years ago

Hello Christoph,

I have added the MODP and MODA metrics in this PR, implemented from this paper: https://catalog.ldc.upenn.edu/docs/LDC2011V03/ClearEval_Protocol_v5.pdf.

However, I didn't understand the difference between MOC and MODA, so I didn't implement the former.

This PR addresses my comments in https://github.com/cheind/py-motmetrics/issues/42.

cheind commented 4 years ago

Hey,

Thanks for the PR. Please ensure that the unit tests pass (they also need to cover the new metrics) for it to be considered for inclusion.

Thanks


Sentient07 commented 4 years ago

Hi,

Thanks for your reply. I had a look at the CI failure, but I don't understand the error message and can't figure out how to resolve it. Could you help me understand it better?

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
motmetrics/metrics.py:299: in compute_many
    partials.append(self.compute_overall(details, metrics=metrics, name=names))
motmetrics/metrics.py:231: in compute_overall
    cache[mname] = self._compute_overall(partials, mname, cache, parent='summarize')
motmetrics/metrics.py:331: in _compute_overall
    v = cache[depname] = self._compute_overall(partials, depname, cache, parent=name)
motmetrics/metrics.py:334: in _compute_overall
    return minfo['fnc_m'](partials, *vals)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
partials = [{'average_overlap': dict_items([(1, [0.3274427464711188, 0.3507247499927294, 0.0, 0.0, 0.0, 0.0]), (2, [0.41353191489...,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.]]), ...}, 'idf1': 0.6446194225721785, 'idfn': 542.0, ...}]
    def simpleAddHolder(partials):
        res = 0
        for v in partials:
>           res += v[nm]
E           TypeError: unsupported operand type(s) for +=: 'int' and 'dict_items'
motmetrics/metrics.py:746: TypeError
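
The failing helper assumes each per-sequence partial stores a plain number under the metric's name. A minimal, self-contained sketch (hypothetical names and data, not the library code) that reproduces the same TypeError:

def simple_add_holder(partials, nm):
    # Mirrors the summing fallback in the traceback: accumulate the value
    # stored under `nm` across all per-sequence partial results.
    res = 0
    for v in partials:
        res += v[nm]
    return res

simple_add_holder([{'moda': 0.5}, {'moda': 0.7}], 'moda')   # fine: 1.2
simple_add_holder([{'moda': {1: 0.3}.items()}], 'moda')
# TypeError: unsupported operand type(s) for +=: 'int' and 'dict_items'

In other words, one of the new metrics returns dict items per sequence (see the 'average_overlap' entry in the partials dump above) where the overall-summary code expects a scalar.
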
cheind commented 4 years ago

Sure, first try to run the test locally instead of in CI:

pytest

in the main motmetrics directory should run all tests. This requires pip install pytest and several other development dependencies. @jvlmdr has kindly provided an additional requirements file that installs all of them: pip install -r requirements_dev.txt.

Once that is in place, you can use pytest -k to execute a specific test. Use your favorite debugger to go from there. As far as the error itself is concerned, I am not sure what it means in detail, but I suppose there is something off with the new metrics. Once you've fixed the error, ensure that you comply with our coding style; see precommit.sh.
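
In short, assuming you are in the repository root, the sequence of commands is:

pip install -r requirements_dev.txt    # development dependencies, including pytest
pytest                                 # run the full test suite
pytest -k <pattern>                    # run only tests whose names match <pattern>
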

For future reference, I guess we should provide a short contributing guideline #111

Sentient07 commented 4 years ago

Hi @cheind, I've fixed the tests and checked the coding style; it looks fine remotely.

cheind commented 4 years ago

Thanks! Did you add the metrics to the default motmetrics? Is that meaningful? @jvlmdr what's your opinion on this PR?

Sentient07 commented 4 years ago

@cheind Yep, I added MODA and MODP there. I didn't understand what the _m functions are, so I just duplicated the original methods.
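
For context, the _m functions in motmetrics/metrics.py are the merge variants that compute_overall() calls to combine per-sequence results; as the traceback earlier in this thread shows, they are invoked as fnc_m(partials, *vals), with the dependency values already aggregated across sequences. A hedged sketch of the pattern, using MOTA-shaped names for illustration:

def mota(df, num_misses, num_switches, num_false_positives, num_objects):
    # Per-sequence value, computed from that sequence's event counts.
    return 1. - (num_misses + num_switches + num_false_positives) / num_objects

def mota_m(partials, num_misses, num_switches, num_false_positives, num_objects):
    # Overall value: the count dependencies arrive summed over all
    # sequences, so the ratio is recomputed from totals instead of
    # averaging per-sequence ratios (which would be wrong for sequences
    # of different lengths).
    return 1. - (num_misses + num_switches + num_false_positives) / num_objects

Duplicating the plain method therefore only works when the metric is a pure function of merged counts; a metric whose per-sequence value is not a scalar breaks the generic summing fallback shown in the traceback.
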

JonathanSamelson commented 3 years ago

Hi, I just tried this fork, and since I get a large number of FPs and FNs, my MODA comes out greater than 1. I'm not sure that's normal: according to the description at https://motchallenge.net/results/MOT17Det/, a perfect score should be 100%.

For instance here is what I get:

IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT  ML    FP    FN  IDs   FM   MOTA  MOTP   moda  modp  IDt  IDa  IDm  num_overlaps
43.04 31.93 66.02 68.91 33.32  82  30  49   3  9202  2075   12  154 -69.15 34.41 1.69 25.05    5   10    3          6662

I even got one over 200%:

IDF1   IDP   IDR  Rcll  Prcn  GT  MT  PT  ML     FP   FN  IDs   FM    MOTA  MOTP   moda  modp  IDt  IDa  IDm  num_overlaps
39.90 26.71 78.80 87.57 29.68  82  61  21   0  13858  830   54  130 -120.69 31.37 221.67 27.78    8   37    3          6626

Sorry @Sentient07, but are you sure the formulas are correct?

Reading about MODA and MODP, I see there is the same 1 - ... term as with MOTA and MOTP:

[image: MODA and MODP definitions from the CLEAR evaluation protocol]
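
For reference, the CLEAR protocol document linked above defines the normalized metrics roughly as follows (transcribed here; $c_m$ and $c_f$ weight misses and false positives, $N_G$ counts ground-truth objects, and the MODP overlap is the intersection-over-union of mapped ground-truth/detection pairs):

$$N\text{-}MODA = 1 - \frac{\sum_{i=1}^{N_{frames}} \left( c_m\, m_i + c_f\, fp_i \right)}{\sum_{i=1}^{N_{frames}} N_G^{(i)}}$$

$$MODP(t) = \frac{1}{N_{mapped}^{(t)}} \sum_{i=1}^{N_{mapped}^{(t)}} \frac{\left|G_i^{(t)} \cap D_i^{(t)}\right|}{\left|G_i^{(t)} \cup D_i^{(t)}\right|}, \qquad N\text{-}MODP = \frac{1}{N_{frames}} \sum_{t=1}^{N_{frames}} MODP(t)$$
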

Sentient07 commented 3 years ago

Hi,

I think this is a known issue (or perhaps not a bug at all): when FPs and FNs are high, as you've mentioned, MOTA can go negative and other metrics can give absurd values. Yes, it'd be nice to have an appropriate warning in such cases.

JonathanSamelson commented 3 years ago

Sorry, please see my previous post, which I edited. There's this 1 - ... term that doesn't seem to be taken into account in your formula. Just as for MOTA, I believe it is there to handle large FP and FN counts, which is why the score ranges over [-inf, 100]. Even though a negative score is absurd, I guess it'd be nice to have the same behaviour.

jvlmdr commented 3 years ago

Thanks for trying the code out.

While negative values are possible, MODA should not exceed 100%. Perhaps it should be normalized by num_objects instead of num_overlaps in the moda function?

jvlmdr commented 3 years ago

In general, MODA should be slightly higher than MOTA if they are computed using the same matching, because it is the same formula without the - num_switches term. (Although I think the original paper actually used a different matching to compute MODA. I'm not sure which definition we want to use here.)

jvlmdr commented 3 years ago

And @JonathanSamelson is right, there is no 1 - in the definition of moda(). Thanks for spotting this.
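
Put together, a hedged sketch of what the corrected per-sequence function could look like (not the merged code; num_objects is the total ground-truth object count, exactly as mota uses it):

def moda(df, num_misses, num_false_positives, num_objects):
    """Multiple Object Detection Accuracy: MOTA without the switch term."""
    # Same 1 - ... form as MOTA, normalized by ground-truth objects, so a
    # perfect result is 1.0 and large FP/FN counts drive the score
    # negative instead of above 100%.
    return 1. - (num_misses + num_false_positives) / num_objects
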

jvlmdr commented 3 years ago

@Sentient07 Can we use num_detections instead of num_overlaps to avoid introducing an extra metric?
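
That would make modp a plain ratio of existing quantities. A sketch under that assumption (total_overlap is an illustrative name for a dependency summing the IoU of matched pairs; num_detections counts those pairs):

import numpy as np

def modp(df, total_overlap, num_detections):
    # Multiple Object Detection Precision: mean IoU over matched
    # ground-truth/hypothesis pairs, normalized by num_detections rather
    # than a separate num_overlaps counter.
    return total_overlap / num_detections if num_detections > 0 else np.nan
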