cheind / py-motmetrics

:bar_chart: Benchmark multiple object trackers (MOT) in Python
MIT License
1.37k stars 259 forks

NaN precision/recall #109

Closed cinabars closed 4 years ago

cinabars commented 4 years ago

I'm running motmetrics on some non-MOTChallenge results and observed the following edge-case behavior, which I think should be changed:

I suppose a result of NaN is technically correct, but in my view a NaN does not properly penalize the poor performance in these two cases. NaNs could also ruin the results of compute_many when one sequence falls into this category.
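For context, a minimal sketch of how these NaNs arise (hypothetical helper functions, not motmetrics' actual internals): precision is TP / (TP + FP), so a tracker that outputs nothing yields 0/0, and recall is TP / (TP + FN), so a sequence with no ground-truth objects also yields 0/0.

```python
import math

def precision(tp, fp):
    """TP / (TP + FP); undefined (NaN) when the tracker makes no predictions."""
    denom = tp + fp
    return tp / denom if denom else math.nan

def recall(tp, fn):
    """TP / (TP + FN); undefined (NaN) when there are no ground-truth objects."""
    denom = tp + fn
    return tp / denom if denom else math.nan

print(precision(0, 0))  # nan: zero predictions in the sequence
print(recall(0, 0))     # nan: zero ground-truth objects in the sequence
print(precision(8, 2))  # 0.8
```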

If you agree with me that this should be changed, let me know and I'll be happy to make a pull request. Otherwise, I'd like to hear your thoughts.

jvlmdr commented 4 years ago

Hey,

Yep, as you mentioned, this is technically correct and the intended behaviour, since the precision and recall are effectively undefined. I'm open to arguments for returning 0 instead of nan. However, I think one could also make an argument for returning 1 (i.e. there were 0 FP for precision, or 0 FN for recall). One advantage of returning nan is that it is more meaningful: one can see immediately what happened.

For compute_many(), I think these nans generally don't pose a problem because we compute sum(numerators) / sum(denominators) instead of mean(numerators / denominators).

cinabars commented 4 years ago

Ah sorry, I got a little mixed up. I think the result should be 1 for these cases (not nan or 0). Given:

Then we could arguably say:

So I would argue returning 1 is "correct" behavior. But if you prefer to leave it as NaN to keep it pure, I can respect that too.

cinabars commented 4 years ago

Actually yes, I think I agree with you that it should stay NaN. Nevermind, and thanks for discussing with me!

jvlmdr commented 4 years ago

No worries! Happy to discuss :)

cheind commented 4 years ago

I'm closing this issue as resolved. If that's not the case, feel free to reopen.