VisualComputingInstitute / mots_tools

Tools for evaluating and visualizing results for the Multi Object Tracking and Segmentation (MOTS) task
MIT License

Aggregated evaluation over all classes #6

Closed: anirudh-chakravarthy closed this issue 4 years ago

anirudh-chakravarthy commented 4 years ago

Hi,

Thank you for your implementation of TrackR-CNN and your prompt responses to my queries. As @pvoigtlaender may recollect, I was trying to train TrackR-CNN on YT-VIS.

I came across several papers that report aggregated MOTSA, sMOTSA, etc. In the implementation, evaluate_class and compute_MOTS_metrics seem to be written for single-class evaluation across all sequences.

How are aggregated metrics computed then? How should I go about this? YT-VIS has 40 classes, so class-wise reporting may not be the best idea for me.

Looking forward to hearing from you!

pvoigtlaender commented 4 years ago

Hi,

I think VIS uses different evaluation measures (based on AP). For MOTS we only considered 2 classes, so we got a separate score for each of them, and for the competition we then took the average of the 2 scores. If you want to use (s)MOTSA for VIS, you could compute it per class and afterwards average over classes. But I think this would be quite inconsistent with what has been done before for VIS.

anirudh-chakravarthy commented 4 years ago

Hi,

Thank you so much for your prompt response!

Computing AP-based metrics is much tougher, since for each attempt the inference file needs to be submitted to their CodaLab server. Hence, we thought of working with MOTS-based metrics. We also noticed some degree of interchangeability, i.e., one or two VIS papers at CVPR report sMOTSA and similar metrics.

So, just to clarify, a simple unweighted average is used for your competition, right?

pvoigtlaender commented 4 years ago

Yes, unweighted: for KITTI MOTS we used 0.5 sMOTSA_car + 0.5 sMOTSA_pedestrian as the total score.
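For more than two classes (e.g. the 40 classes of YT-VIS), the same unweighted averaging could be sketched as below. This is only a minimal illustration: it assumes the per-class sMOTSA values have already been computed and collected in a dict, and the function name `average_over_classes` is hypothetical, not part of mots_tools. The numbers are made up for the example.

```python
def average_over_classes(per_class_scores):
    """Unweighted (macro) average of per-class scores.

    per_class_scores: dict mapping class name -> score (e.g. sMOTSA),
    assumed to be computed beforehand, one evaluation run per class.
    """
    if not per_class_scores:
        raise ValueError("need at least one per-class score")
    # Every class contributes equally, regardless of how many
    # objects or frames it has.
    return sum(per_class_scores.values()) / len(per_class_scores)


# Illustrative values only, not real results.
example_scores = {"car": 0.75, "pedestrian": 0.25}
total = average_over_classes(example_scores)
```

With two classes this reduces to 0.5 * score_car + 0.5 * score_pedestrian; with N classes each class gets weight 1/N.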

pvoigtlaender commented 4 years ago

In case you remember: could you please tell me which papers used sMOTSA for VIS?

anirudh-chakravarthy commented 4 years ago

I see! Thank you very much!

I remember one paper: "Video Instance Segmentation Tracking with a Modified VAE Architecture" (pdf). You could take a look at Table 1.

Although they call it Video Instance Segmentation Tracking, it seems extremely similar to VIS evaluated using tracking measures!