sabarim / STEm-Seg

This repository contains the official implementation of the paper "STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos"

What is the main difference between the MOTS and VIS tasks? #24

Closed Cyrus1993 closed 1 year ago

Cyrus1993 commented 2 years ago

Hello, I hope you are well. I know that MOTS stands for multi-object tracking and segmentation. As far as I understand, the video instance segmentation (VIS) task does exactly the same thing: in both tasks, we are interested in segmenting and tracking instances in videos. So what exactly is the difference between these two tasks?

Ali2500 commented 2 years ago

The underlying task is identical. The only differences are the evaluation metric (sMOTSA for MOTS, mAP for YTVIS) and the type of data (driving scenes for MOTS, YouTube videos for YTVIS).
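
To make the metric difference concrete, here is a minimal sketch of how sMOTSA is computed once predictions have been matched to ground truth, assuming the definition from the MOTS paper (Voigtlaender et al., 2019): a predicted mask counts as a true positive when its mask IoU with a ground-truth mask exceeds 0.5, and sMOTSA sums those IoUs instead of just counting them. The function and argument names are hypothetical and are not taken from this repository or from the official evaluation scripts.

```python
from typing import Sequence


def motsa(num_tp: int, num_fp: int, num_ids: int, num_gt_masks: int) -> float:
    """MOTSA = (|TP| - |FP| - |IDS|) / |M|, with |M| the number of ground-truth masks."""
    if num_gt_masks <= 0:
        raise ValueError("num_gt_masks must be positive")
    return (num_tp - num_fp - num_ids) / num_gt_masks


def smotsa(tp_ious: Sequence[float], num_fp: int, num_ids: int, num_gt_masks: int) -> float:
    """Soft variant: each true positive contributes its mask IoU instead of 1.

    tp_ious holds the IoU of every matched (true positive) mask, so each
    value is assumed to be greater than 0.5.
    """
    if num_gt_masks <= 0:
        raise ValueError("num_gt_masks must be positive")
    soft_tp = sum(tp_ious)  # "soft" count of true positives
    return (soft_tp - num_fp - num_ids) / num_gt_masks


if __name__ == "__main__":
    # Toy numbers (made up for illustration): 4 ground-truth masks, 2 of them
    # matched, 1 false positive, no identity switches.
    ious = [0.9, 0.8]
    print("MOTSA :", motsa(len(ious), num_fp=1, num_ids=0, num_gt_masks=4))
    print("sMOTSA:", smotsa(ious, num_fp=1, num_ids=0, num_gt_masks=4))
```

Both scores are accumulated per frame and penalise identity switches explicitly. YTVIS mAP instead ranks whole predicted tracks by confidence and matches them to ground truth using an IoU computed over the entire mask sequence, so tracking errors lower the score through a reduced sequence IoU rather than through a separate term.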

Cyrus1993 commented 2 years ago

Thank you very much for your response. I have read your paper many times, as well as other papers in this field. I can say that few works have shed light on the issues in this field with this level of care and detail. Thank you for publishing your research.