epic-kitchens / VISOR-FrameExtraction

Code for re-extracting dense frames for VISOR

noisy interpolated masks quality? #4

Closed by Jay-IPL 1 year ago

Jay-IPL commented 1 year ago

Hi, I visualized the quality of the interpolated masks. Many of them are missing or inaccurate (about 50% of the interpolated masks).

Questions:

  1. In your paper, for the VOS task, what training data did you use in total? Did you use the sparsely annotated training images or the interpolated training images?
  2. When evaluating the VOS task, what evaluation data did you use in total? Did you calculate the metrics only on the sparsely annotated test data?

Thanks!

dimadamen commented 1 year ago

As specified in our paper, these dense interpolations are not used in training or evaluating the VOS model. They are filtered results of the VOS model, whose weights we also release.

For the VOS task, we only use the manual sparse labels for training. Note that our code for training the model and replicating the results of that baseline is already public: https://github.com/epic-kitchens/VISOR-VOS. Please check that repo for all details, rather than this repo, which focuses on frame extraction and not on the VOS benchmark.

Obviously, we only use manually labelled data to evaluate. Note that in the unreleased test set we have dense manual masks which are not released but used in evaluating the model for the test set.

Jay-IPL commented 1 year ago

Thanks for the clarification!

I went through that repo. It seems the model is evaluated only on the VISOR sparsely annotated val data rather than on the test data, right? What do you mean by 'Note that in the unreleased test set we have dense manual masks which are not released but used in evaluating the model for the test set.'?

dimadamen commented 1 year ago

We provide the code to train on "train" and evaluate on "val". This allows you to replicate our val results.

The same code can be used to train on "train+val" and evaluate on test, but the test set is not released (a leaderboard will be opened, but it has not been yet). The same code is used in either case.

Once again, please raise your questions in the right repo so we can answer you correctly. These questions are not related to this repo.

Also, please read our paper more carefully. The answers to your questions are available in Supplemental H.3.