suhwan-cho / TMO

[WACV 2023] Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation
MIT License

some questions about output selection #11

Closed zp19990818 closed 1 year ago

zp19990818 commented 1 year ago

Hello,

I'm coming back to this. I have two questions about output selection:

  1. I trained and tested the new TMO without output selection, and the metrics match those of the previous version of TMO, but the visualized binary masks have noticeably jagged edges. Why do the visualizations differ when the metrics come out the same? I would expect jagged edges to lower mIoU. A binary mask predicted by the new TMO is attached (image: 7398_00021). Is this because of parameter B? Is the output a soft score?

  2. On my own dataset, performance drops significantly when I use output selection. From the code, output selection seems to compare the proportion of well-defined (non-blurred) pixels in the saliency map. Is that an appropriate basis for a confidence score? For example, it does not take structural similarity into account. Also, parameter B looks like it is meant to be a binary flag, so should output selection be applied when calculating the final score?
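
To make the second question concrete, here is a minimal sketch of the confidence measure as described above: the fraction of pixels whose soft score is far from 0.5. It only illustrates that reading and is not code from the TMO repository; `confident_fraction`, `select_output`, `margin`, and `threshold` are hypothetical names and values.

```python
def confident_fraction(scores, margin=0.4):
    """Fraction of pixels whose soft score is at least `margin` away from 0.5.

    `scores` is a 2D list of foreground probabilities in [0, 1].
    `margin` is a hypothetical parameter, not a value taken from the TMO code.
    """
    flat = [p for row in scores for p in row]
    decided = sum(1 for p in flat if abs(p - 0.5) >= margin)
    return decided / len(flat)


def select_output(scores_a, scores_b, threshold=0.0):
    """Return `scores_b` only if it beats `scores_a` by more than `threshold`.

    The comparison rule and `threshold` are assumptions made for illustration.
    """
    conf_a = confident_fraction(scores_a)
    conf_b = confident_fraction(scores_b)
    return scores_b if conf_b - conf_a > threshold else scores_a


sharp = [[0.02, 0.97], [0.99, 0.05]]   # every pixel is clearly decided
blurry = [[0.45, 0.60], [0.55, 0.95]]  # only one pixel is clearly decided

print(confident_fraction(sharp))   # 1.0
print(confident_fraction(blurry))  # 0.25
```

A measure of this kind looks only at per-pixel certainty, so two maps with very different shapes can receive the same score.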

Looking forward to your reply~

suhwan-cho commented 1 year ago

Hi,

  1. TMO converts each video frame to 384p resolution before processing it and then resizes the output back to the original resolution. If you want to preserve sharp edges, you can skip the resizing steps.

  2. B is the batch size. It is used to determine whether TMO is in training mode or testing mode. The threshold value for adaptive output selection is defined as h.
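
The effect in the first point can be reproduced with a toy example. The sketch below assumes nearest-neighbor upsampling and a 4x4 grid standing in for the 384p working resolution; the interpolation mode TMO actually uses is not stated in this thread, and `upsample_nearest` is a hypothetical helper, not a function from the repository.

```python
def upsample_nearest(mask, factor):
    """Enlarge a 2D list by repeating each pixel `factor` times along both axes."""
    out = []
    for row in mask:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# A diagonal edge predicted on a small working grid (toy stand-in for 384p).
low_res = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]

# Resizing back to a larger "original" resolution turns each pixel into a
# 3x3 block, so the diagonal becomes a staircase with 3-pixel steps.
full_res = upsample_nearest(low_res, 3)
for row in full_res:
    print("".join("#" if v else "." for v in row))
```

The larger the gap between the working resolution and the original resolution, the coarser the steps along the mask boundary.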

zp19990818 commented 1 year ago

Thank you, I got it!